```{admonition} Information
__Section__: The Pandas library  
__Goal__: Get a first idea of the basics of the Pandas library in Python.  
__Time needed__: 10 min  
__Prerequisites__: Curiosity
```

# The Pandas library

The ``pandas`` library provides data structures and data analysis tools in Python. We will mainly use it for our machine learning experiment.

All the functions, operations, etc. used in the exercises are explained in the exercises themselves. This notebook is just a short introduction to what the library looks like.

The official documentation of pandas is available here: https://pandas.pydata.org/pandas-docs/stable/index.html. If ever in doubt on how to use a function or which function to use, you can always refer to the documentation.

## 1 Import the library <a class="anchor" id="t1"></a>

Before using the library, we need to import it. Usually, this is done as follows:

In [None]:
import pandas as pd

We can now access all the ``pandas`` tools with the prefix ``pd``.

## 2 The DataFrame type <a class="anchor" id="t2"></a>

In the exercises, we will store and use the datasets using the ``DataFrame`` type. A dataframe is a two-dimensional data structure with labeled rows and columns. We will refer to the columns as ``attributes``; the row labels are called ``indexes``.

In [None]:
import numpy as np # we use the numpy library to create a fake dataframe of random numbers

df = pd.DataFrame(np.random.randn(6, 4), columns=list('ABCD'))
df
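As a complement to the random dataframe above, here is a minimal sketch of a dataframe built from a dictionary, with explicit row labels. The names `toy`, `height_cm` and `weight_kg` and the values are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical toy data, not part of the course datasets
toy = pd.DataFrame(
    {"height_cm": [170, 182, 165], "weight_kg": [65.0, 80.5, 58.2]},
    index=["anna", "ben", "cleo"],  # row labels (the index)
)

print(toy.shape)          # (number of rows, number of columns)
print(list(toy.columns))  # the attributes
print(list(toy.index))    # the row labels
```

Each dictionary key becomes an attribute, and the `index` argument sets the row labels.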

Many operations are available on dataframes. For example, the method ``info()`` prints basic information about the dataframe, such as the column names, the number of non-null values, and the data types.

In [None]:
df.info()

## 3 Attributes <a class="anchor" id="t3"></a>

We can access a column of the dataframe (an attribute) with its name:

In [None]:
df['A']
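The same bracket syntax also accepts a list of names. The sketch below, using a small hypothetical dataframe `toy`, shows what type each kind of selection returns; `.loc` is the label-based accessor for rows.

```python
import pandas as pd

# Hypothetical toy data for illustration
toy = pd.DataFrame({"A": [1.5, 2.5, 3.5], "B": [10, 20, 30]})

col = toy["A"]         # one name: a Series (a single attribute)
sub = toy[["A", "B"]]  # a list of names: a DataFrame
row = toy.loc[0]       # .loc with a row label: a Series holding that row

print(type(col).__name__)
print(type(sub).__name__)
print(row["B"])
```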

We can also perform operations on a single attribute of the dataframe. For example, the operation ``astype()`` converts an attribute to another type.

In [None]:
# If you only want to display column 'A' as integers (without modifying 'df'), do the following:
df['A'].astype('int')

In [None]:
df['A']

In [None]:
# If you want to change the type of the column 'A' in the 'df' dataframe, do the following:
df['A'] = df['A'].astype('int')
df['A']
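Note that converting floats to integers with ``astype()`` drops the decimal part instead of rounding. Here is a small sketch on a hypothetical series `values`; combining it with `round()` is one possible way to round first.

```python
import pandas as pd

# Hypothetical values for illustration
values = pd.Series([1.9, -1.9, 0.4])

truncated = values.astype("int")        # decimal part is dropped
rounded = values.round().astype("int")  # round to nearest, then convert

print(truncated.tolist())
print(rounded.tolist())
```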

We get the type of the attribute like this:

In [None]:
df['A'].dtype
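To check the types of all attributes at once, a dataframe also exposes ``dtypes``. The sketch below uses a hypothetical dataframe `toy`; the exact integer width shown (for example `int64`) can depend on the platform.

```python
import pandas as pd

# Hypothetical toy data for illustration
toy = pd.DataFrame({"A": [1.5, 2.5], "B": [1.0, 2.0]})
toy["B"] = toy["B"].astype("int")  # convert one attribute only

print(toy["A"].dtype)  # type of a single attribute
print(toy.dtypes)      # types of all attributes, as a Series
```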

## 4 Copy a dataframe <a class="anchor" id="t4"></a>

Sometimes, we want to copy an existing dataframe into a new one, so that we can perform operations on it without modifying the initial data. Use the method ``copy()`` to copy a dataframe.

In [None]:
df2 = df.copy()
df2

Now you can try anything on ``df2`` without fear of losing the initial data, which is still in ``df``.
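To see why ``copy()`` matters, compare it with a plain assignment, which only creates a second name for the same dataframe. This is a minimal sketch; the names `original`, `alias` and `independent` are hypothetical.

```python
import pandas as pd

# Hypothetical toy data for illustration
original = pd.DataFrame({"A": [1, 2, 3]})

alias = original               # same object under a second name
independent = original.copy()  # separate dataframe with its own data

independent.loc[0, "A"] = 100  # only changes the copy
alias.loc[1, "A"] = 200        # also changes 'original'

print(original["A"].tolist())
print(independent["A"].tolist())
```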

## 5 More insight

Here is a very helpful [cheat sheet](http://datacamp-community-prod.s3.amazonaws.com/dbed353d-2757-4617-8206-8767ab379ab3) about Pandas and Python that contains most of the tools we will need in this course, developed by [datacamp.com](https://www.datacamp.com/community/blog/python-pandas-cheat-sheet).