Information

Section: .csv files - data format used in the course
Goal: Understand the properties of .csv files.
Time needed: 10 min
Prerequisites: Curiosity

.csv files - data format used in the course

Data is central to this course. This section explains the .csv file format used throughout the course.

CSV stands for comma-separated values: the file contains data in which each value is separated from the next by a comma. The format represents data as a table: each row is one data point, and each column holds the values of one attribute.

This is what a .csv file looks like in a text editor:

text

The first line contains the header: its values are the ‘titles’ of the columns. Each following line contains one data point.
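The split between the header line and the data lines can be sketched in Python with the standard `csv` module. The sample text below is hypothetical: its values are invented for illustration and are not taken from the course dataset.

```python
import csv
import io

# Hypothetical sample content; the values are invented for illustration.
sample_text = """Petallength,Petalwidth,class
1.4,0.2,setosa
4.7,1.4,versicolor
"""

# csv.reader splits each line on commas and yields one list per line.
reader = csv.reader(io.StringIO(sample_text))

header = next(reader)        # first line: the column titles
data_points = list(reader)   # remaining lines: one data point each

print(header)
print(data_points)
```

Every value comes back as a string, including the numbers, because a .csv file stores plain text only.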

In a program that can read .csv files (for example Excel or, as here, Python), the data look like this:

text

This example shows a dataset with three attributes: Petallength, Petalwidth and class.
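As a minimal sketch of how such attributes can be accessed by name, the standard `csv.DictReader` maps each data point to a dictionary keyed by the column titles. The attribute names come from the example above. The values are made up for illustration, and the course itself may use a different tool to load the data.

```python
import csv
import io

# Attribute names follow the example; the values are invented for illustration.
sample_text = (
    "Petallength,Petalwidth,class\n"
    "1.4,0.2,setosa\n"
    "4.7,1.4,versicolor\n"
)

# DictReader uses the header line as the keys of each row dictionary.
reader = csv.DictReader(io.StringIO(sample_text))
rows = list(reader)
attributes = reader.fieldnames

# Values are read as strings, so numerical attributes are converted explicitly.
petal_lengths = [float(row["Petallength"]) for row in rows]

print(attributes)
print(petal_lengths)
```

Reading a whole column this way corresponds to picking one attribute for every data point in the table.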