Data Cleaning Basics: encoding meaning

Hi, can you help explain what encoding='Latin-1' means in pandas.read_csv()?

Thank you!

pandas.read_csv accepts an encoding option so it can read files saved in different standard text encodings.

latin_1 is the encoding for Western European languages.
You can find more encodings here.
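To make that concrete, here is a minimal sketch of passing the option. The sample names and the in-memory file are made up for illustration; with a real dataset you would pass the file path instead.

```python
import io

import pandas as pd

# Hypothetical sample data: a tiny CSV saved as Latin-1 bytes.
raw = "name,city\nJosé,Málaga\n".encode("latin-1")

# The encoding option tells pandas how to turn those bytes back into text.
df = pd.read_csv(io.BytesIO(raw), encoding="latin-1")
print(df)
```

Reading the same bytes without the option would most likely fail with a UnicodeDecodeError, because they are not valid UTF-8.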


@candiceliu93 Hey,

We are telling pandas that the file's encoding is "Latin-1". Most of the time, we use UTF-8.

An encoding is just the set of rules the computer uses to store our text. As you know, for a computer everything is ones and zeros, so the encoding says which bytes stand for which characters.
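A small sketch of what that means in practice, using only plain Python (the variable names are just for illustration): the same character becomes different bytes under different encodings, and reading bytes with the wrong encoding gives garbled text.

```python
text = "é"

# The same character is stored as different bytes in each encoding.
latin1_bytes = text.encode("latin-1")  # one byte: b'\xe9'
utf8_bytes = text.encode("utf-8")      # two bytes: b'\xc3\xa9'
print(latin1_bytes, utf8_bytes)

# Reading UTF-8 bytes as if they were Latin-1 produces garbled text.
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # Ã©
```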


Thank you! So I should select the encoding type based on the language of the dataset? Have I understood that correctly?


Thank you for sharing the resource! Very helpful!

@candiceliu93 Yes, but several encodings can exist for the same language, so it also depends on which encoding the sender or source used when saving the file.
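When the source does not say which encoding it used, one rough approach is to try a few likely candidates and keep the first one that decodes without an error. This is only a sketch: first_working_encoding is a hypothetical helper, the candidate list is an assumption, and a clean decode does not prove the guess is right (latin-1 accepts any bytes, so it only makes sense as a last resort).

```python
def first_working_encoding(raw, candidates=("utf-8", "cp1252", "latin-1")):
    """Return the first candidate that decodes raw without an error."""
    for name in candidates:
        try:
            raw.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        return name
    return None


# Hypothetical sample: text that was saved as Latin-1.
sample = "café".encode("latin-1")
print(first_working_encoding(sample))
```

Whatever name comes back can then be passed as the encoding option to pandas.read_csv, but it is still worth checking a few rows by eye.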

If I get a dataset, can I find the encoding type online, or do I have to find it out from the data source?