350-2 Error executing a line of code in the guided project of Python for Data Science: Fundamentals

Hello Guys,

I get an error when running this code in Jupyter Notebook on my local computer.
The data file is saved in my working directory. When I ran each line of the code in a separate cell, I noticed the error was at this line:

android = list(read_file)

Kindly assist in debugging it, as I would also like to work on the guided project on my local machine.

from csv import reader

### The Google Play data set ###
opened_file = open('googleplaystore.csv')
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]
UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-7-691158540a82> in <module>
      4 opened_file = open('googleplaystore.csv')
      5 read_file = reader(opened_file)
----> 6 android = list(read_file)
      7 android_header = android[0]
      8 android = android[1:]

C:\ProgramData\Anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x90 in position 3077: character maps to <undefined>

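The file in question is not encoded as CP1252, which is the encoding Python fell back to here (note cp1252.py in the traceback). It’s using another encoding, most likely UTF-8, which is very common. CP1252 has no character assigned to byte 0x90, whereas in UTF-8 0x90 is a valid continuation byte.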

You may specify the encoding when you open the file:

opened_file = open('googleplaystore.csv', encoding='utf8')
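
To see why the message mentions byte 0x90, here is a small sketch that reproduces the same kind of error without the data file. The character used (Cyrillic capital A, "\u0410") is only an assumed example whose UTF-8 encoding happens to contain 0x90. It is not taken from googleplaystore.csv.

```python
# "\u0410" (Cyrillic capital A) is an arbitrary example character whose
# UTF-8 encoding contains the byte 0x90 as a continuation byte.
raw = "\u0410".encode("utf-8")
print(raw)  # the two bytes 0xD0 0x90

# CP1252 has no character for 0x90, so decoding fails on that byte.
error_message = None
try:
    raw.decode("cp1252")
except UnicodeDecodeError as exc:
    error_message = str(exc)
print(error_message)

# Decoding the same bytes as UTF-8 recovers the original character.
round_trip = raw.decode("utf-8")
print(round_trip == "\u0410")
```

The caught message has the same form as the one in the traceback, which is consistent with the file being UTF-8 that was read as CP1252.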

Wow… and it works just like magic
Thanks Ranklord

Glad it helped!
If you don’t mind, could you please mark the answer as the solution?
Thanks in advance!

This isn’t an answer to the question, but rather a helpful (I hope) observation.

The open function has parameters that we don’t explore at the beginning of the path. Some of these parameters have default values that depend on the platform and its regional settings. One of them is the encoding parameter.

Depending on your operating system and its locale settings, the default encoding may well not be UTF-8 (on Windows it is often CP1252, as in the traceback above), hence the error. You can override the default value by doing as ranklord suggests.
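
If you are curious what the default is on your own machine, a minimal way to check it is sketched below. It relies only on the standard locale module. The variable name default_encoding is just illustrative.

```python
import locale

# Encoding that open() uses in text mode when no encoding argument is given.
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)
```

On a Windows setup like the one in the traceback this would probably print cp1252, while many Linux and macOS systems report UTF-8.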


Thanks Bruno for the explanation

This solution is not working for me. I get the error when converting read_file to a list.

Could you please help me with this?

Note: I get this error only when I run the code inside a function. Thanks in advance.

def open_dataset(file_name='AppleStore.csv', header=True, encoding='utf-8'):
    opened_file = open(file_name)
    from csv import reader
    read_file = reader(opened_file)
    data = list(read_file)
    Applestore_header = data[0]
    Applestore = data[1:]

    if header:
        return Applestore_header
        return Applestore

File_open = open_dataset(header=False)

You are including encoding as an argument of the open_dataset function that you are writing, but it is never passed on to open(), so the file is still read with the default encoding. My hunch is that if you insert the encoding into the open() call, this code should work.
Hope this helps.
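
A minimal sketch of that suggestion, under a few assumptions: the encoding argument is forwarded to open(), and the second return is moved out of the if block on the guess that header=False is meant to return the data rows (as posted, that second return can never run). The temporary sample.csv and its contents are hypothetical stand-ins so the example runs without AppleStore.csv.

```python
import os
import tempfile
from csv import reader


def open_dataset(file_name='AppleStore.csv', header=True, encoding='utf-8'):
    # Forward the encoding to open(); without this, open() falls back to
    # the platform default and the encoding parameter has no effect.
    with open(file_name, encoding=encoding, newline='') as opened_file:
        data = list(reader(opened_file))
    # Assumed intent: header=True returns the header row, otherwise the data rows.
    if header:
        return data[0]
    return data[1:]


# Hypothetical stand-in file containing one non-ASCII character.
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, 'sample.csv')
    with open(sample_path, 'w', encoding='utf-8', newline='') as f:
        f.write('id,track_name\n1,\u0410pp\n')
    sample_header = open_dataset(sample_path, header=True)
    sample_rows = open_dataset(sample_path, header=False)

print(sample_header)
print(len(sample_rows))
```

With the real file, the call from the post would stay the same: File_open = open_dataset(header=False).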