350-x Moving my Guided Project outside DQ Platform

Hello Guys,

I want to move my guided project code into a Jupyter notebook, outside the DQ platform. I am doing it manually, line by line. Below is the first code I copied to Jupyter Notebook, but I'm already getting an error on the fourth line. I already ran this code within the DQ platform without any error.

Please, what could be the problem?

opened_file = open('AppleStore.csv')
from csv import reader
read_file = reader(opened_file)
apple_Apps = list(read_file)
apple_Appsheader = apple_Apps[0]
apple_Apps = apple_Apps[1:]

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-17-9e88785ad623> in <module>
      2 from csv import reader
      3 read_file = reader(opened_file)
----> 4 apple_Apps = list(read_file)
      5 apple_Appsheader = apple_Apps[0]
      6 apple_Apps = apple_Apps[1:]

~\AppData\Local\Continuum\anaconda3\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 2693: character maps to <undefined>


Let’s see how to solve this. I would copy the entire UnicodeDecodeError message into Google and open the first result: https://stackoverflow.com/questions/9233027/unicodedecodeerror-charmap-codec-cant-decode-byte-x-in-position-y-character

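From one of the answers there (it discusses byte 0x90; your byte, 0x81, is likewise unmapped in CP1252 and is also a continuation byte in UTF-8):

The file in question is not using the CP1252 encoding. It's using another encoding. Which one you have to figure out yourself. Common ones are Latin-1 and UTF-8. Since 0x90 doesn't actually mean anything in Latin-1, UTF-8 (where 0x90 is a continuation byte) is more likely.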

You specify the encoding when you open the file:
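As a minimal sketch of that, assuming the file is UTF-8 (a guess that would need checking against the real AppleStore.csv), the encoding goes in as a keyword argument to open(). The load_csv helper and the tiny sample file below are made up for illustration so the snippet runs on its own.

```python
import csv
import os
import tempfile


def load_csv(path, encoding="utf-8"):
    """Return (header, rows) from a CSV file decoded with the given encoding.

    Hypothetical helper, not part of the guided project.
    """
    # newline="" is what the csv module docs recommend for files given to csv.reader
    with open(path, encoding=encoding, newline="") as opened_file:
        rows = list(csv.reader(opened_file))
    return rows[0], rows[1:]


# Stand-in for AppleStore.csv: one row whose name contains "Ё",
# which is stored as the bytes 0xD0 0x81 in UTF-8.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "w", encoding="utf-8", newline="") as f:
    f.write("id,track_name\n1,Ёлка\n")

header, apps = load_csv(sample_path)
print(header)
print(apps)
```

For the project file, the equivalent call would be load_csv('AppleStore.csv'), or simply open('AppleStore.csv', encoding='utf8') in the original code, again assuming the file really is UTF-8.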

The traceback path ~\AppData\Local\Continuum\anaconda3\lib\encodings\cp1252.py shows that the error was raised inside the cp1252 codec module, which means your computer is trying to decode the file as cp1252. You can open that folder to see which other encodings ship with your Python installation.
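To see the same failure without any file involved, here is a small sketch that decodes two raw bytes both ways. The character "Ё" is only an illustrative choice because its UTF-8 form happens to contain the byte 0x81; the actual character in AppleStore.csv may well be a different one.

```python
# "Ё" (U+0401) is stored as two bytes in UTF-8: 0xD0 0x81
raw = "Ё".encode("utf-8")

try:
    raw.decode("cp1252")      # 0x81 has no mapping in cp1252
    cp1252_error = None
except UnicodeDecodeError as exc:
    cp1252_error = exc        # same 'charmap' error family as in the traceback

decoded = raw.decode("utf-8")  # the intended character comes back

print(cp1252_error)
print(decoded)
```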

From the Python docs: https://docs.python.org/3.7/library/functions.html#open

The default encoding is platform dependent (whatever locale.getpreferredencoding() returns), but any text encoding supported by Python can be used. See the codecs module for the list of supported encodings.
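A quick way to check what that default is on a given machine is to ask the locale module directly. This is only a sketch; the printed value depends on the system, and on a Windows setup like the one in the traceback it would likely be cp1252.

```python
import codecs
import locale

# The value open() falls back to when no encoding argument is given
default_encoding = locale.getpreferredencoding(False)
print(default_encoding)

# Normalised name of the matching codec in Python's codec registry
codec_name = codecs.lookup(default_encoding).name
print(codec_name)
```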

So you can expect that your default is cp1252. One common solution is on the Stack Overflow page above.

Great, thanks for the solution.