
Guided project 2: Same code, different results


Screen Link:
https://app.dataquest.io/m/356/guided-project%3A-exploring-hacker-news-posts/1/introduction

My Code:

```python
import csv
f=open("HN_posts_year_to_Sep_26_2016.csv")
hn=list(csv.reader(f))
hn[:5]
```

What I expected to happen:
[['id', 'title', 'url', 'num_points', 'num_comments', 'author', 'created_at'],
['12224879',
'Interactive Dynamic Video',
'http://www.interactivedynamicvideo.com/',
'386',
'52',
'ne0phyte',
'8/4/2016 11:52'],
['10975351',
'How to Use Open Source and Shut the ■■■■ Up at the Same Time',
'http://hueniverse.com/2016/01/26/how-to-use-open-source-and-shut-the-■■■■-up-at-the-same-time/',
'39',
'10',
'josep2',
'1/26/2016 19:30'],
['11964716',
"Florida DJs May Face Felony for April Fools' Water Joke",
'http://www.thewire.com/entertainment/2013/04/florida-djs-april-fools-water-joke/63798/',
'2',
'1',
'vezycash',
'6/23/2016 22:20'],
['11919867',
'Technology ventures: From Idea to Enterprise',
'https://www.amazon.com/Technology-Ventures-Enterprise-Thomas-Byers/dp/0073523429',
'3',
'1',
'hswarna',
'6/17/2016 0:01']]

What actually happened:

UnicodeDecodeError                        Traceback (most recent call last)
<ipython-input-7-7fb64584d4eb> in <module>
      2 import csv
      3 f=open("HN_posts_year_to_Sep_26_2016.csv")
----> 4 hn=list(csv.reader(f))
      5 hn[:5]

~\anaconda3\Anaconda 3.1\lib\encodings\cp1252.py in decode(self, input, final)
     21 class IncrementalDecoder(codecs.IncrementalDecoder):
     22     def decode(self, input, final=False):
---> 23         return codecs.charmap_decode(input,self.errors,decoding_table)[0]
     24 
     25 class StreamWriter(Codec,codecs.StreamWriter):

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 6556: character maps to <undefined>

I downloaded the CSV file from the link provided and saved it on my computer because I want to work locally with Jupyter. Somehow I don't get the results I expect. The code is exactly the same as the one provided in the solution, except that the CSV file names differ.

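This is most likely an issue with how the data was encoded. The traceback shows Python decoding the file as cp1252, and byte 0x81 has no character assigned in that code page. Encodings are something you will learn about in a future Mission.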
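To illustrate the error message, here is a minimal sketch. The sample bytes are made up for the demonstration and are not taken from the actual CSV file; they just show one way byte 0x81 can appear in UTF-8 text and why cp1252 rejects it.

```python
import locale

# Encoding that open() falls back to when no encoding= argument is given.
print(locale.getpreferredencoding(False))

# Illustrative bytes only (an assumption, not read from the CSV):
# U+00C1 encodes to b'\xc3\x81' in UTF-8, so it contains byte 0x81.
sample = "\u00c1".encode("utf-8")

try:
    sample.decode("cp1252")
    cp1252_ok = True
except UnicodeDecodeError as err:
    # Same kind of error as in the traceback: 0x81 is undefined in cp1252.
    cp1252_ok = False
    print(err)

# The same bytes decode cleanly as UTF-8.
utf8_text = sample.decode("utf-8")
print(utf8_text == "\u00c1")
```

If the first line prints cp1252 on your machine, that would explain why the same code behaves differently locally than on the Dataquest platform.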

For now, try to use one of the following and see which one works -

f=open("HN_posts_year_to_Sep_26_2016.csv", encoding='utf-8')

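If the above throws a different error, note that encoding='utf8' (without the hyphen) is only an alias for the same codec in Python, so it will behave the same way. In that case, move on to one of the options below.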

or

f=open("HN_posts_year_to_Sep_26_2016.csv", encoding="Latin-1")

or

f=open("HN_posts_year_to_Sep_26_2016.csv", encoding='cp437')

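One of these should work. Keep in mind that Latin-1 and cp437 assign a character to every byte value, so they never raise a decode error, but non-ASCII characters may come out wrong if the file is actually UTF-8.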
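The trial-and-error above can also be sketched as a small helper that tries each encoding in turn. The function name `read_csv_with_fallback` and the default list of encodings are hypothetical choices for this example, not part of the Mission or of the `csv` module.

```python
import csv

def read_csv_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Return (rows, encoding) for the first encoding that decodes the file."""
    for enc in encodings:
        try:
            # newline="" is what the csv module documentation recommends.
            with open(path, encoding=enc, newline="") as f:
                return list(csv.reader(f)), enc
        except UnicodeDecodeError:
            continue  # this encoding cannot decode the file; try the next one
    raise ValueError(f"none of {encodings} could decode {path}")

# Example with the file name from the question:
# hn, used = read_csv_with_fallback("HN_posts_year_to_Sep_26_2016.csv")
# print(used)
# print(hn[:5])
```

Putting UTF-8 first matters here: it fails loudly on bytes that are not valid UTF-8, whereas Latin-1 accepts anything, so it only makes sense as the last resort.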

NOTE: The dataset used in the Mission itself is different from the one you downloaded. Dataquest has cleaned the data and used a subset of it for the Guided Project. So, if you plan to work on your own system and download the data from the original source, you will notice some differences and have to deal with them.


The first alternative worked; I didn't try the other two. It's good to know that the two datasets differ, because otherwise someone working locally would get different results from those in the Mission.

Thank you
