Guided Project: Extracting Ask HN and Show HN Posts

Screen Link: https://app.dataquest.io/m/356/guided-project%3A-exploring-hacker-news-posts/3/extracting-ask-hn-and-show-hn-posts

Hi,

I’m not sure why I keep getting a different number of posts than the answer. I downloaded the CSV from here: https://www.kaggle.com/hacker-news/hacker-news-posts

My Code:

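from csv import reader

# Read every row into a list; the with-block closes the file afterwards.
with open('hacker_news.csv') as opened_file:
    hn = list(reader(opened_file))

# Notebook preview: the header row plus the first four posts.
hn[:5]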

headers = hn[0]
hn = hn[1:]
print(headers)
print(hn[:5])

ask_posts = []
show_posts = []
other_posts = []

for post in hn:
    # Lowercase the title once so the prefix checks are case-insensitive.
    title = post[1].lower()
    if title.startswith("ask hn"):
        ask_posts.append(post)
    elif title.startswith("show hn"):
        show_posts.append(post)
    else:
        other_posts.append(post)

print("Number of asks posts",len(ask_posts))
print("Number of show posts", len(show_posts))
print("Number of other posts", len(other_posts))

What I expected to happen:
Number of asks posts 1744
Number of show posts 1162
Number of other posts 17194

What actually happened:
Number of asks posts 9139
Number of show posts 10158
Number of other posts 273822

Any help is much appreciated!

Hello @meganylin, welcome to the community!

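You most likely downloaded a different dataset from that Kaggle link. Your counts are far larger than the expected ones in every category, which suggests you have the full Kaggle file rather than the smaller one the guided project uses, so your code itself is probably fine. Check this post on the best way to download the exact dataset used in the guided projects.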
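
A quick way to confirm it is the data and not the logic is to add up the three counts, since together they equal the number of rows in the file. Here is a minimal sketch using only the numbers from your post (the dictionary and variable names are just for illustration):

```python
# Counts copied from the post above: expected answer vs. what the code printed.
expected = {"ask": 1744, "show": 1162, "other": 17194}
actual = {"ask": 9139, "show": 10158, "other": 273822}

# Every post lands in exactly one list, so each sum is that file's row count.
expected_total = sum(expected.values())
actual_total = sum(actual.values())

print("Rows in the project's file:", expected_total)  # 20100
print("Rows in your file:", actual_total)             # 293119
```

If `len(hn)` after removing the header is 293119 rather than 20100, the file is simply a different, much larger one.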

Happy learning!


Ohhhh, thank you Doyinsolami!
