Guided Project: Exploring Hacker News Posts mission 4

Hello everyone. On mission 4 of the Hacker News project, I’m calculating the average number of comments on ‘ask posts’ and ‘show posts’. I keep getting a higher average number of comments for show_posts than for ask_posts. I know that can’t be right, because the next mission suggests that ask_posts has the higher average. Please help me spot any mistakes I have made.

# Separating the different posts on Hacker News, using their lowercase titles
from csv import reader

# import the module with an alias
import csv as hn
opened_file = open('hacker_news.csv')
read_file = reader(opened_file)
hn = list(read_file)
hn_header = hn[0]

ask_posts = []
show_posts = []
other_posts = []

for row in hn:
    title = row[1]
    title = title.lower()
    if title.startswith('ask hn'):
        ask_posts.append(row)
    elif title.startswith('show hn'):
        show_posts.append(row)
    else:
        other_posts.append(row)

print(len(ask_posts))
print(len(show_posts))
print(len(other_posts))

# Calculating the average number of comments posted to 'ask posts'
total_ask_comments = 0
for comments in ask_posts:
    num_comments = row[4]
    num_comments = int(num_comments)
    total_ask_comments += num_comments
avg_ask_comments = total_ask_comments / len(ask_posts)

print(total_ask_comments)
print(avg_ask_comments)
print(len(ask_posts))

total_show_comments = 0
for comments in show_posts:
    num_comments = row[4]
    num_comments = int(num_comments)
    total_show_comments += num_comments
avg_show_comments = total_ask_comments / len(show_posts)

print(total_show_comments)
print(avg_show_comments)
print(len(show_posts))

Hi @aaronkg23. It took me a while, but I finally spotted it. In both of the loops that calculate the average number of comments, your iteration variable is called comments. However, when you get the number of comments, you’re using row[4]. Here row isn’t the loop variable; it is left over from the earlier "for row in hn" loop and still refers to the last row of the dataset, which is why no error is thrown. You’ll need to change these to comments[4] so that each loop reads the post it is currently on.
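To see why the original loops ran without an error, here is a minimal sketch that uses a few made-up rows instead of hacker_news.csv. The sample data and the name `stale_total` are assumptions for illustration only. After the first loop finishes, `row` still points at the last row, so `row[4]` quietly returns that one row's comment count on every iteration.

```python
# Made-up sample rows (id, title, url, num_points, num_comments); not real data.
hn = [
    ["1", "Ask HN: first question", "", "5", "10"],
    ["2", "Ask HN: second question", "", "3", "20"],
    ["3", "Some other story", "http://example.com", "8", "1"],
]

ask_posts = []
for row in hn:
    if row[1].lower().startswith("ask hn"):
        ask_posts.append(row)

# Buggy version: `row` is left over from the loop above (the last row of hn),
# so the same comment count is added once per ask post.
stale_total = 0
for comments in ask_posts:
    stale_total += int(row[4])

# Fixed version: index the loop variable itself.
total_ask_comments = 0
for comments in ask_posts:
    total_ask_comments += int(comments[4])

print(stale_total)         # 2
print(total_ask_comments)  # 30
```

Renaming the loop variable to something like `post` would also make this kind of mix-up easier to notice.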

Once you get that sorted, this part is going to give you trouble:

avg_show_comments = total_ask_comments / len(show_posts)

You’re dividing the number of comments for ask posts by the number of show posts.
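One way to avoid this kind of copy-paste slip is to compute the average in a small function, so the total and the count always come from the same list. This is a minimal sketch, assuming the comment count sits at index 4 as in the code above; `average_comments` is a hypothetical helper name and the rows are made up.

```python
def average_comments(posts, comments_index=4):
    """Return the mean number of comments for a list of rows (0.0 if empty)."""
    if not posts:
        return 0.0
    total = 0
    for post in posts:
        total += int(post[comments_index])
    return total / len(posts)


# Made-up rows for illustration only.
ask_posts = [
    ["1", "Ask HN: a", "", "5", "10"],
    ["2", "Ask HN: b", "", "3", "20"],
]
show_posts = [
    ["3", "Show HN: c", "", "8", "4"],
]

avg_ask_comments = average_comments(ask_posts)
avg_show_comments = average_comments(show_posts)

print(avg_ask_comments)   # 15.0
print(avg_show_comments)  # 4.0
```

Because each call only ever sees one list, there is no way to divide the ask total by the show count by accident.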

Hope that helps! That was a head-scratcher for sure!


Ah ha! Thank you. It worked.
I also spotted my other mistake, in the show_posts loop: it should be total_show_comments. That did it.
It’s funny how that happens sometimes.

Hi,

How do I download the dataset hacker_news.csv? The link in the project points to the Kaggle dataset, but the dataset we are supposed to work on is a reduced set. I am trying to work on this in my local Jupyter notebook.

You can find the data set here, but note that it has been reduced from almost 300,000 rows to approximately 20,000 rows by removing all submissions that did not receive any comments, and then randomly sampling from the remaining submissions.
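For anyone who only has the full Kaggle file, the reduction described in that quote could be approximated locally. This is only a sketch under assumptions: the function name `reduce_dataset`, the sample size, and the seed are made up, the comment count is assumed to be in column index 4, and the result will not match the course's exact rows because the sampling is random.

```python
import random


def reduce_dataset(rows, sample_size, comments_index=4, seed=0):
    """Drop rows with zero comments, then randomly sample from the rest."""
    with_comments = [r for r in rows if int(r[comments_index]) > 0]
    rng = random.Random(seed)  # fixed seed so the sample is repeatable
    k = min(sample_size, len(with_comments))
    return rng.sample(with_comments, k)


# Made-up rows standing in for the data rows (header excluded).
rows = [[str(i), f"title {i}", "", "1", str(i % 3)] for i in range(30)]

reduced = reduce_dataset(rows, sample_size=5)
print(len(reduced))
```

With the real file, the rows would come from `csv.reader`, with the header row set aside before filtering.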

Hi @satishguntur,

You can download the dataset from Guided Projects by clicking on the Download button.

Best,
Sahil

Thanks much, @Sahil. Thanks for pointing me to the KB article; I should have searched before posting. Will do from now on. Thanks again.
