# Guided Project_Exploring Hacker News Posts

Hi,

I am sharing my project, Guided Project: Exploring Hacker News Posts, for review; please take a look. I actually completed this project in Nov 2019 and am posting it now to get your feedback.
Guided Project_ Exploring Hacker News Posts.ipynb (183.6 KB)

Thanks for sharing your project, Nisrin! The code looks great and I like that you went beyond the project to try to find the answer to the timezone question. I've never seen the library you used and it inspired me to go check it out.

Your introduction lays out the questions that you're hoping to answer from the dataset. You'll want to revisit them in a conclusion at the end that brings everything together. Also, don't be afraid to use Markdown cells to explain what you're doing. When you prepare the project for presentation, you'll probably want to edit out the additional print statements in the loop (cell 11 with the counts and comments by hour dictionaries).

Keep up the good work!

Hi,

I am still learning, so I didn't know about pytz either; I found it through Google. I am not sure the conversion output is right. If you check it out, please give me feedback on this code and its output, because by my calculation it should be 12:30am IST and I am getting 1:49am IST.

Thanks once more. I have noted all the guidance provided, and I will apply it in my future projects.

Okay, so I spent some time playing around with this. The issue seems to come from the default date of 1900-01-01 that we get when we convert the hour in sorted_swap to a datetime object.

dt.datetime.strptime('15', '%H')
#output:
datetime.datetime(1900, 1, 1, 15, 0)

We don't see the date, though, because we go ahead and use .strftime('%H:%M') to get only the hours and minutes.

When we use the object with localize() in the est_time line, the UTC offset comes out with odd minutes (I tried other time zones too and saw the same problem).

print(timezone('US/Eastern').localize(dt.datetime.strptime('15', '%H')))
# output:
1900-01-01 15:00:00-04:56   # should say 15:00:00-05:00

Notice the difference if we run the same code but put in a specific date with the time 15:

print(timezone('US/Eastern').localize(dt.datetime(2000, 1, 1, 15)))
# output
2000-01-01 15:00:00-05:00

So to work around the issue, I created the datetime object with a different year by passing an explicit date and converting lst[1] to an integer (because that's what datetime() needs). I tried the year 1950 and it also worked, so the problem seems specific to very early dates like 1900; the -04:56 offset looks like a local mean time value rather than EST.
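A variation on the same idea, as a minimal sketch: keep the strptime() parse and swap in a later year with replace(), which avoids the int() conversion. The year 2000 is an arbitrary choice, and hour_str is a hypothetical name standing in for lst[1].

```python
import datetime as dt

hour_str = '15'  # hypothetical stand-in for lst[1]

# strptime with only '%H' fills in the default date 1900-01-01
parsed = dt.datetime.strptime(hour_str, '%H')

# Swap in an arbitrary later year; only the hour is used afterwards
hour = parsed.replace(year=2000)

print(parsed)                  # 1900-01-01 15:00:00
print(hour)                    # 2000-01-01 15:00:00
print(hour.strftime('%H:%M'))  # 15:00
```

The resulting hour object can then be passed to localize() in the same way as one built with dt.datetime(2000, 1, 1, 15).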

Here is the code I used, with the workaround of building the datetime from a fixed date. It gave the expected output (the time is an hour different, 01:30 IST vs 00:30 IST, because January 1 falls under Eastern Standard Time and not Eastern Daylight Time).

import datetime as dt
from pytz import timezone

# sorted_swap, outputsentence and outputsentencea are defined earlier in the notebook
for lst in sorted_swap[:5]:
    hour = dt.datetime(2000, 1, 1, int(lst[1]))  # any recent date is fine, only the hour is used
    comment_avg = lst[0]
    print(outputsentence.format(hr = hour.strftime('%H:%M'), avg = comment_avg))
    est_time = timezone('US/Eastern').localize(hour)  # attach the Eastern timezone
    ist_time = est_time.astimezone(timezone('Asia/Calcutta')).strftime('%H:%M')  # convert to IST
    print(outputsentencea.format(hr = ist_time, avg = comment_avg))

Output:

15:00 in Eastern Time in the US: 38.59 average comments per post
01:30 in IST: 38.59 average comments per post
02:00 in Eastern Time in the US: 23.81 average comments per post
12:30 in IST: 23.81 average comments per post
20:00 in Eastern Time in the US: 21.52 average comments per post
06:30 in IST: 21.52 average comments per post
16:00 in Eastern Time in the US: 16.80 average comments per post
02:30 in IST: 16.80 average comments per post
21:00 in Eastern Time in the US: 16.01 average comments per post
07:30 in IST: 16.01 average comments per post
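
The one-hour gap between the standard-time and daylight-time results can be double-checked with plain arithmetic. The sketch below uses only the standard library and fixed UTC offsets instead of pytz; the names EST, EDT, IST and to_ist are hypothetical helpers for illustration, and fixed offsets deliberately ignore real daylight saving rules.

```python
import datetime as dt

# Fixed UTC offsets, assumed for illustration (no DST rules applied)
EST = dt.timezone(dt.timedelta(hours=-5), 'EST')
EDT = dt.timezone(dt.timedelta(hours=-4), 'EDT')
IST = dt.timezone(dt.timedelta(hours=5, minutes=30), 'IST')

def to_ist(hour, source_tz):
    """Return the IST time as HH:MM for a given hour in source_tz."""
    # The date is arbitrary; only the hour and the offsets matter here
    local = dt.datetime(2000, 1, 1, hour, tzinfo=source_tz)
    return local.astimezone(IST).strftime('%H:%M')

print(to_ist(15, EST))  # 01:30
print(to_ist(15, EDT))  # 00:30
```

With the standard-time offset, 15:00 maps to 01:30 IST, matching the output above; with the daylight-time offset it maps to 00:30 IST, which is the 12:30am figure from the manual calculation.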