Guided Project: Exploring Hacker News Post <help>

Screen Link:
https://app.dataquest.io/m/356/guided-project%3A-exploring-hacker-news-posts/7/sorting-and-printing-values-from-a-list-of-lists

Hi everyone,
I am stuck on the step of parsing and formatting the hour in my project. The step is attached as a screenshot.

I don’t seem to get what I’m being asked to do in these steps:

  1. Loop through each average and each hour (in this order) in the first five lists of sorted_swap.
  2. Use the str.format() method to print the hour and average in the following format: 15:00: 38.59 average comments per post.
  • To format the hours, use the datetime.strptime() constructor to return a datetime object and then use the strftime() method to specify the format of the time.
  • To format the average, you can use {:.2f} to indicate that just two decimal places should be used.

Any help to nudge me in the right direction would be greatly appreciated. Thanks.

This is what I’ve been able to do so far.


My result included date and second, while I think the instruction is to only include Hour.


Hello,
Can I see the line of the code that contains hour_avg_string ? It seems unclear without this line of code…

This is the whole code cell:

sorted_swap = sorted(swap_avg_by_hour, reverse=True)
print("Top 5 Hours for Ask Posts Comments")

for row in sorted_swap[:5]:
    avg = row[0]
    hour = row[1]
    hour_format = "%H"
    hour = dt.datetime.strptime(hour, hour_format)

    hour_avg_string = "{h}: {a:.2f} average comments per posts".format(h=hour, a=avg)
    print(hour_avg_string)

Thank you.

You need to use strftime before printing the output. The strptime function parses the date string into a datetime object according to the format you give it, while strftime formats a datetime object back into a string. You can correct your code by adding:
hour = hour.strftime('%H:%M') before the line containing hour_avg_string, which will format ‘hour’ as “hour:minute”. I hope this works.
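To make the difference concrete, here is a minimal sketch of the parse-then-format round trip. The sample row is hypothetical and only mimics the [average, hour] layout used in this thread; it is not taken from the project data.

```python
import datetime as dt

# Hypothetical sample row in the [average, hour] layout (assumed, not real data).
row = [38.59, "15"]
avg, hour = row

# strptime: string -> datetime object (the date defaults to 1900-01-01).
parsed = dt.datetime.strptime(hour, "%H")

# strftime: datetime object -> string in the format we want to display.
formatted = parsed.strftime("%H:%M")

line = "{h}: {a:.2f} average comments per post".format(h=formatted, a=avg)
print(line)  # 15:00: 38.59 average comments per post
```

Printing the datetime object directly (without strftime) is what produces the full date and seconds in the output.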


Thank you @moumitasen. I was able to solve it yesterday. Your help is greatly appreciated :hugs:


hey @angeloluemmanuel

Please mark @moumitasen’s post as the solution if it helped you. This might come in handy for other students as well!

Thanks

Hello, I am getting an invalid syntax error when I try to print the hour and average in the format
'15:00: 38.59 average comments per post'. Could you please tell me what is wrong with my code below?
Basics (1).ipynb (11.6 KB)


Hello @vroomvroom,

You were getting the SyntaxError because the string in your print() statement was not formatted properly. After correcting the syntax error, I also fixed some other bugs in the code cell. Code cell In[31] should be:

for row in sorted_swap[:5]:
    comments = row[0]
    hours = row[1]
    hours = dt.datetime.strptime(hours,"%H")
    hr = hours.strftime("%H:%M")
    print("{hr}: {comments:.2f} average comments per post".format(hr=hr, comments=comments))

Thanks for the corrections. When I ran it with the changes, I got an error on the line hours = dt.datetime.strptime(hours,"%H").
The error stated TypeError: must be str, not int

@vroomvroom,

You are getting this error because the value in the hours variable is an integer, but the datetime.strptime() class method builds a datetime object from a string representing a date and time plus a matching format string.

In code cell In[7], the line hour = date_dt.hour reads the .hour attribute of date_dt, which returns an integer, and assigns that integer to the hour variable.

Therefore, one way to solve this is to convert the hours value to a string with hours = str(row[1]):

for row in sorted_swap[:5]:
    comments = row[0]
    hours = str(row[1])

Another way to solve this is to obtain the hour from the date_dt variable as a string by using the strftime() method instead of the .hour instance attribute:

for row in result_list:
    date_str = row[0]
    date_dt = dt.datetime.strptime(date_str, "%m/%d/%Y %H:%M")
    hour = date_dt.strftime("%H")
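Here is a small sketch comparing the two approaches side by side. The timestamp is a made-up sample in the same "%m/%d/%Y %H:%M" format, so treat it as an assumption rather than a row from the dataset.

```python
import datetime as dt

# Hypothetical sample timestamp (assumed, not from the dataset).
date_dt = dt.datetime.strptime("8/16/2016 9:55", "%m/%d/%Y %H:%M")

hour_int = date_dt.hour            # an int
hour_str = date_dt.strftime("%H")  # a zero-padded str

# Passing the int straight to strptime raises TypeError.
error_message = None
try:
    dt.datetime.strptime(hour_int, "%H")
except TypeError as err:
    error_message = str(err)
print("TypeError:", error_message)

# Both fixes give strptime a string, and both parse to the same time.
from_int = dt.datetime.strptime(str(hour_int), "%H")
from_str = dt.datetime.strptime(hour_str, "%H")
print(from_int.strftime("%H:%M"), from_str.strftime("%H:%M"))
```

The second approach has the side benefit that the hour keys are already zero-padded strings when they are stored.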

I hope this helps.


Yes, it did, thank you!

Hello,

I’m having issues with this step in converting the time to %H format. I’ve looked at several solutions in the community thread and have even followed solutions above, however when I convert the hour to a string format first before passing to the dt.datetime.strptime method, it gives me the error below. If I don’t convert the time value to a string, it gives me an error that it’s a tuple.

Note: I’m actually doing analysis on the Show HN lists of lists because on step 4 when we calculated the average posts, the Show HN was higher and decided to go that route.

I’ve attached my python notebook for reference.

Thanks a bunch!

Exploring Hacker News_cc.py (5.4 KB)

for row in sorted_swap[:5]:
    avg_comment = row[0]
    hr = str(row[1])

    date_dt = dt.datetime.strptime(str(row[1]), "%H")
    date_hr = date_dt.strftime("%H")

    template = "{hr} : {comment:.2f} average comments per post"

    post_time = template.format(hr=date_hr, comment=avg_comment)

ERROR MSG:

ValueError                                Traceback (most recent call last)
in ()
      3     hr = str(row[1])
      4
----> 5     date_dt = dt.datetime.strptime(str(row[1]), "%H")
      6     date_hr = date_dt.strftime("%H")
      7

/usr/lib/python3.4/_strptime.py in _strptime_datetime(cls, data_string, format)
    498     """Return a class cls instance based on the input string and the
    499     format string."""
--> 500     tt, fraction = _strptime(data_string, format)
    501     tzname, gmtoff = tt[-2:]
    502     args = tt[:6] + (fraction,)

/usr/lib/python3.4/_strptime.py in _strptime(data_string, format)
    335     if not found:
    336         raise ValueError("time data %r does not match format %r" %
--> 337                          (data_string, format))
    338     if len(data_string) != found.end():
    339         raise ValueError("unconverted data remains: %s" %

ValueError: time data "('09', 9.7)" does not match format '%H'

Hello @ccontan,

The mistake is in code cell In[59], the code block there should be:

for row in avg_by_hour:
    hr = row[0]
    avg = row[1]
    
    swap_avg_by_hour.append([avg,hr])

You assigned the wrong values to the hr and avg variables, which is why you get a ValueError when strptime() tries to parse the value as an hour.
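As a rough illustration of the swap step, the sketch below uses made-up values in the [hour, average] layout. The numbers are assumptions for demonstration only.

```python
# Hypothetical sample data in the [hour, average] layout (assumed values).
avg_by_hour = [["09", 9.7], ["15", 38.59], ["02", 23.81]]

swap_avg_by_hour = []
for row in avg_by_hour:
    hr = row[0]   # hour string comes first in avg_by_hour
    avg = row[1]  # average comes second
    swap_avg_by_hour.append([avg, hr])

# Sorting now orders rows by the average, since it is the first element.
sorted_swap = sorted(swap_avg_by_hour, reverse=True)
print(sorted_swap[0])  # [38.59, '15']
```

With this layout, row[1] in sorted_swap is always a plain hour string, which is what strptime(..., "%H") expects.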

Let me know if this resolves the problem.

@ccontan, I just noticed that you concluded that Show HN posts garner more comments.

In In[28], you made a mistake when calculating the average number of comments for the Show HN posts. The correct line of code should be:

avg_show_comments = total_show_comments/len(show_posts)

After this correction, you should notice that Ask HN posts receive more comments.
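For reference, a minimal sketch of that calculation follows. The sample rows, and the assumption that the comment count is stored as a string at index 4, are illustrative and not taken from your notebook.

```python
# Hypothetical rows; the comment count is assumed to sit at index 4 as a string.
show_posts = [
    ["1", "Show HN: A", "", "10", "4", "a", "8/16/2016 9:55"],
    ["2", "Show HN: B", "", "3", "6", "b", "8/16/2016 15:10"],
]

total_show_comments = 0
for post in show_posts:
    total_show_comments += int(post[4])

# Divide by the number of Show HN posts, not by any other list's length.
avg_show_comments = total_show_comments / len(show_posts)
print(avg_show_comments)  # 5.0
```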

Lastly, it’s advisable to open a new post when you have an entirely different question to ask.

Happy learning! :blush:

Thank you! The correction to cell In[59] did solve the errors. I also noticed my error in In[28] when I did my code review.

Thanks a bunch!
