How can I influence the sequence of a for loop (and how should I know?)

Hello,

Doing guided project ‘Exploring Hacker News Posts’ and have a question.

Screen Link:
https://app.dataquest.io/m/356/guided-project%3A-exploring-hacker-news-posts/6/calculating-the-average-number-of-comments-for-ask-hn-posts-by-hour

My Code:

# Create a table that contains the hours of the day and the average number of comments per post
avg_by_hour = []
for hour in counts_by_hour:
    num_posts = counts_by_hour[hour]
    num_comments = comments_by_hour[hour]
    average = num_comments / num_posts
    avg_by_hour.append([hour, average])

# Print the result
output = "For hour {} the average number of comments per post is {:.1f}"
for row in avg_by_hour:
    print(output.format(row[0], row[1]))

That works as such, as I get this result:

For hour 9 the average number of comments per post is 5.6
For hour 13 the average number of comments per post is 14.7
For hour 10 the average number of comments per post is 13.4
For hour 14 the average number of comments per post is 13.2
For hour 16 the average number of comments per post is 16.8
For hour 23 the average number of comments per post is 8.0
For hour 12 the average number of comments per post is 9.4
For hour 17 the average number of comments per post is 11.5
For hour 15 the average number of comments per post is 38.6
For hour 21 the average number of comments per post is 16.0
For hour 20 the average number of comments per post is 21.5
For hour 2 the average number of comments per post is 23.8
For hour 18 the average number of comments per post is 13.2
For hour 3 the average number of comments per post is 7.8
For hour 5 the average number of comments per post is 10.1
For hour 19 the average number of comments per post is 10.8
For hour 1 the average number of comments per post is 11.4
For hour 22 the average number of comments per post is 6.7
For hour 8 the average number of comments per post is 10.2
For hour 4 the average number of comments per post is 7.2
For hour 0 the average number of comments per post is 8.1
For hour 6 the average number of comments per post is 9.0
For hour 7 the average number of comments per post is 7.9
For hour 11 the average number of comments per post is 11.1

Now there are two things that I would like to change, though:

  • The sequence seems random. I would rather get this in the sequence hour 0, hour 1, hour 2, etc. I don’t know how to control the order of a for loop, though. (Or, if that’s not the right approach, how to do it instead.)
  • (Less important, but still) For readability, I would like to pad with leading zeroes, e.g. ‘hour 08’, ‘hour 09’, followed by ‘hour 10’.

I wasn’t sure how to do either of these, and could not easily find out.

So my question is two-fold actually:

  1. Do you know solutions for these two things?

  2. More generally, what do you see as an efficient way of figuring out such things?
    Just type what you are trying to achieve into a Google search bar?
    Start searching in the official documentation at https://docs.python.org/3/?
    …?


Hey Jasper.

The original output looked random because the list was built in whatever order the counts_by_hour dictionary happened to be in when you ran the cells. Since avg_by_hour is a list, you can sort it in place with avg_by_hour.sort() after the first loop has run. By default it sorts in ascending order by the first element of each pair (the hour in this case). Then when you run the second loop, it prints in the desired order. [Here’s more info about sorting in Python if you’re interested.]
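To make that concrete, here is a minimal sketch with made-up sample data (the dictionary values below are hypothetical, not the real numbers from the project). It shows two options: sorting the finished list, or looping over the dictionary keys in sorted order in the first place.

```python
# Hypothetical sample data; the real dictionaries come from the guided project.
counts_by_hour = {9: 45, 13: 85, 2: 58, 0: 55}
comments_by_hour = {9: 251, 13: 1253, 2: 1381, 0: 447}

# Option 1: build the list in dictionary order, then sort it in place.
avg_by_hour = []
for hour in counts_by_hour:
    avg_by_hour.append([hour, comments_by_hour[hour] / counts_by_hour[hour]])
avg_by_hour.sort()  # nested lists compare element by element, so the hour decides

# Option 2: control the loop order directly by iterating over the sorted keys.
hours_in_order = []
for hour in sorted(counts_by_hour):
    hours_in_order.append(hour)

print([row[0] for row in avg_by_hour])
print(hours_in_order)
```

One caveat: if the keys are strings without zero padding, the default sort is alphabetical, so "10" would come before "2". In that case sorted(counts_by_hour, key=int) should give the numeric order.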

As for removing the padding, I wasn’t sure how to do it with .format() because hour is already a string, but one thing that works is to convert hour to an integer when you’re appending to the avg_by_hour list.

avg_by_hour.append([int(hour), average])

Then when it formats the output string, by default it will display the hour without any padding.
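A tiny illustration of the difference, assuming the hour key is a zero-padded string like "09" (a made-up value for this sketch):

```python
hour = "09"  # hypothetical zero-padded string key

as_string = "For hour {}".format(hour)        # the string is inserted unchanged
as_integer = "For hour {}".format(int(hour))  # int("09") is 9, so the zero is gone

print(as_string)
print(as_integer)
```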

As for your general question, I tend to type what I want to do in the search bar. Make sure to include “python” in your search. For the string formatting question, I typed “python format string padding”. You get better at searching and knowing which resources are helpful over time. I think searching the documentation is more useful when you are looking up the uses of a specific method. (The documentation often pops up somewhere in the search results when you use “python” as part of the query anyway.)

I hope that helps.

Hi @april.g,

Thank you for the reply - that helps.

For the sorting, got it to work with .sort() indeed.

For the padding, the hours were actually integers, and for printing I wanted to pad them with a zero so all hours would have the same length. I found the solution for that in the meantime (while applying your search suggestion). For the record, let me paste it here:

# Print the result
output = "For hour {:02d} the average number of comments per post is {:.1f}"
for row in avg_by_hour:
    print(output.format(row[0], row[1]))

So adding {:02d} did the trick!
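For anyone reading later, a short breakdown of that format spec as I understand it: 0 is the fill character, 2 is the minimum width, and d means "format as a decimal integer" (so it expects an int, not a string). A minimal sketch with made-up values, including the equivalent f-string form:

```python
hour, average = 8, 10.2  # hypothetical values for illustration

# str.format version, as used above
line = "For hour {:02d} the average number of comments per post is {:.1f}".format(hour, average)

# The same format specs also work inside an f-string
same = f"For hour {hour:02d} the average number of comments per post is {average:.1f}"

print(line)
print(same)
```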


Ah ok I misunderstood, I thought you didn’t want them padded (they were padded when I printed them, so my hour keys were probably strings already and yours weren’t). Great job finding the solution, and thanks for sharing it in case someone has the same question! :star_struck:

For padding you can also use str.zfill. See the last part of this post for more details on it.
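A quick sketch of how that might look (the hour values are made up). Since zfill is a string method, an integer hour has to go through str() first:

```python
hours = [8, 9, 10]  # hypothetical integer hours

# zfill pads on the left with zeroes up to the given width
padded_hours = [str(hour).zfill(2) for hour in hours]

for padded in padded_hours:
    print("For hour {}".format(padded))
```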
