Why does the range look like this?

Screen Link:

https://app.dataquest.io/m/523/pandas-visualizations-and-grid-charts/8/how-traffic-slowness-change

In the code below, I do not understand a few things:

  1. Why is the range’s upper limit 135?
  2. Why does the range increment by 27?
  3. What does each_day_traffic = traffic[i:i+27] do? This is very confusing.

While I am all for figuring some of this out on my own, this is particularly unintuitive and could do with some explanation within the mission.

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']
traffic_per_day = {}
for i, day in zip(range(0, 135, 27), days):
    each_day_traffic = traffic[i:i+27]
    traffic_per_day[day] = each_day_traffic

4 Likes

135 is the number of records (rows) in traffic:

traffic.shape
(135, 18)

As per Screen 2:

The data was registered from 7:00 to 20:00 every 30 minutes. The Hour (Coded) column has values from 1 to 27.

So, for each day, Monday to Friday, we have 27 records.

In the for loop we are trying to segregate the traffic data for each weekday. The slice traffic[i:i+27] takes rows i up to, but not including, i+27.
For the first iteration, i=0, day=Monday and each_day_traffic will have the first 27 records, i.e. traffic[0:27].
For the second iteration, i=27, day=Tuesday and each_day_traffic will have the next 27 records, i.e. traffic[27:54].
For the third iteration, i=54, day=Wednesday and each_day_traffic will have the next 27 records, i.e. traffic[54:81], and so on.
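
To see the start positions and slice boundaries concretely, here is a minimal sketch. It assumes a plain Python list of 135 row positions (named rows here, a stand-in that is not part of the mission) in place of the traffic DataFrame; slicing a DataFrame with traffic[i:i+27] selects the same row positions.

```python
# Stand-in for the 135-row traffic DataFrame: one entry per row position.
# (Assumption: "rows" is a hypothetical substitute, not the real dataset.)
rows = list(range(135))
days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday']

# range(0, 135, 27) yields the first row position of each day.
starts = list(range(0, 135, 27))
print(starts)

rows_per_day = {}
for i, day in zip(starts, days):
    # The end position i+27 is excluded, so each slice holds exactly 27 rows.
    rows_per_day[day] = rows[i:i+27]

# Show the first row, last row, and row count for each day.
for day, chunk in rows_per_day.items():
    print(day, chunk[0], chunk[-1], len(chunk))
```

The upper limit of 135 is itself excluded from the range, so the last start position is 108 and Friday covers rows 108 through 134.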

Hope it’s clear now.

18 Likes

Thank you, this was very helpful

1 Like

Thank you so much, this has cleared the confusion.

1 Like

BIG time shoutout to you @dash.debasmita… thank you for breaking this down. The mission did not explain this at all and I refused to continue on without understanding :grinning:

2 Likes

That makes sense, but if that is the case, why does every day's x-axis start at 0 in the plotting code below? Wouldn't Tuesday run from 27 to 54? It is better this way, but I am trying to understand: if the for loop above splits the days by their row ranges, why don't the row positions from one day carry over into the plot for the next?

Said another way: when you plot traffic_per_day[day], the x-axis covers the 30-minute slots, 0 through 27, and the y value is the traffic percentage. Is that because each pass of the for loop just counts its inputs, so it is always 27 points regardless of which rows they came from, and Tuesday's rows 27-54 are treated as an absolute count rather than by their row position? If so, can you elaborate on whether this always happens in for loops, only sometimes, or needs specific syntax? Thanks!

traffic_per_day[day].plot.line(x='Hour (Coded)',
                               y='Slowness in traffic (%)')
plt.title(day)
plt.ylim([0, 25])
plt.show()
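
One likely explanation, offered as a sketch rather than a definitive answer: passing x='Hour (Coded)' to plot.line makes pandas take the x values from that column, not from the row positions. Since the Hour (Coded) column runs from 1 to 27 for each day (per Screen 2), every day's plot covers the same x range. The example below assumes hypothetical stand-in rows built as plain dictionaries, with a made-up 'row' key recording the row position, instead of the real traffic data.

```python
# Hypothetical stand-in for traffic: 135 rows, where 'row' is the row position
# and 'Hour (Coded)' restarts at 1 for each block of 27 rows (assumed layout).
rows = [{'row': pos, 'Hour (Coded)': pos % 27 + 1} for pos in range(135)]

# Same slice the loop uses for Tuesday.
tuesday = rows[27:54]

tuesday_positions = [r['row'] for r in tuesday]
tuesday_hours = [r['Hour (Coded)'] for r in tuesday]

# Row positions keep their place in the full dataset...
print(tuesday_positions[0], tuesday_positions[-1])
# ...but the column used for the x-axis restarts for each day.
print(tuesday_hours[0], tuesday_hours[-1])
```

So the for loop does not reset anything. The slice keeps Tuesday's rows from positions 27 to 53, and the x-axis follows whichever column is named in x.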