Time distribution: a question about the I-94 highway

Hi !

I did this exercise a while ago and was very happy with it, until I decided to upload it to GitHub and looked at it the way a potential client would. :face_with_monocle:

The situation is as follows:

Once I import what I need to work with, I set out to separate day and night.

traffic['date_time'] = pd.to_datetime(traffic['date_time']).copy() # text to datetime
traffic['date_time']

0       2012-10-02 09:00:00
1       2012-10-02 10:00:00
2       2012-10-02 11:00:00
3       2012-10-02 12:00:00
4       2012-10-02 13:00:00
                ...        
48199   2018-09-30 19:00:00
48200   2018-09-30 20:00:00
48201   2018-09-30 21:00:00
48202   2018-09-30 22:00:00
48203   2018-09-30 23:00:00
Name: date_time, Length: 48204, dtype: datetime64[ns] <- Ok

These are the hours in a day

traffic['date_time'].dt.hour.unique()
array([ 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,  0,  1,
        2,  3,  4,  5,  6,  8,  7])

I plot the histogram of volume by hour.

plt.hist(horas_alles)
plt.title('Raw Traffic Volume (whole data set)')
plt.xticks(ticks=[ 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
                  19, 20, 21, 22, 23, 0, 1, 2, 3, 4, 5, 6, 8, 7],
           labels=[ '09:00','10:00','11:00','12:00','13:00','14:00',
                   '15:00','16:00','17:00','18:00','19:00','20:00',
                   '21:00','22:00','23:00','00:00','01:00','02:00',
                   '03:00','04:00','05:00','06:00','08:00','07:00'],
           rotation=85)
plt.show()

[Image: histogram titled 'Raw Traffic Volume (whole data set)']

The hours seem to coincide with the volume of vehicles.

But when I do the same after separating the day hours from the night hours, the night graph puzzles me. Apart from the fact that I cannot tell at what hour it starts, the volume reaches up to 6000, which seems quite strange to me.

night_bool = (traffic['date_time'].dt.hour >=19) | (traffic['date_time'].dt.hour <=7)

horas_nighttime = traffic.loc[night_bool,'date_time'].dt.hour
horas_nighttime.unique()

array([19, 20, 21, 22, 23,  0,  1,  2,  3,  4,  5,  6,  7])

day_bool = (traffic['date_time'].dt.hour >=7) & (traffic['date_time'].dt.hour <=19)

horas_daytime = traffic.loc[day_bool,'date_time'].dt.hour
horas_daytime.unique()

array([ 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,  8,  7])



plt.figure(figsize = (15,4))

plt.subplot(1,2,1)

plt.hist(horas_daytime)
plt.xlabel('Day hours from 7 AM to 7 PM ')
plt.ylabel('Cars per hour')
plt.title('Distribution Traffic Volume: Day')
plt.xticks(ticks=[ 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,  8,  7],
           labels=[ '09:00','10:00','11:00','12:00','13:00','14:00',
                   '15:00','16:00','17:00','18:00','19:00','08:00','07:00'
                  ],
           rotation=85)
plt.subplot(1,2,2)

plt.hist(horas_nighttime)
plt.xlabel('Night hours from 7 PM to 7 AM')
plt.ylabel('Cars per hour')
plt.title('Distribution Traffic Volume: Night')
plt.xticks(ticks=[ 19, 20, 21, 22, 23,  0,  1,  2,  3,  4,  5,  6,  7 ],
           labels=['19:00', '20:00', '21:00', '22:00', '23:00', '00:00','01:00','02:00','03:00','04:00',
                   '05:00','06:00','07:00' ],
           rotation=85)
plt.show()

I don’t think this is correct, but I can’t find a way to adjust the graph with some kind of “offset” so that the night hours start, for example, where the day hours end.

I would like to see if someone can give a push to my car in the middle of the highway. :grinning_face_with_smiling_eyes:

Thanks again.

A&E.

Hi @Edelberth,

This was not part of your question but please note that you are plotting the frequency of data points for each hour, not the traffic volume. horas_alles, horas_daytime, etc. do not contain traffic_volume information from what I can tell.
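To illustrate the difference, here is a minimal sketch on a tiny made-up frame. It assumes the data set has a `traffic_volume` column next to `date_time`; the numbers are invented for the example and are not taken from the real data.

```python
import pandas as pd

# Tiny synthetic stand-in for the real data set (values are made up).
traffic = pd.DataFrame({
    "date_time": pd.to_datetime([
        "2012-10-02 09:00:00", "2012-10-02 10:00:00",
        "2012-10-03 09:00:00", "2012-10-03 10:00:00",
    ]),
    "traffic_volume": [5000, 4000, 5200, 4400],  # assumed column name
})

hour = traffic["date_time"].dt.hour

# What a histogram of the hours measures: how many rows exist for each hour.
counts_per_hour = hour.value_counts().sort_index()

# What "traffic volume by hour" would need: an aggregate of traffic_volume.
mean_volume_per_hour = traffic.groupby(hour)["traffic_volume"].mean()

print(counts_per_hour)
print(mean_volume_per_hour)
```

The first series only says how often each hour appears in the table; the second one is the kind of number a 'Cars per hour' axis label would describe.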

If you did want to plot the # of data points per hour, I believe it would be recommended to use plot.bar() instead of hist() since your x axis, hours, is made up of discrete values.
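A rough sketch of the bar version, using matplotlib directly and a handful of made-up hour values. One more thing worth checking: `plt.hist` uses 10 bins by default, so with 13 or 24 distinct hours some bins collect two or three hours at once. That may be part of why a bar climbs to about 6000, though I have not verified it against your data.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up hour values standing in for something like horas_daytime.
hours = pd.Series([9, 9, 10, 11, 11, 11])

# One count per distinct hour, ordered by hour.
counts = hours.value_counts().sort_index()

fig, ax = plt.subplots()
ax.bar(counts.index, counts.values)   # exactly one bar per hour
ax.set_xticks(counts.index)
ax.set_xlabel("Hour of day")
ax.set_ylabel("Number of rows")
```

Call `plt.show()` afterwards to display the figure.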

As for why your second plot doesn’t start at 7 PM: I am fairly sure it comes down to how the values in horas_nighttime are ordered on the x-axis.

The default numeric ordering won’t work here. It puts hours 0 to 7 on the left and 19 to 23 on the right, which makes it look as if your data has a gap from 7 to 19.

Unfortunately, I am not expert enough to tell you how to re-sort it with this key order: [20, 21, 22, 23, 0, 1, 2, 3, 4, 5, 6, 7]. This page may or may not help; at least it illustrates that it’s not straightforward!
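One possible approach, offered as a sketch rather than a definitive answer: count the rows per hour and then `reindex` the counts into the order you want. The hour values below are made up, and `night_order` is a hypothetical list that starts at 19 to match the night mask in the question.

```python
import pandas as pd

# Made-up night hours standing in for horas_nighttime.
hours = pd.Series([0, 7, 19, 19, 20, 23, 23, 23, 3])

# Desired display order: evening first, then the early morning hours.
night_order = [19, 20, 21, 22, 23, 0, 1, 2, 3, 4, 5, 6, 7]

# Count rows per hour, then put the counts into the custom order.
# Hours that never occur get a count of 0 instead of NaN.
counts = hours.value_counts().reindex(night_order, fill_value=0)

# Tick labels in the same order as the counts.
labels = [f"{h:02d}:00" for h in counts.index]

print(counts)
print(labels)
```

To draw it, the bars can be placed at positions `range(len(counts))` with `labels` as the tick labels, so the x-axis follows the custom order instead of the numeric one.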

Hope some of this is of use!

Best of luck,
kwu

Thanks anyway; any idea I hadn’t thought of myself is always welcome.

I’ll keep in mind what you’ve told me.

Thank you very much.

A&E

cheers, @Edelberth :grinning:

people here are super-helpful, i expect you’ll get more insight into your issue!