Guided Project (Heavy Traffic Indicators): Separating daytime from nighttime

Screen Link:
https://app.dataquest.io/c/95/m/524/guided-project%3A-finding-heavy-traffic-indicators-on-i-94/3/traffic-volume-day-vs-night

My Code:

day = traffic['date_time'][(traffic['date_time'].dt.hour >= 7) & (traffic['date_time'].dt.hour < 19)]
night = traffic['date_time'][(traffic['date_time'].dt.hour >= 19) & (traffic['date_time'].dt.hour < 7)]

Probably quite a simple question here – above is my original code for attempting to filter within the ‘date_time’ column to the correct rows for either daytime or nighttime, based on the criteria we were given. I seemed to get the correct number of rows for the daytime data…but not the nighttime data.

I’ve popped in the solution code below. Just to check my understanding, why is it necessary to copy the full dataframe BEFORE using series.dt.hour to filter to the correct rows in the dataset?

day = traffic.copy()[(traffic['date_time'].dt.hour >= 7) & (traffic['date_time'].dt.hour < 19)]
night = traffic.copy()[(traffic['date_time'].dt.hour >= 19) | (traffic['date_time'].dt.hour < 7)]

I would like to know the same thing… why is it necessary to make a copy of the dataframe before using series.dt.hour?

I used loc and it does the trick.

traffic_day = traffic.loc[(traffic['date_time'].dt.hour >= 7) & (traffic['date_time'].dt.hour < 19)]

traffic_night = traffic.loc[(traffic['date_time'].dt.hour >= 19) | (traffic['date_time'].dt.hour < 7)]


Do you know the logic of why we need to use .loc?

I think the issue is that you are using an ampersand (&) in the second line of code instead of the vertical bar (|). No hour can be both 19 or later AND earlier than 7, so the & condition matches no rows. The hour should be either 19 or later OR earlier than 7. Hope that makes sense!
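To make the & versus | point concrete, here is a minimal sketch on a toy DataFrame. The data is made up (one row per hour of a single day) and only stands in for the real `traffic` dataset; the column names are borrowed from the thread.

```python
import pandas as pd

# Toy stand-in for the traffic data: one row per hour of one day (made-up values).
traffic = pd.DataFrame({
    "date_time": pd.to_datetime([f"2016-01-01 {h:02d}:00:00" for h in range(24)]),
    "traffic_volume": range(24),
})
hour = traffic["date_time"].dt.hour

# AND: a row must satisfy both conditions at once, which no hour can do.
night_and = traffic[(hour >= 19) & (hour < 7)]

# OR: a row only needs to satisfy one of the two conditions.
night_or = traffic[(hour >= 19) | (hour < 7)]

print(len(night_and))  # 0
print(len(night_or))   # 12 (hours 19-23 and 0-6)
```

The daytime filter works with & because a single hour can be both >= 7 and < 19; the nighttime range wraps around midnight, so it needs |.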

loc is used because we are filtering rows.
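On the .copy() and .loc questions: as far as I can tell, neither changes which rows are selected. The sketch below uses a made-up toy DataFrame (not the real dataset) to compare plain boolean indexing, .loc, and the solution's .copy() pattern. My assumption, which the course does not confirm, is that the .copy() is there to avoid a SettingWithCopyWarning if columns are added to `day` or `night` later.

```python
import pandas as pd

# Toy stand-in for the traffic data: one row per hour of one day (made-up values).
traffic = pd.DataFrame({
    "date_time": pd.to_datetime([f"2016-01-01 {h:02d}:00:00" for h in range(24)]),
    "traffic_volume": range(24),
})
is_day = (traffic["date_time"].dt.hour >= 7) & (traffic["date_time"].dt.hour < 19)

day_plain = traffic[is_day]        # plain boolean indexing
day_loc = traffic.loc[is_day]      # same mask through .loc
day_copy = traffic.copy()[is_day]  # pattern used in the solution code

print(day_plain.equals(day_loc))   # True
print(day_plain.equals(day_copy))  # True

# Indexing the column first keeps only that column (a Series, not the full DataFrame).
only_dates = traffic["date_time"][is_day]
print(type(only_dates).__name__)   # Series
```

Note that the original attempt filtered `traffic['date_time']`, so it returned just the date_time column rather than every column of the matching rows.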

Note: This reply is somewhat academic; a working approach using DataFrame.loc has already been posted.

I ran into the same issue and want to contribute the following:

  • Filtering without .loc works for daytime, so it should work for nighttime too.

  • If you generate two DataFrames, one with hour >= 0 & hour < 7 and a second one with hour >= 19, and then concatenate them, you get the "night" DataFrame without .loc.

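Conclusion: the >= 19 & < 7 condition returns no rows because no single hour satisfies both parts at once (the range passes through 0), so the two parts have to be handled separately.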

Example code without .loc

nighttime_1 = Interstate_Traffic[(Interstate_Traffic['date_time'].dt.hour >= 0) & (Interstate_Traffic['date_time'].dt.hour < 7)]
nighttime_2 = Interstate_Traffic[Interstate_Traffic['date_time'].dt.hour >= 19]
night = pd.concat([nighttime_1,nighttime_2])
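For comparison, a small sketch that checks the two-piece concat approach against a single | mask. The data is made up (48 hourly rows over two days) and stands in for the real dataset. One difference I would expect on multi-day data is row order: concat stacks all early-morning rows before all evening rows, so the index needs sorting to match.

```python
import pandas as pd

# Toy stand-in for the traffic data: 48 hourly rows across two days (made-up values).
start = pd.Timestamp("2016-01-01")
traffic = pd.DataFrame({
    "date_time": [start + pd.Timedelta(hours=h) for h in range(48)],
    "traffic_volume": range(48),
})
hour = traffic["date_time"].dt.hour

# Two-piece approach: early-morning rows and evening rows, then concatenate.
early = traffic[hour < 7]
late = traffic[hour >= 19]
night_concat = pd.concat([early, late])

# Single-mask approach with OR.
night_mask = traffic[(hour >= 19) | (hour < 7)]

print(night_concat.equals(night_mask))               # False: same rows, different order
print(night_concat.sort_index().equals(night_mask))  # True
```

Both routes select the same rows; the single mask just keeps them in their original order without an extra step.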
