Guided Project #4: Correlations and Dependencies of I-94 Traffic

Hello!
Here is my fourth guided project, an analysis of traffic on the I-94 Interstate highway.
The goal of this work was for me to learn some correlation methods and visualization tools.
And, of course, how to interpret the results of the analysis.

I will appreciate any feedback.
Thanks in advance.

Traffic_Indicators.ipynb (708.1 KB)



Hi @sv.dolgushina! Thanks for sharing your project with us and being a Community Champion last week:) I liked how thoroughly you cleaned the data and how clearly you explained why the data are inaccurate. Also, good job on the plots: they are nice and contain all the information (title, labels, etc.) necessary to understand them. Finally, you’ve carried out a nice analysis of holidays, that’s really cool!

Some feedback from my side:

  • You can talk more about the purpose of the project, like saying that you want to find correlations between traffic volume and weather conditions
  • You have plenty of backslashes (\) throughout the project. Why is this happening?
  • You have some typos; please correct them
  • The plot Intensity of Traffic Volume per Hour could be extended horizontally to make the values easier to distinguish. The title, which could also be made a bit smaller, would then fit as well
  • “The chart ‘Traffic Volume at Night’ can be estimated as left skewed with an additional peak in the middle.” - the distribution is actually right-skewed
  • Make sure to write plt.show() in [191] so that matplotlib does not print technical information. Do the same for the next plots
  • Some of your .describe() code cells are not very informative. For example, what should we learn from [200]? Do we really need these cells?
  • “The average traffic of weekdays more than weekend trafic on 48 %” and the sentences that follow could be rewritten in more natural English, for example: “The average traffic volume on weekdays is higher than on weekends by 48%.”
  • After [206] you have some trouble with the links
  • “(snowfall in April-October is a rather trivial phenomenon in this area).” - do you mean an uncommon phenomenon?
  • The plot Traffic Volume by Detailed Weather Conditions has a lot of y-axis labels and a lot of colors. A better idea here would be to make the bars that are below the threshold gray and the rest of them some other single color to attract the reader’s attention to this important information
  • You could write a more in-depth conclusion by talking about exact numbers like the fact that traffic is more intense on weekdays, especially around 7:00 AM and 4:00 PM, and suggest what causes these peaks (people coming to and from work?)
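
To illustrate the plotting points above (a wider figure, gray bars below the threshold with a single highlight color above it, and a closing plt.show()), here is a minimal sketch. The weather labels, the numbers, the threshold of 5000, and the helper name bar_colors are all made up for the example and are not taken from your notebook:

```python
import matplotlib.pyplot as plt

# Hypothetical average traffic volume per weather description (made-up numbers)
avg_volume = {
    "sky is clear": 4920,
    "light rain": 4860,
    "shower snow": 5660,
    "light rain and snow": 5580,
    "fog": 4370,
}
threshold = 5000  # hypothetical cut-off for "heavy" traffic


def bar_colors(values, threshold, highlight="tab:blue", muted="lightgray"):
    """Return one color per bar: highlight above the threshold, gray otherwise."""
    return [highlight if v > threshold else muted for v in values]


labels = list(avg_volume)
values = list(avg_volume.values())
colors = bar_colors(values, threshold)

# A wide figure leaves room for the labels and the title
fig, ax = plt.subplots(figsize=(10, 4))
ax.barh(labels, values, color=colors)
ax.axvline(threshold, color="black", linestyle="--", linewidth=1)
ax.set_title("Traffic Volume by Detailed Weather Conditions")
ax.set_xlabel("Average traffic volume")

# Ending the cell with plt.show() keeps the notebook from printing
# the text representation of the last matplotlib object
plt.show()
```

With only two colors, the eye goes straight to the few bars that cross the line, and the rest stay visible as context.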

I hope I was helpful! Happy coding and good luck with your next project :grinning:

Hi, thanks for your feedback.
I can give some explanations:

  • I admit that my English is far from perfect; I’m working on it.
  • The purpose of the project was to demonstrate the current level of my coding skills and, on the other hand, to gain some confidence, since even with a very limited arsenal of tools and methods I’m able to make a simple analysis of a small dataset, understand what I’m doing, and interpret the result. I don’t believe in the trendy motto of online courses, “from junior to senior in 6 months”. And with no background in programming, I started from “Zero”, not from “Junior”. Well, from 0.001, since I knew how “Select * from” works.
  • Correlation could not be the purpose of the project. Correlation analysis is a more complicated process than using the .corr() method, especially when it concerns weather conditions. Trust me, it was in my graduation thesis)
  • Backslashes: I hate scrolling through endless lines of code, so I use them to see the code in the cell as a whole. It works in Jupyter Notebook, but doesn’t work in nbviewer. Let it be my style, okay?
  • When I complete the visualization course, I will know more about chart formatting, okay?
  • I used colours in the charts to add some colour to the project, which became rather boring at this stage: no data, no dependencies.
  • Conclusion: we were warned that the data was gathered in ONE direction: “the station only records westbound traffic (cars moving from east to west).” If people are driving to work, then at the same time an equal number of cars must be moving the other way, or the evening peak could not happen. This leads us to the idea that both cities at the ends of the highway have an equal number of workplaces. Then why on earth can’t they work where they live and not load the road? And if they drive to work, they must stay inside the city, right? But we see that the traffic remains intense from 6:00 to 18:00. Again, on ONE half of the road.
    To explain why all these vehicles stay on the road all day long, we need to explore the economic situation of the area. So guessing the reasons for the situation on the highway is like reading coffee grounds. It’s beyond the scope of the current project.
    Just my opinion.
  • About .describe(): nothing special, mostly for myself, to be sure I’m going in the right direction.
  • “(snowfall in April-October is a rather trivial phenomenon in this area). - do you mean uncommon phenomenon?” According to the climate description, snowfalls in April and October are common occurrences. Anyway, we do not have enough information about rain and snow in the dataset.
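
A tiny sketch of what I mean about .corr(), using toy numbers that are not from the I-94 dataset. By default, .corr() computes the Pearson coefficient, which only measures linear association. Here the made-up traffic_volume column depends on temp perfectly, but not along a straight line, so I also compute the Spearman rank coefficient as Pearson on the ranks:

```python
import pandas as pd

# Hypothetical toy data: volume always grows with temperature, but non-linearly
df = pd.DataFrame({"temp": range(1, 11)})
df["traffic_volume"] = df["temp"] ** 3

# Default .corr() is Pearson: it measures only linear association
pearson = df["temp"].corr(df["traffic_volume"])

# Spearman rank correlation, computed as Pearson on the ranks of both columns
spearman = df["temp"].rank().corr(df["traffic_volume"].rank())

print(f"Pearson:  {pearson:.3f}")
print(f"Spearman: {spearman:.3f}")
```

The rank-based coefficient reaches 1 because the relationship is perfectly monotonic, while Pearson stays below 1. With real weather data there are also seasonality, lagged effects, and confounding factors to deal with, which a single coefficient does not capture.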

Again, thanks for the detailed feedback.
