Just now I finished the guided project “Finding the best markets to advertise in”. (Click here for the final page of the instructions.)
This was challenging. I was plagued by issues in the data more than I expected. For example, the data contains individual columns JobInterestWeb, JobInterestMobile, etc., and then there is a summary column JobInterest with entries like “Web; Mobile; Security”. Does this column contain the same information as the individual columns? No, that doesn’t seem to be the case! I only figured that out later and then had to deal with it. Also, the data about ‘money spent’ took a lot of digging (e.g. because of outliers) before I dared to draw any conclusions from it. Then there were the ‘regular’ challenges, like trying to get decent plots out of Matplotlib (which I always find a big challenge, and in the end I decided to spend less effort on it this time).
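For anyone who wants to check the same thing, here is a minimal sketch of how the summary column could be compared against the individual columns. The toy data, the label list, and the helper `summary_mentions` are made up for illustration, and it assumes that a label such as “Web” in JobInterest corresponds to the column JobInterestWeb, which may not hold for every column in the real data.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): the last row mentions "Web" in the summary
# column, but the individual JobInterestWeb column is empty for it.
df = pd.DataFrame({
    "JobInterest": ["Web; Mobile", "Security", np.nan, "Web"],
    "JobInterestWeb": [1.0, np.nan, np.nan, np.nan],
    "JobInterestMobile": [1.0, np.nan, np.nan, np.nan],
    "JobInterestSecurity": [np.nan, 1.0, np.nan, np.nan],
})


def summary_mentions(label):
    """Boolean Series: does the summary column list this label?"""
    parts = df["JobInterest"].fillna("").str.split(";")
    return parts.apply(lambda items: label in [s.strip() for s in items])


# Count rows where the summary column and the individual column disagree.
mismatches = {}
for label in ["Web", "Mobile", "Security"]:
    flagged = df[f"JobInterest{label}"].notna()
    mismatches[label] = int((summary_mentions(label) != flagged).sum())

print(mismatches)
```

A non-zero count for any label is a sign that the two representations do not carry the same information.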
So, very time-consuming, but good learning then, I guess!
I have uploaded my notebook here. Any feedback is always welcome!
FindingBestMarketsAdvertize.ipynb (599.9 KB)
Interesting take on this project! It’s quite different from mine. I actually enjoyed this project, including playing with Matplotlib.
I did try to work on the mismatched part of the data. However, I am not sure why you didn’t get any TypeError, or perhaps you just haven’t mentioned it. The column I got this error for is "JobInterestOther", which had an object datatype for me, so the sum(axis = 1) at code cell 19 did not work for me at all. This could be a potential cause of the mismatch in the results of code cell 20.
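One possible way around the dtype problem, as a rough sketch rather than what the notebook actually does: count the non-missing entries per row instead of summing the values, so the count does not depend on each column's dtype. The toy data below is invented, and it assumes JobInterestOther holds free-text entries while the other columns hold 1.0 flags.

```python
import numpy as np
import pandas as pd

# Toy data (hypothetical): JobInterestOther holds text, the rest hold 1.0/NaN.
df = pd.DataFrame({
    "JobInterestWeb": [1.0, np.nan, 1.0],
    "JobInterestMobile": [np.nan, np.nan, 1.0],
    "JobInterestOther": ["Data Scientist", np.nan, np.nan],
})

interest_cols = [c for c in df.columns if c.startswith("JobInterest")]

# notna() turns every column into booleans, so the row-wise sum works
# regardless of whether a column is numeric or text.
n_interests = df[interest_cols].notna().sum(axis=1)

print(n_interests.tolist())
```

If the text column should be excluded from the count instead, dropping it from `interest_cols` before summing would be the alternative.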
Overall, I liked that you described the project in terms of the analysis, the confusing dataset, and your observations, rather than as a survey project on which decisions need to be based.
I have been told in a couple of interviews that it’s less about the result and more about how you approached the problem and its basis.
You could start with one of Matplotlib’s simple built-in styles (the best one, for now, is the FiveThirtyEight one!) and then maybe move on to customising plots to your liking. Anywho, I guess you will find it rewarding as you proceed in your learning journey. As it is
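For reference, applying a built-in style takes very little code. This is only a small sketch: the market names and percentages are placeholders, not results from the project, and the `Agg` backend is used just so it runs without a display.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no display needed
import matplotlib.pyplot as plt

# Placeholder values, purely for illustration.
markets = ["Market A", "Market B", "Market C"]
share = [45.0, 30.0, 25.0]

# style.context applies the style only inside the with-block;
# plt.style.use("fivethirtyeight") would apply it globally instead.
with plt.style.context("fivethirtyeight"):
    fig, ax = plt.subplots(figsize=(6, 4))
    ax.bar(markets, share)
    ax.set_title("Share of respondents by market (placeholder data)")
    ax.set_ylabel("Percent")
    fig.tight_layout()
    fig.savefig("markets.png")
```

The list in `plt.style.available` shows the other built-in styles that can be swapped in the same way.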
Hi @Rucha ,
Thank you so much for your response and feedback! (Sorry, I didn’t have the chance to get back to you until today.)
I actually did run into a TypeError as well, also for JobInterestOther. I describe this in cell 2, including how I dealt with it. Now that I look at it again, this column indeed now has a different type than all the other JobInterest*** columns. So that could indeed be related. To be explored. Great catch!
Yes, Matplotlib… I will have to invest time in it to understand it better and get faster with it. Right now it takes a disproportionate amount of web searching, trial and error (and sometimes frustration) to make simple customizations to my plots.
Anyway, thank you again!!