Best Markets to Advertise In - Maybe

The data file provided for this project needed a lot of gleaning and cleaning before it could support a reliable decision on the best markets to advertise in. How good was the data? How good were the details provided by management? Find out by reading this report.
All feedback on this project is welcome.

Best Markets to Advertise In.ipynb (826.6 KB)



Hi Bruce,

Thanks for sharing another amazing project with the Community! I really liked your innovative approach to this project (instead of just following the instructions): using Pareto charts and giving some background on this type of visualization. As usual, perfect project structure, clean and highly readable visualizations, a cool cover picture, well-commented code, and a very thorough data analysis with interesting insights!

What I would suggest to you:

  • Introduction. I’d rephrase it so that the project goal comes first, followed by the information from the 1st paragraph
    (the one about why we don’t want to organize a new survey).
  • Since you created several Pareto charts and box plots in your project, you might consider creating a function for each of these plot types, to avoid repeating the same code.
  • For Pareto charts, I would rotate the x-tick labels, since they are vertical now. Also, it’s better to give each chart its own title so that it says what exactly that particular plot shows.
  • Code cell [1]: it’s better to show only the trimmed data. Also, I’d suggest setting the maximum number of columns to display (pd.set_option('display.max_columns', 31)) right after importing the libraries, so that this setting already applies when the trimmed data is displayed in this code cell.
  • When creating a new dataframe (and, in general, any new variable), it’s highly recommended to use a meaningful, descriptive name (but, of course, not too long). With names like df1, df2, df3, etc., it’s easy to get very confused later in the code.
  • When referring to long numbers in markdown, you might consider adding commas for better readability. For example, not $200000 but $200,000.
  • This time I noticed something weird happening with column names and pieces of code referenced in markdown: sometimes they’re displayed wrongly, like \'JobRoleInterest\' in the markdown after the code cell [2]. I think this is somehow related to the fact that you used HTML elements in your code. It’s better to find and fix such cases.
  • In some code cells (e.g., [2], [3], [5], [7], [9], [11]), you can consider adding inner separators (i.e., relevant print statements) between the different outputs; for now they run together.
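
To illustrate the points about a reusable plotting function, rotated x-tick labels, and a specific title per chart, here is a minimal sketch. It is not the code from the notebook: the names pareto_table and plot_pareto, the dictionary input, and the 45-degree default rotation are all assumptions made for the example.

```python
from itertools import accumulate


def pareto_table(counts):
    """Return (label, count, cumulative_percent) rows, largest count first.

    `counts` is a hypothetical input: a dict mapping a category to its count.
    """
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(count for _, count in ordered)
    if total == 0:
        return []
    running = accumulate(count for _, count in ordered)
    return [
        (label, count, round(100 * cum / total, 1))
        for (label, count), cum in zip(ordered, running)
    ]


def plot_pareto(counts, title, ax=None, rotation=45):
    """Draw bars for the counts and a line for the cumulative percentage."""
    # Imported here so pareto_table() stays usable without matplotlib.
    import matplotlib.pyplot as plt

    rows = pareto_table(counts)
    labels = [label for label, _, _ in rows]
    if ax is None:
        _, ax = plt.subplots(figsize=(8, 4))

    ax.bar(labels, [count for _, count, _ in rows])
    ax.set_title(title)  # one specific title per chart
    ax.tick_params(axis="x", labelrotation=rotation)  # rotate x-tick labels
    ax.set_ylabel("Count")

    # Cumulative percentage on a secondary y-axis.
    ax2 = ax.twinx()
    ax2.plot(labels, [pct for _, _, pct in rows], color="tab:red", marker="o")
    ax2.set_ylim(0, 105)
    ax2.set_ylabel("Cumulative %")
    return ax
```

With something like this, each chart becomes a single call, for example plot_pareto(country_counts, title="Respondents by country"), where country_counts is a placeholder name. If the counts live in a pandas Series, series.value_counts().to_dict() is one way to get the dictionary.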

Hope my feedback was useful. Great job on your project, and an impressive learning pace. Good luck with your future projects!

Thank you so much Elena for your great feedback and suggestions!
Great blessings on your potential new career path!
