How to read hex bin plots? Guided Project: Visualizing Earnings Based On College Majors

Some things from this guided project that I would love to get others' insights on:

  1. Just as box plots reflect the five-number summary of the data, is there a systematic way to read a hex bin plot? Apart from noticing that denser regions produce darker hexagons, I struggled to use a hex bin plot to make concrete observations.

  2. I learned what data skewness is from the box plots, thanks to this project :-). However, I wondered what benefit reading a box plot for the quartile distribution offers, since summary statistics already provide those numbers (minimum, first and third quartiles, maximum, etc.).

  3. I also wasn’t sure how to pick the number of bins when plotting a histogram. I used the pd.cut function to bin the column data and mostly used that same number of bins to draw the plot afterwards. I did that to support my observations from the plot, but otherwise I could not guess which non-default number of bins to use.

  4. There’s df.plot.hist vs df.hist. Is it ok to always use df.plot.hist?
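
To make question 3 concrete, here is a minimal sketch of roughly the workflow described there. It uses made-up numbers rather than the project data; the column name `value` and the choice of 4 bins are placeholders for illustration only, not values from the project.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Made-up values standing in for a numeric column (NOT data from the project).
df = pd.DataFrame({"value": [12, 15, 21, 22, 28, 33, 35, 41, 47, 50]})

n_bins = 4  # arbitrary choice, for illustration only

# pd.cut splits the min-max range into n_bins equal-width intervals
# and labels each row with the interval it falls into.
binned = pd.cut(df["value"], bins=n_bins)

# Count the rows per interval, ordered from the lowest interval to the highest.
counts = binned.value_counts().sort_index()
print(counts)

# Reuse the same number of bins for the histogram so the bars
# can be compared against the counts above.
ax = df["value"].plot.hist(bins=n_bins)
ax.set_xlabel("value")
heights = [patch.get_height() for patch in ax.patches]
print(heights)
```

With this toy data the bar heights match the pd.cut counts, but that is not guaranteed in general: pd.cut intervals are closed on the right by default, while the histogram bins are closed on the left, so values that sit exactly on a bin edge can land in different bins.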

URL of the last mission screen of the Guided Project:

GitHub link:

Thanks much!
