Great that you took the time and the effort to revise your project.
Regarding the histograms: what I was trying to say is that using one fixed bin size for different variables may or may not be a good idea, depending on the data. Obviously, copy-pasting the code 8 times and just changing the variable and bin size is not a good idea either. You want to have the plots generated in a loop (as in your implementation). What I would probably do is wrap the body of your loop in a function and call that function for each variable with its own bin size. In your case, maybe something like this (I just made up the bin sizes for demonstration, so please don’t use them in your actual analysis):
cols = [
# Same length as cols
bin_sizes = [4, 8, 6, 12, 4, 12, 7, 5]

# Define a plotting function
def plot_hist(df, col, bin_size):
    """Plot histogram for supplied variable and bin size."""
    # Start a new figure for each variable
    plt.figure(figsize=(10, 5))
    sns.histplot(data=df, x=col, bins=bin_size)
    plt.show()

# zip() pairs up the elements of the supplied iterables and yields tuples.
for col, bin_size in zip(cols, bin_sizes):
    # Unpack each tuple and pass its parts to the plotting function
    plot_hist(recent_grads, col, bin_size)
This way you can have custom bin sizes with almost the same amount of code.
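In case the zip() part is unclear, here is a minimal sketch of what it yields and why the tuple has to be unpacked before the values are passed on. The column names are hypothetical placeholders (not your actual columns), and the bin sizes are made up again:

```python
# Hypothetical column names and made-up bin sizes, for illustration only
cols = ["col_a", "col_b", "col_c"]
bin_sizes = [4, 8, 6]

# zip() stops at the shorter input, so a length mismatch would silently
# drop variables; checking the lengths up front avoids that
assert len(cols) == len(bin_sizes)

# Without unpacking, each item is one (column, bin size) tuple
pairs = list(zip(cols, bin_sizes))
print(pairs)

# Unpacking in the for statement gives two separate names per iteration
seen = []
for col, bin_size in zip(cols, bin_sizes):
    seen.append((col, bin_size))
    print(f"{col}: {bin_size} bins")
```

Passing the whole tuple twice (as in `f(df, var, var)`) would hand the plotting function a tuple where it expects a column name and a number, which is why the loop unpacks into `col` and `bin_size`.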
Maybe this helps for future projects.
BTW: The part
fig = plt.subplots(0,8, figsize =(10,5)) in your code doesn’t really work, because it runs on every pass of the for-loop and overwrites the fig variable each time. So you just get a separate plot for every variable and not 8 subplots in 1. I don’t think you actually need subplots here, so you can just use
fig = plt.figure(figsize=(10,5)).
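If you ever do want 8 subplots in 1 figure, the usual pattern is to create the figure and its axes once, before the loop, and draw each histogram on its own axis. This is only a rough sketch under my own assumptions: the data is randomly generated, the variable names are hypothetical, the 2x4 grid is just one possible layout, and it uses plain matplotlib instead of seaborn to stay self-contained.

```python
import random

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

random.seed(0)
# Made-up data: eight hypothetical variables with 100 random values each
data = {f"var_{i}": [random.gauss(0, 1) for _ in range(100)] for i in range(8)}
bin_sizes = [4, 8, 6, 12, 4, 12, 7, 5]  # made-up bin sizes, as before

# Create the figure and the grid of axes once, outside the loop
fig, axes = plt.subplots(2, 4, figsize=(16, 6))

# axes is a 2x4 array; axes.flat walks through it one axis at a time
for ax, (name, values), bins in zip(axes.flat, data.items(), bin_sizes):
    ax.hist(values, bins=bins)
    ax.set_title(name)

fig.tight_layout()
```

With seaborn, the same idea should work by passing the axis along, e.g. sns.histplot(data=df, x=col, bins=bin_size, ax=ax).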