Help to create side by side bar charts in matplotlib

I am redoing all the guided projects and trying to take them a bit further. I would like to recreate the bar chart from this article: https://fivethirtyeight.com/features/americas-favorite-star-wars-movies-and-least-favorite-characters/

Here is the data:

I tried multiple ways but couldn’t replicate the side-by-side layout in the picture above; the bars keep stacking on top of each other like this:

I’d appreciate any help with this. Thanks in advance!

I have never created such a chart, but here’s what I’d do:

I would look at each of the “columns” in the chart you posted as a different chart. Therefore, I would create a figure with five axes, since your data contains five categories. I would do this with a for loop and using fig.add_subplot().

First I would create a list containing the categories:

categories = df.columns.tolist()[1:]

Then I’d loop through the list using the range and len functions. In each iteration, I’d add a new subplot. Inside the loop, I’d add an if clause to hide the left labels in every subplot except the first one. I’d also remove all the ticks and spines.

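import matplotlib.pyplot as plt

fig = plt.figure()

for i in range(len(categories)):
    # One subplot per rating category, all in a single row
    ax = fig.add_subplot(1, 5, i + 1)
    ax.barh(df['characters'], df[categories[i]])
    ax.tick_params(bottom=False, top=False, left=False, right=False)

    # Keep the character labels only on the first (leftmost) subplot
    if i > 0:
        ax.tick_params(labelleft=False)

    # Hide the box around each subplot
    for key, spine in ax.spines.items():
        spine.set_visible(False)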

I believe this will get you closer to your goal. I’m not sure everything here will work, because I did not test it on the actual data. If you share the actual file instead of an image of the data, I may be able to give more precise advice. But if this works, it won’t be hard to place the numbers in front of the bars.
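For the numbers in front of the bars, here is a minimal sketch using ax.text. The DataFrame below is made up for illustration (it is not the survey data), and the offset of 1 unit to the right of each bar is just a guess at what looks good. Matplotlib 3.4 and later also provide ax.bar_label, which may be simpler if your version has it.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up numbers for illustration only; not the survey data from the thread.
df = pd.DataFrame({
    "characters": ["Han Solo", "Yoda", "Jar Jar Binks"],
    "favorable": [70, 65, 15],
})

fig, ax = plt.subplots()
bars = ax.barh(df["characters"], df["favorable"])

# Each bar knows its own width (the value) and vertical position,
# so the label can be placed just past the end of the bar.
labels = []
for bar in bars:
    width = float(bar.get_width())
    y_center = bar.get_y() + bar.get_height() / 2
    labels.append(ax.text(width + 1, y_center, f"{width:.0f}", va="center"))

# Call plt.show() or fig.savefig(...) to view the result.
```

Reading the position from each bar avoids relying on the DataFrame index matching the bar order.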

Hope this is helpful to you.

It would have been better if you had provided the data in the form of a CSV file rather than a photo.
But anyway:

  1. Melt the data using pd.melt to unpivot it from wide to long format:
melt_df = pd.melt(characters_freq, id_vars=['characters'], value_vars=['favorable', 'somewhat_favorable', 'neutral', 'somewhat_unfavorable', 'unfavorable'], var_name='favorability', value_name='ratings')
  2. Now you can plot the bar plot. Use catplot() to combine a barplot() and a FacetGrid:
g = sns.catplot(data=melt_df, x="ratings", y="characters", col="favorability", kind="bar")
plt.show()

Then you can adjust the aesthetics to match your needs.
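To make the wide-to-long step concrete, here is a small sketch of what pd.melt does. The table is made up for illustration and has only two rating columns; the column names follow the ones used above.

```python
import pandas as pd

# Made-up wide-format table (illustrative values, two rating columns only).
wide = pd.DataFrame({
    "characters": ["Han Solo", "Yoda"],
    "favorable": [70, 65],
    "unfavorable": [5, 3],
})

# Each rating column becomes a block of rows: the old column name goes
# into 'favorability' and the old cell value goes into 'ratings'.
long = pd.melt(
    wide,
    id_vars=["characters"],
    value_vars=["favorable", "unfavorable"],
    var_name="favorability",
    value_name="ratings",
)
print(long)
```

The long format has one row per character and category, which is the shape catplot() needs to split the data into one panel per category with col="favorability".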


This is my example

Thanks a lot for helping me out.
I did not know there was a way to export the DataFrame to CSV; I just searched and figured it out. I’ve attached it here if you want to try it out :slight_smile:
characters.csv (616 Bytes)
I tried both of your suggestions and here are the results:

@otavios.s suggestion:

This method does what I am looking for. I am not good at formatting it yet, but it works. :partying_face:

@info.victoromondi suggestion:

I haven’t practiced much with seaborn, so it is a bit hard for me to play around with. This method is quick, but I don’t know how to modify each axes and its color yet. I will look into it more; at least now I know it is called a FacetGrid ^^.
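For reference, one way to modify each axes and its color in a seaborn FacetGrid is sketched below. It assumes the grid returned by catplot() exposes its panels through g.axes, and it uses made-up data and a hypothetical color list, so treat it as a starting point rather than a recipe.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Made-up long-format data for illustration; not the survey numbers.
melt_df = pd.DataFrame({
    "characters": ["Han Solo", "Yoda", "Han Solo", "Yoda"],
    "favorability": ["favorable", "favorable", "unfavorable", "unfavorable"],
    "ratings": [70, 65, 5, 3],
})

g = sns.catplot(data=melt_df, x="ratings", y="characters",
                col="favorability", kind="bar")

# Hypothetical color choice, one per panel, in column order.
facet_colors = ["tab:green", "tab:red"]

# g.axes is an array of ordinary Matplotlib Axes, so the usual
# Axes methods work on each panel.
for ax, color in zip(g.axes.flat, facet_colors):
    for patch in ax.patches:          # the bars drawn in this panel
        patch.set_facecolor(color)
    ax.set_xlabel("")                 # per-panel tweaks go here
    ax.tick_params(left=False, bottom=False)

g.set_titles("{col_name}")            # show only the category name as title
```

Once you have the individual Axes, everything from the plain Matplotlib version (spines, ticks, text labels) can be applied in the same loop.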

Thanks again!!!

Well done!

Glad I was of help.

Nice chart! I played with it a little to try and mimic the 538 style.

import matplotlib.pyplot as plt

# Assumes `data` holds the attached characters.csv: a 'characters' column
# followed by the five rating columns.
fig = plt.figure(figsize=(16, 6), facecolor='#F2F5F7')
colors = ['red', 'green', 'blue', 'gray', 'orange']

for i, c in enumerate(data.columns[1:]):
    ax = fig.add_subplot(1, 5, i + 1, facecolor='#F2F5F7')
    plt.barh(data.characters.apply(lambda x: x.replace('_', ' ')), data[c], color=colors[i])
    plt.title(c.replace("_", " "), fontsize=15, fontweight='medium')
    ax.tick_params(left=False, bottom=False, right=False, top=False)
    ax.xaxis.set_ticks([])
    plt.yticks(fontsize=12)

    # Show the character names only on the first chart
    if i > 0:
        ax.tick_params(labelleft=False)
    else:
        plt.yticks(range(14))

    for _, spine in ax.spines.items():
        spine.set_visible(False)

    # Label each bar with its value; only the top bar (last row) gets a '%' sign.
    # Series.items() replaces iteritems(), which was removed in pandas 2.0.
    for row, val in data[c].items():
        if row == len(data) - 1:
            ax.text(val + 1, row, str(round(val)) + '%', va='center')
        else:
            ax.text(val + 1, row, str(round(val)), va='center')

plt.suptitle("'Star Wars' Character Favorability Ratings", fontsize=20, x=.25, y=1.06, fontweight='bold')
plt.text(-162, 15.9, 'By 834 Respondents', fontsize=16, fontweight='light')
plt.show()

result:

A slight modification makes it look a lot better, I think.

plt.suptitle("'Star Wars' Character Favorability Ratings", fontsize=20, x=0.2, y=1.06, fontweight='bold')
plt.text(-173, 15.9, 'By 834 Respondents', fontsize=16, fontweight='light')

Thank you for your input. Nice touch-up. With the % sign it looks more like the one from 538 now.
Would you mind explaining how to remember or know when to use ax. or plt. to change the parameters? I get confused and have to search here and there every time; it feels like trying this or that until something actually works.

Honestly, I don’t know how to remember those kinds of details. I had to google just about everything I added to that chart. In many cases I skipped the first solution I found because it was too complicated. Maybe there is a simpler way to do this using seaborn?
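One rule of thumb that may help: plt. functions act on whichever axes is currently active (usually the last one created), while ax. methods act on the specific axes you call them on. Most plt. functions have an ax.set_ counterpart. The sketch below illustrates this with two empty subplots; the titles are placeholders.

```python
import matplotlib.pyplot as plt

fig, (left, right) = plt.subplots(1, 2)

# pyplot functions target the "current" axes. After plt.subplots()
# that is the last subplot created, here the right-hand one.
plt.title("set via plt.title")
current_is_right = plt.gca() is right

# Axes methods always act on the object they are called on,
# regardless of which axes is current.
left.set_title("set via left.set_title")

# plt.sca() changes the current axes if you want to keep using plt.*
plt.sca(left)
plt.xlabel("set via plt.xlabel")      # now lands on the left subplot

# Common pairs:
#   plt.title(...)   <->  ax.set_title(...)
#   plt.xlabel(...)  <->  ax.set_xlabel(...)
#   plt.xlim(...)    <->  ax.set_xlim(...)
#   plt.yticks(...)  <->  ax.set_yticks(...)
```

With several subplots in one figure, sticking to ax. methods throughout tends to be less error-prone, because it never depends on which axes happens to be current.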

I wanted to see if I could make one of the columns stand out by making it wider than the others.

It actually worked. I used a GridSpec object to create a layout with 6 columns instead of 5. The first chart takes up two columns of the grid and each of the others takes up just one.

import matplotlib.gridspec as gridspec

def add_ax(grid_loc, title, color='#FF8C00', show_y_labels=False):
    ax = fig.add_subplot(grid_loc, facecolor='#F2F5F7')
    ax.tick_params(left=False, bottom=False, right=False, top=False)
    ax.xaxis.set_ticks([])
    ax.barh(data.characters.apply(lambda x: x.replace('_', ' ')), data[title], color=color)
    ax.set_title(title.replace('_', ' '), fontsize=15)
    ax.tick_params(axis='y', labelsize=12, labelleft=show_y_labels)
    
    for _, spine in ax.spines.items():
        spine.set_visible(False)
    
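    # Label each bar with its value; only the top bar (last row) gets a '%' sign.
    # Series.items() replaces iteritems(), which was removed in pandas 2.0.
    for i, val in data[title].items():
        if i == len(data) - 1:
            ax.text(val + 1, i, str(round(val)) + '%', va='center')
        else:
            ax.text(val + 1, i, str(round(val)), va='center')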
    
    return ax

fig = plt.figure(figsize=(16, 6), facecolor='#F2F5F7')
grid = gridspec.GridSpec(ncols=6, nrows=14, figure=fig)
cols = data.columns.to_list()

add_ax(grid[:, :2], 'favorable', 'blue', True)

colors=['#85ceeb', 'darkgreen', 'darkgreen', 'darkgreen']

for x in range(2, 6):
    add_ax(grid[:,x], cols[x], colors[x-2])
    
plt.suptitle("'Star Wars' Character Favorability Ratings", fontsize=20, x=0.22, y=1.06, fontweight='bold')
plt.text(-210, 15.9, 'By 834 Respondents', fontsize=16, fontweight='light')
plt.show()    
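
A possible shortcut for the same layout, which I have not tried on the real file: plt.subplots accepts width_ratios through gridspec_kw, so the first chart can be made twice as wide without building the grid by hand. The data below is made up for illustration, with the same column names as the real table.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the real table (illustrative values only).
data = pd.DataFrame({
    "characters": ["Han Solo", "Yoda", "Jar Jar Binks"],
    "favorable": [70, 65, 15],
    "somewhat_favorable": [20, 20, 20],
    "neutral": [5, 8, 20],
    "somewhat_unfavorable": [3, 4, 15],
    "unfavorable": [2, 3, 30],
})
rating_cols = list(data.columns[1:])

# width_ratios makes the first column twice as wide as the others;
# sharey=True keeps the character labels on the leftmost chart only.
fig, axes = plt.subplots(
    nrows=1, ncols=len(rating_cols), sharey=True, figsize=(16, 6),
    gridspec_kw={"width_ratios": [2, 1, 1, 1, 1]},
)

for ax, col in zip(axes, rating_cols):
    ax.barh(data["characters"], data[col])
    ax.set_title(col.replace("_", " "))

# Call plt.show() or fig.savefig(...) to view the result.
```

The explicit GridSpec version above is more flexible if a chart ever needs to span rows as well as columns.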

Your learning effort is amazing. Thank you for sharing.

Thanks for the kind words!