Problem with norm_reviews[num_cols].iloc[0].values Bar Plots

Can somebody please help me with this part: bar_heights = norm_reviews[num_cols].iloc[0].values
I did not understand it.
import matplotlib.pyplot as plt
from numpy import arange
num_cols = ['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']

bar_heights = norm_reviews[num_cols].iloc[0].values
bar_positions = arange(5) + 0.75
fig, ax = plt.subplots()
ax.bar(bar_positions, bar_heights, 0.5)
plt.show()
this is the link
https://app.dataquest.io/m/144/bar-plots-and-scatter-plots/4/creating-bars
thanks

In the section of code you asked about (bar_heights = norm_reviews[num_cols].iloc[0].values), we are getting the values from the 5 columns in num_cols for the first movie in the dataframe (.iloc[0]). Adding .values returns the data as a NumPy array, which behaves much like a list here (you can see that when you run the code and inspect the bar_heights variable).
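If it helps, here is a small sketch that splits the expression into its three steps. The dataframe below is a made-up stand-in for norm_reviews (the film names and numbers are invented for illustration, not taken from the lesson's dataset), and the intermediate names ratings_only and first_movie are just ones I picked.

```python
import pandas as pd

# Hypothetical stand-in for norm_reviews; values are invented for illustration.
norm_reviews = pd.DataFrame({
    'FILM': ['Film A', 'Film B'],
    'RT_user_norm': [4.3, 4.0],
    'Metacritic_user_nom': [3.55, 3.75],
    'IMDB_norm': [3.9, 3.55],
    'Fandango_Ratingvalue': [4.5, 4.5],
    'Fandango_Stars': [5.0, 5.0],
})
num_cols = ['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm',
            'Fandango_Ratingvalue', 'Fandango_Stars']

# Step 1: keep only the 5 rating columns (still a DataFrame, all rows).
ratings_only = norm_reviews[num_cols]

# Step 2: take the first row by position (a Series indexed by column name).
first_movie = ratings_only.iloc[0]

# Step 3: drop the labels and keep just the numbers (a NumPy array).
bar_heights = first_movie.values

print(type(ratings_only).__name__)
print(type(first_movie).__name__)
print(type(bar_heights).__name__)
print(bar_heights)
```

Chaining the three steps on one line, as in the lesson, gives the same bar_heights.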

I hope this helps. Let us know if you still need some clarification.

Why are these two lists, which look the same, used together? I am a little bit confused.
norm_reviews = reviews[['FILM', 'RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']]
num_cols = ['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']
thanks

They’re not the same, actually. reviews was the initial dataframe, and norm_reviews is the dataframe with only those 6 named columns. It still contains the same number of rows but allows us to only focus on the columns we need for our analysis. num_cols is a list with the titles of the 5 columns that have rating data (not including the 'FILM' name).
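To see the difference concretely, here is a rough sketch with a tiny made-up reviews dataframe. The values and the 'Extra_column' name are hypothetical, only there to show that the selection drops columns but keeps every row.

```python
import pandas as pd

# Hypothetical stand-in for the original reviews dataframe.
reviews = pd.DataFrame({
    'FILM': ['Film A', 'Film B', 'Film C'],
    'RT_user_norm': [4.3, 4.0, 4.5],
    'Metacritic_user_nom': [3.55, 3.75, 4.05],
    'IMDB_norm': [3.9, 3.55, 3.9],
    'Fandango_Ratingvalue': [4.5, 4.5, 4.5],
    'Fandango_Stars': [5.0, 5.0, 5.0],
    'Extra_column': [1, 2, 3],  # hypothetical column we do not need
})

# A dataframe: 6 named columns, same rows as reviews.
norm_reviews = reviews[['FILM', 'RT_user_norm', 'Metacritic_user_nom',
                        'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']]

# A plain Python list of 5 column names (no 'FILM').
num_cols = ['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm',
            'Fandango_Ratingvalue', 'Fandango_Stars']

print(reviews.shape)        # (rows, columns) of the original
print(norm_reviews.shape)   # same row count, fewer columns
print(len(num_cols))        # number of rating column names
```

So norm_reviews holds data, while num_cols only holds names that we later use to pick columns out of norm_reviews.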

The reason we put the names of those columns in a list is to make our code easier to read when we want to extract the ratings values for the bar heights. These two bits of code will give us the same information:

bar_heights = norm_reviews[num_cols].iloc[0].values
bar_heights = norm_reviews[['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']].iloc[0].values

The first version with num_cols is easier to read, especially if we have to troubleshoot later or edit which columns to include in our bar graph.
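If you want to check the equivalence yourself, something like the sketch below should work. The dataframe is again a made-up stand-in for norm_reviews with invented values, and the names with_variable and written_out are my own.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for norm_reviews; values are invented for illustration.
norm_reviews = pd.DataFrame({
    'FILM': ['Film A', 'Film B'],
    'RT_user_norm': [4.3, 4.0],
    'Metacritic_user_nom': [3.55, 3.75],
    'IMDB_norm': [3.9, 3.55],
    'Fandango_Ratingvalue': [4.5, 4.5],
    'Fandango_Stars': [5.0, 5.0],
})
num_cols = ['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm',
            'Fandango_Ratingvalue', 'Fandango_Stars']

# Version 1: column names passed through the num_cols variable.
with_variable = norm_reviews[num_cols].iloc[0].values

# Version 2: the same column names written out inline.
written_out = norm_reviews[['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm',
                            'Fandango_Ratingvalue', 'Fandango_Stars']].iloc[0].values

# True when both arrays have the same shape and the same elements.
same = np.array_equal(with_variable, written_out)
print(same)
```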

thank you very much. I got it.
