My Code:
import matplotlib.pyplot as plt
import numpy as np

# norm_reviews is the DataFrame created earlier in the exercise
num_cols = ['RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars']
bar_heights = norm_reviews[num_cols].iloc[0].values  # ratings in the first row
bar_positions = np.arange(5) + 0.75
fig, ax = plt.subplots()
ax.bar(bar_positions, bar_heights, 0.5)  # 0.5 is the bar width
plt.show()
What I expected to happen:
The code is correct, but why do we need bar_heights = norm_reviews[num_cols].iloc[0].values?
Can you please explain what this does? norm_reviews was made from those columns, so why do we need to filter by them again? And why does iloc[0] give the first row and not, say, the first element of the num_cols list?
I would recommend that you go through the instructions in Step 2 again to see how norm_reviews was created.
The documentation for iloc (especially the examples) should help clarify what iloc is used for. If it's not clear from the documentation, feel free to ask more questions.
Hi @the_doctor
I have been asking myself the same questions as @malickke2. So, based on your comments, is it correct to say that, for
bar_heights = norm_reviews[num_cols].iloc[0].values
norm_reviews is a DataFrame (a 2D structure of rows and columns) created from the original DataFrame called reviews.
num_cols is a list specifying which columns we want from norm_reviews. Alternatively, we could have listed 'RT_user_norm', 'Metacritic_user_nom', 'IMDB_norm', 'Fandango_Ratingvalue', 'Fandango_Stars'... inline, but that would make the code less readable.
.iloc[0] indicates the integer position we want, in this case position 0. So, it is the first element of each column in num_cols, which coincides with the first row of num_cols (?) <- not sure.
.values returns the values, given the description above.
Finally, we assign those values to bar_heights.
What I don't understand (see below) is why a) returns the name of the movie along with the values, whereas b) returns only the values.
a) bar_heights = norm_reviews.loc[0].values
b) bar_heights = norm_reviews[num_cols].iloc[0].values
I appreciate any feedback or tips 
That’s correct!
Yes, iloc is used to get the data from the dataframe at the specified position. So, iloc[0] would be the data at position 0 of the dataframe.
To phrase this in a better way, it would be the first row of norm_reviews for the columns specified in num_cols.
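To make the chain easier to follow, here is a minimal sketch that splits it into its three steps. The two-row norm_reviews below is a hypothetical stand-in with made-up numbers, not the real dataset from the exercise, and the intermediate names subset and first_row are mine.

```python
import pandas as pd

# Hypothetical stand-in for norm_reviews; film names and ratings are made up.
norm_reviews = pd.DataFrame({
    "FILM": ["Film A", "Film B"],
    "RT_user_norm": [4.3, 4.0],
    "Metacritic_user_nom": [3.55, 3.75],
    "IMDB_norm": [3.9, 3.55],
    "Fandango_Ratingvalue": [4.5, 4.5],
    "Fandango_Stars": [5.0, 5.0],
})
num_cols = ["RT_user_norm", "Metacritic_user_nom", "IMDB_norm",
            "Fandango_Ratingvalue", "Fandango_Stars"]

subset = norm_reviews[num_cols]   # DataFrame with only the five rating columns
first_row = subset.iloc[0]        # Series: the row at position 0, labelled by column name
bar_heights = first_row.values    # NumPy array holding just the five numbers

print(subset.shape)
print(first_row)
print(bar_heights)
```

So iloc[0] is applied to the DataFrame that norm_reviews[num_cols] returns, not to the num_cols list itself, which is why it picks a row rather than a column name.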
a) returns the name of the movie because that code uses only norm_reviews, without selecting num_cols. The first column of norm_reviews is FILM, so that's why it is returned as well.
The above should answer your b) part as well.
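A small sketch of the a) versus b) difference, again on a hypothetical miniature norm_reviews with made-up values and only two rating columns. It assumes the default integer index (0, 1, ...), in which case loc[0] and iloc[0] pick the same row, so the only thing that changes the result is whether the columns were filtered first.

```python
import pandas as pd

# Hypothetical miniature of norm_reviews; values are made up for illustration.
norm_reviews = pd.DataFrame({
    "FILM": ["Film A", "Film B"],
    "RT_user_norm": [4.3, 4.0],
    "IMDB_norm": [3.9, 3.55],
})
num_cols = ["RT_user_norm", "IMDB_norm"]

# a) whole first row: every column, so FILM comes along
with_film = norm_reviews.loc[0].values

# b) first row after keeping only the rating columns
only_ratings = norm_reviews[num_cols].iloc[0].values

# Same column filter with loc instead of iloc: same result with a default index
same_ratings = norm_reviews[num_cols].loc[0].values

print(with_film)
print(only_ratings)
print(same_ratings)
```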