Which category of majors have the most students?

Hi All,

I am stuck on the last question on “Which category of majors have the most students?” using the bar plot.

When I use the following code:

recent_grads.plot.bar(x='Major_category', y='Total', legend=False), it gives me the following image. Instead of adding up the rows for each major category, it seems to have plotted every row separately. Could anyone please advise? Thanks in advance!

image

Hi @jinxianlum: to allow us in the community to better assist you, please ensure your query conforms to these guidelines.

Thanks!

Hello @jinxianlum

Thanks for posting.

Unfortunately, your question is not clear. Would you mind sharing more detail so that we can better assist you with your problem?

Thanks for understanding.

Best
K!

Hi All,

Refer to Page 1 of the Guided Project (3rd Question)

  • Do students in more popular majors make more money?
    • Using scatter plots
  • How many majors are predominantly male? Predominantly female?
    • Using histograms
  • **Which category of majors have the most students?**
    • **Using bar plots**

I am having difficulty using bar plots to work out which category of majors has the most students. When I call recent_grads.plot.bar(x='Major_category', y='Total', legend=False), the bar plot draws one bar for each row of the dataframe, which results in a distorted graph.

Could anyone please advise?

I hope my question is clear enough.

Thank you.

hi @jinxianlum

A screen link attachment would have made it much clearer. But try replacing bar with hist (the rest of the code stays as is).

Hi Rucha, thanks for your reply. However, the question in the Guided Project asks us to use bar plots to find out which category of majors most of the students are in, i.e. Engineering. If you use recent_grads["Major_category"].describe(), it readily shows Engineering as the top category.
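
One caveat worth keeping in mind: on a text column, describe() reports the most frequent value by row count, so "top" is the category with the most majors listed, which is not necessarily the category with the most students. A minimal sketch of the difference, using a made-up toy dataframe (the numbers below are invented for illustration and are not from the real recent_grads data):

```python
import pandas as pd

# Made-up numbers, for illustration only (not the real recent_grads data).
toy = pd.DataFrame({
    "Major_category": ["Engineering", "Engineering", "Engineering",
                       "Business", "Business"],
    "Total": [100, 200, 300, 2000, 3000],
})

# describe() on a text column counts rows: "top" is the most frequent value
# and "freq" is how many rows carry it.
summary = toy["Major_category"].describe()
print(summary["top"], summary["freq"])

# Summing Total within each category counts students instead of rows.
students = toy.groupby("Major_category")["Total"].sum()
print(students.idxmax(), students.max())
```

In this toy example the two answers differ, so it is probably safer to sum the Total column per category than to rely on describe().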

I have attached a screenshot of the code and bar plot for reference below.

hi @jinxianlum

A screen link was really needed here, because I was looking at the actual instructional pages for the projects rather than the text content, and the disrupted formatting in the above post added to the confusion.

I guess this is what you are asking about, from this page

image

Well, have you learned the GroupBy method in pandas? Using it here should make it possible to get this bar plot. Let me know, and we can either go through the groupby method or try something else.

image

Hi Rucha,

Yes! You have screenshotted the right page. It is the last question. Sorry for not being clear earlier. I have not learned the GroupBy method yet. Would you be able to show it?

Thank you!

JX

hi @jinxianlum

The official doc is here, and this article has a good detailed explanation here.

The simplest explanation that I could devise right now is as below:

import matplotlib.pyplot as plt

# recent_grads is assumed to be the dataframe already loaded earlier in the project.
# Column names are capitalized as in the question (Major_category, Total).

# groupby() splits recent_grads into one group per value of Major_category.
# Pass the column name, or a list of names, to group by.
recent_grads_group = recent_grads.groupby(["Major_category"])

# Printing it only shows a GroupBy object, not the grouped data
# (the grouping is evaluated lazily).
print(recent_grads_group, "\n")
print(recent_grads_group["Total"], "\n")

# We only need the sum of the Total column within each group.
total_students = recent_grads_group["Total"].sum()

# sum() returns a Series indexed by Major_category.
print(type(total_students), "\n")

# Use this Series to plot the bar chart; I used a horizontal bar plot.
total_students.plot(kind="barh", figsize=(8, 6))

plt.show()
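
If it helps, here is a more compact, self-contained sketch of the same idea that also sorts the totals so the largest category is easy to spot. The toy dataframe and its numbers are made up for illustration; with the real data you would use recent_grads in its place, assuming it has the Major_category and Total columns used in the question.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Made-up numbers, for illustration only (not the real recent_grads data).
toy = pd.DataFrame({
    "Major_category": ["Engineering", "Business", "Engineering",
                       "Arts", "Business"],
    "Total": [300, 2000, 200, 150, 3000],
})

# Sum Total within each category, giving one value per category, then sort
# ascending so the largest category ends up as the top bar of the
# horizontal plot (barh draws the first entry at the bottom).
totals = toy.groupby("Major_category")["Total"].sum().sort_values()

ax = totals.plot.barh(figsize=(8, 6))
ax.set_xlabel("Total students")
```

Outside a notebook, call plt.show() afterwards to display the figure.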

Hope this helps.

Thanks Rucha. This has clearly helped!
