[ I would like to know your opinion on my ] Visualizing Earnings Based On College Majors. 🎒

Greetings,

I would like to share this notebook with all of you to get your opinion and to find out whether the observations I made while building the plots are correct. Where they are not, I would appreciate your corrections.

I have also included a link to the notebook. It’s the first time I’ve done it like this, so I hope everything works properly.

In the event that I can help you, do not hesitate to let me know. I look forward to hearing from you.

Thank you very much.

A&E

Hi @Edelberth! Thanks for sharing your project with the Community:) Well done for creating a detailed introduction where you describe the data set and ask questions you want to find the answers for!

Some comments from my side:

  • Your project is partially in Spanish. Could you translate it into English?
  • In [3], if you want to display the first row, you can use either .iloc[0] or .head(1). They are better than .iloc[:1]. Anyway, you display 3 rows in the very next code cell, so [3] is redundant
  • It’s good that you describe all technicalities of the project (like what matplotlib expects from us or what you assign to which variable) but do you think it’s the best way to present the project to someone without a technical background? In your more advanced future projects, it’s better to place technical details inside code cells as comments
  • The phrase “This graph explains how a minority of people between 0 and 2500 approximately taking into account that the scale fund with which we are working is 250000, work more than 35 hours” is not very clear. Do you mean the number of people between 0 and 25000, or those who have a median salary between 0 and 25000? It’s also better to make the title and axes labels clearer by saying that you are considering the median salary and the number of full-time workers. Don’t you also think that we may get more insights if we shrink the scale to 0-50000 full-time workers?
  • In section Men vs. Median, you say that you are using Unemployment_rate, but in the plot it’s the median salary?
  • You’ll generate greater insights if you reduce the scale of your scatter plots. Most of them are skewed towards smaller values so we lose information there
  • In section 2.1. Matplotlib, much more detailed plots., you basically repeat the same code over and over again, changing only the r variable and the number of bins. Could you think of a way to reduce this repetition?
  • The histogram plots lack titles, and the last superimposed plot (is it the frequency of women and men?) has no labels, which makes it unclear
  • The Mayors category type - it’s Major
  • Quickly we observe that women surpass men and that these are more focused on fields other than men. - what do you mean by fields?
  • The number of women is almost double the number of men who are studying. - I don’t think the conclusion is totally correct. “The number of women who are studying in the most popular women’s major is almost double the number of men who are studying in the most popular men’s major” would be more correct
  • The plot in Plots using matplolib by type of major and gender. lacks a title and is pretty confusing. You have a lot of categories on the x-axis that are difficult to read. It’s better if you can reduce the number of categories and concentrate on the most relevant ones (could they be the ones with the highest number of employees?)
  • It’s better to do all the imports in the first code cell for better readability
  • and also the performance ratio while observing the type of correlation that exists between the three sets - what do you mean by performance ratio? Also, could you briefly describe the correlations you see?
  • Your final plots do not have any title or axes labels
  • The plot of the Unemployment Rate vs ShareWomen has very long numbers on the x-axis. It’s better to group unemployment rates into distinct categories separated by unemployment units (for example, every 0.05 points) and plot those instead
  • I don’t understand what the hexbin plot is about. It’s better if you could briefly describe what conclusions can be made using the categories you used to draw the plot
  • At the beginning of the project, you asked some questions you want to answer in the project. The conclusions section is a great place to sum up the findings of the project
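
On the repetition point above, here is a minimal sketch of one option: keep the column names and bin counts in a list and loop over it, setting a title and axis labels on each histogram along the way. The DataFrame `toy`, its values, and the bin counts are made up for illustration. The column names are the ones mentioned in this thread and may not match your notebook exactly, and the y-axis label assumes each row is one major.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Toy data, invented for illustration only; the notebook would use its own DataFrame.
toy = pd.DataFrame({
    "Median": [30000, 36000, 40000, 52000, 61000, 75000],
    "Unemployment_rate": [0.04, 0.06, 0.05, 0.09, 0.07, 0.03],
    "ShareWomen": [0.12, 0.35, 0.48, 0.61, 0.77, 0.90],
})

# One entry per histogram: (column name, number of bins). Bin counts are arbitrary here.
hist_specs = [("Median", 5), ("Unemployment_rate", 4), ("ShareWomen", 3)]

fig, axes = plt.subplots(len(hist_specs), 1, figsize=(6, 9))
for ax, (column, bins) in zip(axes, hist_specs):
    ax.hist(toy[column], bins=bins)
    ax.set_title(f"Distribution of {column}")
    ax.set_xlabel(column)
    ax.set_ylabel("Number of majors")  # assumes one row per major
fig.tight_layout()
```

Adding another histogram then only means adding one more tuple to `hist_specs`.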

I hope I was helpful. Happy coding and good luck with your next projects, @Edelberth :smile:

Thank you, @artur.sannikov96, for being so clear :face_with_monocle: and for the time and effort you have put in. I assure you that I will keep your comments close at hand.

I have noted what you told me, and your comments will guide the changes I make from here on.

I am very grateful to you… you have a friend :slightly_smiling_face: around here.

A&E.

Hello @artur.sannikov96 !

Thanks to your comments, I think I have been able to improve the project you saw earlier. I mention this because it may seem that the time one spends writing comments goes unnoticed, but that is not the case.

I’m leaving you the link so you can see what you think.

Thanks! :wink:

A&E.

Hey @Edelberth! Well done on improving the project :slight_smile:

I’m glad to comment on it:

  • It’s not clear to me why you are checking the encoding of the file. Did you experience any issue opening it before?
  • Providing a data dictionary is a great idea but you can also briefly describe the data: like, this is the data about fresh graduates from 16 majors in American colleges with demographic info on their employment status :slight_smile:
  • The amount of women is higher than men. — the number, not the amount. “Amount” is used for uncountable nouns
  • Increase the x and y label size for your plots, and remove the top and right spines; it’ll considerably improve the readability
  • The average income of full-time workers throughout the year is a small portion of the total number of working men. — what does it mean? How can the average income be a portion of the total number of working men?
  • el percentil se basa en la ordenación de los elementos de menor a mayor hasta llegar al porcentage que se requiera, en esta caso vienen determinados por el 25% 50% que es la mediana y el 75%: — translate this sentence into English
  • I also believe that salaries are not prices :slight_smile: I’m referring to some variable names you have
  • Common major sorted by genres. — Genres? Do you mean “genders”?
  • You could have also checked how the number of employees changes depending on major/industry sector
  • What do you mean by Comparison of history between men and women. ? What is history?
  • When you are checking which majors are predominantly male why don’t you use percentages? Raw numbers do not tell us if there is any predominance but depend only on the total number of employees in those majors. Two identical percentages will give two different fractioned raw numbers if the totals are different. For example, you have “Psychology” both predominantly male and female which does not make any sense
  • In Which category of majors have the most students? provide a title and axes labels. The legends are also not necessary if you tell whether you are talking about men or women in the title.
  • Quickly we observe that women surpass men and that these are more focused on fields other than men. - I have already told you about this in my previous comment. What does “fields” mean?
  • What are you trying to show with the logarithmic plots?
  • You have a lot of plots for different major categories. You can improve them by ordering different subcategories by the number of employees
  • Comparing percentages of women (ShareWomen) from the first ten rows and last ten rows. — why do you compare them?
  • After [90], why don’t you describe the plot “Sample size - Median”?
  • Few women in the total have access to the full spectrum of salaries Students in the most popular careers earn money the worst paid is psychology the one that is in the average is general business the best paid is nursing — add some punctuation marks
  • which we can see in the hexagonal plot of containers. — what do you mean by “containers”?
  • this will have a very important impact on all levels of future society. - what is “this”? Your project? The conclusions?
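
To illustrate the point about percentages versus raw numbers, here is a minimal sketch. The three majors and their counts are made up, and the column names Men, Women, Total and ShareMen are assumptions for the example rather than names taken from your notebook.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Major": ["A", "B", "C"],
    "Men": [900, 50, 400],
    "Women": [2100, 50, 100],
})
toy["Total"] = toy["Men"] + toy["Women"]
toy["ShareMen"] = toy["Men"] / toy["Total"]  # fraction of men in each major

# Ranking by raw counts rewards large majors, whatever their gender balance.
by_raw_count = toy.sort_values("Men", ascending=False)["Major"].tolist()

# Ranking by share shows where men actually predominate.
by_share = toy.sort_values("ShareMen", ascending=False)["Major"].tolist()

# A major is predominantly male only if more than half of its students are men.
predominantly_male = toy.loc[toy["ShareMen"] > 0.5, "Major"].tolist()
```

In this toy table, major A has the most men in raw numbers even though 70% of its students are women, which is the same effect that can put one major in both the "predominantly male" and "predominantly female" lists.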

Hope my feedback will help you improve the project. Happy coding :slight_smile:

Yep! @artur.sannikov96

Thank you for looking at my work and answering. I know that an email like this takes a considerable amount of time.

  • It’s not clear to me why you are checking the encoding of the file. Did you experience any issue opening it before?

I understand. I did this because it seems to me a very good practice to check whether the dataset we are working with is OK, and I don’t think it hurts to include the check even when we already know there is no problem.

  • Providing a data dictionary is a great idea but you can also briefly describe the data: like, this is the data about fresh graduates from 16 majors in American colleges with demographic info on their employment status :slight_smile:

Certainly, I think it is a very good practice because it lets me better introduce the reader to the context.

  • The amount of women is higher than men. — the number, not the amount. “Amount” is used for uncountable nouns.

Thank you for the nuance; understood. To be honest, I had missed it.

  • Increase the x and y label size for your plots, and remove the top and right spines; it’ll considerably improve the readability.

I take note :memo:
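
For reference, a minimal sketch of that suggestion, assuming a plain matplotlib Axes. The three points, the font sizes, and the label texts are made up for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([0.1, 0.4, 0.8], [60000, 45000, 32000])  # toy points, not real data

# Larger axis labels and tick labels; the sizes are arbitrary choices.
ax.set_xlabel("Share of women", fontsize=14)
ax.set_ylabel("Median salary", fontsize=14)
ax.tick_params(labelsize=12)

# Hide the top and right spines, keep the left and bottom ones.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```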

  • The average income of full-time workers throughout the year is a small portion of the total number of working men. — what does it mean? How can the average income be a portion of the total number of working men?

… once again Shakespeare lied to me :hushed:

  • el percentil se basa en la ordenación de los elementos de menor a mayor hasta llegar al porcentage que se requiera, en esta caso vienen determinados por el 25% 50% que es la mediana y el 75%: — translate this sentence into English

…Shakespeare :grimacing:

  • I also believe that salaries are not prices :slight_smile: I’m referring to some variable names you have

:no_mouth:

  • Common major sorted by genres. — Genres? Do you mean “genders”?

It seems. :see_no_evil:

  • You could have also checked how the number of employees changes depending on major/industry sector

Of course, but after spending several days in front of this analysis I realize that some things escape me.

Without a doubt, the devil is in the details. :japanese_ogre:

  • What do you mean by Comparison of history between men and women. ? What is history?

I meant to say “comparison of histograms between…”

Graphs placed next to each other so that you can better appreciate the distribution for each gender.

  • When you are checking which majors are predominantly male why don’t you use percentages?

Simply because I didn’t take it into account, thanks for telling me.

  • Raw numbers do not tell us if there is any predominance but depend only on the total number of employees in those majors. Two identical percentages will give two different fractioned raw numbers if the totals are different. For example, you have “Psychology” both predominantly male and female which does not make any sense

It seems that I know less than I thought, so I will have to review those concepts again.

  • In Which category of majors have the most students? provide a title and axes labels. The legends are also not necessary if you tell whether you are talking about men or women in the title.

It is true that this is not the first time you have told me this. Without meaning it as an excuse, I have realized where it comes from: I separate the cells with headings and state the subject there, so after seeing the notebook 100 times I am guided by the cell heading and not by the plot title.

  • Quickly we observe that women surpass men and that these are more focused on fields other than men. - I have already told you about this in my previous comment. What does “fields” mean?

You don’t understand? When I refer to the fields of one and the other, I mean their scope, that is, the areas each group goes into. It looks like this is again a mistake due to Castilianization. I will try to be less ambiguous if that is what makes it hard to understand.

  • What are you trying to show with the logarithmic plots?

I found it interesting to see which function represented the behavior, and that it was not linear. I certainly should have explained it, but I did not think it was appropriate because the topic is not covered until much later.

  • You have a lot of plots for different major categories. You can improve them by ordering different subcategories by the number of employees

I will try what you suggest and, if that is fine with you, I will show you the result so you can comment on it.
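
A minimal sketch of that ordering, assuming the DataFrame has columns named Major_category, Major and Employed. Those names and all the values below are assumptions made up for illustration.

```python
import pandas as pd

# Toy data, invented for illustration only.
toy = pd.DataFrame({
    "Major_category": ["Arts", "Arts", "Engineering", "Engineering", "Health"],
    "Major": ["a", "b", "c", "d", "e"],
    "Employed": [100, 300, 250, 50, 500],
})

# Keep categories together, and within each one put the majors
# with the most employees first.
ordered = toy.sort_values(
    ["Major_category", "Employed"], ascending=[True, False]
)

# One category at a time, already in plotting order.
engineering = ordered[ordered["Major_category"] == "Engineering"]
```

Plotting each category from a frame sorted like this should make the bars read from the largest major down to the smallest.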

  • Comparing percentages of women (ShareWomen) from the first ten rows and last ten rows. — why do you compare them?

It looks like either:

  • I forgot to explain it, or

  • since the notebook is long, I preferred to make the comparison at the point where I was working, to avoid scrolling up and down.

  • After [90], why don’t you describe the plot “Sample size - Median”?

True, there is no reason (fatigue).

  • Few women in the total have access to the full spectrum of salaries Students in the most popular careers earn money the worst paid is psychology the one that is in the average is general business the best paid is nursing — add some punctuation marks

True.

  • which we can see in the hexagonal plot of containers. — what do you mean by “containers”?

I read somewhere that the hexagonal cells were called that (containers); that is the only reason.

I’ll correct it and continue.

  • this will have a very important impact on all levels of future society. - what is “this”? Your project? The conclusions?

Of course (the conclusions); “this” refers to what I have just explained.

It’s easy to understand from the context, isn’t it?

I do not see how this can give rise to any kind of confusion.

Conclusions:

  • I will not deny that sometimes it has been difficult for me to understand what the instructions ask for, and that many of the things you have pointed out are a consequence of that.

  • Fatigue and Shakespeare often come together.

  • Percentages and plot titles: as soon as I finish what I am doing, I will get to them. :memo:

  • I have realized (has it happened to you?) that when I redo a project I can sometimes apply many more (powerful) things than I knew when I first did it, but I also run the risk of drifting away from the original task completely, and sometimes it’s not easy to tell.

Once again, thank you for the time you have dedicated. Without you I wouldn’t have gotten here. Your emails are like medicine 😖: difficult to take, but they heal 😁.

Thank you mate, I hope :place_of_worship: I can help you one day.

Let’s get back to work.

A&E.

Hello, I’m not trying to review your project, but from reading your discussions and replies in this thread, I do have to say that you’re a very positive and inspiring person. I especially like your receptivity to feedback and your dedication to improving based on that feedback, even when you admitted that it can be difficult to take.

Your overall positive attitude is something I aspire to have one day, but I wish I had it right now.

Anyhow, keep on being awesome @Edelberth.

Hello @wanzulfikri :tophat:

I am very grateful for your words, but do not forget that the helper is making a (big) effort to understand what I have done and is spending his own time to answer. For me that is the really awesome part. I cannot be more grateful, so everything else is just noise. :boom:

It’s a pleasure to have you here. :milky_way:

A&E. :selfie:

True, the helper is very helpful, but feedback is a collaborative effort between the helper and receiver. It won’t work if only one side is doing all the work. In terms of giving feedback, many feedback givers really want to help, but most of the time the receiver is very resistant or not responsive to it, so it feels as if giving feedback is useless. In other words, a lot of people want to help others, but sometimes they feel as if the receiver doesn’t appreciate them. This focus on the receiver of the feedback rather than the giver is something preached by the book Thanks for the Feedback.

Maybe a better way to phrase my compliment would be to say that both you and @artur.sannikov96 should be commended for the constructive discussion.

Cheers.
