Screen Link:
https://app.dataquest.io/m/146/guidedproject%3Avisualizingearningsbasedoncollegemajors/1/introduction
I don't understand what these columns in the dataset mean:
 Sample_size
 ShareWomen
 Median
By "meaning" I am asking what data these columns contain; there is no point in going further until I fully understand the dataset.
Hello @joshi.ananya.joshi1,
one can find the description of each column on the left side:

Sample_size
 Sample size (unweighted) of full-time.

ShareWomen
 Women as share of total.

Median
 Median salary of full-time, year-round workers.
Is this it, or did you mean more detailed information, e.g. which population Median, Sample_size, and ShareWomen are calculated for?
That is what the screen says, but I want to understand what it actually means. The descriptions given there are not clear to me.
I think I get it now.
So, the dataset is filled with survey results of “job outcomes of students who graduated from college between 2010 and 2012.”
The data is broken down by Major, so:

Sample_size
is not entirely clear; my educated guess is that it is the number of people whose earnings were used in the calculations. The GitHub repository from which recent_grads.csv is taken is more specific about Sample_size: Sample size (unweighted) of full-time, year-round ONLY (used for earnings) (bolded by kakoori)

ShareWomen
is the fraction of women relative to the total number of graduates in that major, e.g. for the “Molecular Biology” major there are 10874 women out of a total of 18300, so ShareWomen is 10874/18300 = 0.59420765, which rounds to 0.594208 at 6 decimal places.

Median
is the salary that splits the sample group in half with respect to earnings, e.g. for the “Molecular Biology” major the median is $40,000: 50% of full-time, year-round workers earn more than this and 50% earn less.
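If it helps, here is a small Python sketch of the two calculations. The ShareWomen figures are the ones quoted for “Molecular Biology”; the salary list is made up purely for illustration (it is not taken from recent_grads.csv), and share_women is just a helper name I chose.

```python
from statistics import median


def share_women(women: int, total: int) -> float:
    """Fraction of women among all graduates in a major, rounded to 6 decimals."""
    return round(women / total, 6)


# Figures quoted in this thread for the "Molecular Biology" major.
molecular_biology_share = share_women(10874, 18300)

# Hypothetical salaries, invented only to show what a median is.
example_salaries = [28000, 35000, 40000, 52000, 90000]
example_median = median(example_salaries)

# Count how many salaries fall on each side of the median.
below = sum(s < example_median for s in example_salaries)
above = sum(s > example_median for s in example_salaries)

print(molecular_biology_share)       # 0.594208
print(example_median, below, above)  # 40000 2 2
```

With an odd number of salaries the median is the middle value, so the same number of people earn less than it as earn more.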
Is this what you are looking for?