Completely Lost on Matplotlib

Screen Link: https://app.dataquest.io/m/149/guided-project%3A-visualizing-the-gender-gap-in-college-degrees/2/comparing-across-all-degrees

fig = plt.figure(figsize=(16, 20))

## Generate first column of line charts. STEM degrees.
for sp in range(0,18,3):
    cat_index = int(sp/3)
    ax = fig.add_subplot(6,3,sp+1)
    ax.plot(women_degrees['Year'], women_degrees[stem_cats[cat_index]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[stem_cats[cat_index]], c=cb_orange, label='Men', linewidth=3)
    for key,spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0,100)
    ax.set_title(stem_cats[cat_index])
    ax.tick_params(bottom=False, top=False, left=False, right=False)
    
    if cat_index == 0:
        ax.text(2003, 85, 'Women')
        ax.text(2005, 10, 'Men')
    elif cat_index == 5:
        ax.text(2005, 87, 'Men')
        ax.text(2003, 7, 'Women')

## Generate second column of line charts. Liberal arts degrees.
for sp in range(1,16,3):
    cat_index = int((sp-1)/3)
    ax = fig.add_subplot(6,3,sp+1)
    ax.plot(women_degrees['Year'], women_degrees[lib_arts_cats[cat_index]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[lib_arts_cats[cat_index]], c=cb_orange, label='Men', linewidth=3)
    for key,spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0,100)
    ax.set_title(lib_arts_cats[cat_index])
    ax.tick_params(bottom=False, top=False, left=False, right=False)
    
    if cat_index == 0:
        ax.text(2003, 78, 'Women')
        ax.text(2005, 18, 'Men')

## Generate third column of line charts. Other degrees.
for sp in range(2,20,3):
    cat_index = int((sp-2)/3)
    ax = fig.add_subplot(6,3,sp+1)
    ax.plot(women_degrees['Year'], women_degrees[other_cats[cat_index]], c=cb_dark_blue, label='Women', linewidth=3)
    ax.plot(women_degrees['Year'], 100-women_degrees[other_cats[cat_index]], c=cb_orange, label='Men', linewidth=3)
    for key,spine in ax.spines.items():
        spine.set_visible(False)
    ax.set_xlim(1968, 2011)
    ax.set_ylim(0,100)
    ax.set_title(other_cats[cat_index])
    ax.tick_params(bottom=False, top=False, left=False, right=False)
    
    if cat_index == 0:
        ax.text(2003, 90, 'Women')
        ax.text(2005, 5, 'Men')
    elif cat_index == 5:
        ax.text(2005, 62, 'Men')
        ax.text(2003, 30, 'Women')
        
plt.show()

I am currently working on one of the Matplotlib guided projects and spent a little over an hour trying to figure it out before I ended up looking at the solution. This is the most lost I’ve been in this course so far. I do not understand how range is working here, or what cat_index is doing. Could someone help me understand what this code is doing? Maybe I missed it in the course, but I don’t remember learning it in this manner. I tried looking into the documentation before looking at the solution but was not successful. How could I have figured this out on my own?

3 Likes

Hey @burnsdillion

Just a simple tip: when you are unsure of a for-loop or any variable, print the variables being used to see their values. For example:

for sp in range(0, 18, 3):
    cat_index = int(sp/3)
    # some code

Try to modify the code like this:

for sp in range(0, 18, 3):
    print(sp)
    cat_index = int(sp/3)
    print(cat_index)
    # some code

That way you can see how the values change. Even if it does not make the whole solution clear right away, it lets you experiment and understand a bit more.

Regarding “range”: it is a built-in Python function that takes 1 required parameter (stop) and 2 optional parameters (start and step).
A simple explanation is on W3Schools via the link below:
https://www.w3schools.com/python/ref_func_range.asp

So range(0, 18, 3) means: start at 0, go up in steps of 3, and stop before reaching 18.

So print(sp) would yield 0, 3, 6, 9, 12, 15. Not 18, because the stop value is excluded: the loop ends once the next value would reach 18.
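As a quick check of the point about the stop value, here is a minimal sketch that only uses the built-in range. The variable name stem_sp is made up for illustration.

```python
# range(start, stop, step): the stop value is exclusive, so 18 is never produced.
stem_sp = list(range(0, 18, 3))
print(stem_sp)        # [0, 3, 6, 9, 12, 15]
print(len(stem_sp))   # 6, one value per STEM category

# Any stop from 16 to 18 gives the same six values.
print(list(range(0, 16, 3)) == stem_sp)  # True
```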

Now coming to cat_index = int(sp/3)

This will result in:
SP = 0 >> cat_index = int(0/3) = 0
SP = 3 >> cat_index = int(3/3) = 1
SP = 6 >> cat_index = int(6/3) = 2

SP = 15 >> cat_index = int(15/3) = 5

Now cat_index gives you the major at that index in stem_cats, e.g. stem_cats[0] is 'Psychology'.

But the subplots are numbered from left to right, then top to bottom, like this:
[image: grid of 6 rows and 3 columns of numbered subplots]
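The numbering can also be printed as text. This is a small sketch, assuming the 6 x 3 layout from the project; the names n_rows, n_cols, and grid are only for illustration.

```python
n_rows, n_cols = 6, 3

# Subplot numbers start at 1 and run left to right, then top to bottom.
grid = [[row * n_cols + col + 1 for col in range(n_cols)]
        for row in range(n_rows)]

for row in grid:
    print(" ".join(f"{n:2d}" for n in row))

# The first column holds the positions the STEM charts need.
first_column = [row[0] for row in grid]
print(first_column)  # [1, 4, 7, 10, 13, 16]
```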

So we need a way to plot the STEM majors in column 1 only, from row 1 to row 6.
This is where ax = fig.add_subplot(6,3,sp+1) comes to the rescue.

Here rows = 6 and columns = 3, just like our grid, and on each iteration sp is incremented by 3 via the range function.

So the SP + 1 would yield this result:

SP = 0 >> SP + 1 = 0 + 1 = 1
SP = 3 >> SP + 1 = 3 + 1 = 4
.......
SP = 15 >> SP + 1 = 15 + 1 = 16

which would give us a correct subplot number to plot the Majors.
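To see all three values side by side, here is a sketch of the same loop with the plotting removed. The rows list is only there to collect the results; stem_cats is the list from the project.

```python
stem_cats = ['Psychology', 'Biology', 'Math and Statistics',
             'Physical Sciences', 'Computer Science', 'Engineering']

rows = []
for sp in range(0, 18, 3):
    cat_index = int(sp / 3)   # index into stem_cats: 0..5
    position = sp + 1         # subplot number in the 6 x 3 grid
    rows.append((sp, cat_index, position, stem_cats[cat_index]))
    print(sp, cat_index, position, stem_cats[cat_index])
```

The first line printed is 0 0 1 Psychology and the last is 15 5 16 Engineering, so every STEM major lands in the first column.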

Let me know if this has helped you in any way.

26 Likes

Thank you so much!! I think I was more confused with the range() method. I’m not sure why but this cleared it right up.

1 Like

hey @burnsdillion

Glad I could help a fellow learner. :slight_smile:

2 Likes

hello @Rucha, why would we use
for sp in range(0,18,3):

Our categories have already been divided into several subject groups, which have a range of (0,6) for STEM,
(0,5) for lib_arts and
(0,6) for other.

Why are we now using a range of (0,18) while incrementing by 3?

hey @nnabugwukelvin.chukw

We need a matrix that has 18 subplots divided into 3 columns and 6 rows, and the subplots would start at 1.

So “sp” helps us to achieve that without hardcoding the subplot positions.

To find the index position within each category list, “sp” is then used to calculate the value of the variable "cat_index".
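Another way to look at it, as a sketch only: loop over the row number (0 to 5) and compute the subplot position from row and column. The helper subplot_position below is hypothetical, not part of matplotlib or the project.

```python
def subplot_position(row, col, n_cols=3):
    """Hypothetical helper: 1-based add_subplot number for a 0-based (row, col)."""
    return row * n_cols + col + 1

# Column 0 (STEM), rows 0..5
stem_positions = [subplot_position(row, 0) for row in range(6)]

# The project's loop reaches the same numbers with sp + 1
from_solution = [sp + 1 for sp in range(0, 18, 3)]

print(stem_positions)                   # [1, 4, 7, 10, 13, 16]
print(stem_positions == from_solution)  # True
```

Here the row number plays the role of cat_index, and sp is simply row * 3 plus the column offset.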

6 Likes

This is so helpful. Thank you!

2 Likes

Thank you for the thorough explanation, this really helps a lot.

1 Like

Hi, @Rucha, I guess it’s never too late to say thank you! Great explanation.

Up until now I feel like most of the course is quite well explained but this point really struck me. It’s nice though to see how well thought loops can help automate most things.

Thanks again!

1 Like

hey @pcamposginer

Thank you for your kind words. and nope, you are at your own pace so there’s no question of late. I am just glad this post is helping fellow students like me!

Yup, a well thought out and implemented code can really help us a lot!

keep learning and doing awesome.

1 Like
# Group the degrees into STEM, liberal arts, and other:

stem_cats = ['Psychology', 'Biology', 'Math and Statistics', 'Physical Sciences', 'Computer Science', 'Engineering']
lib_arts_cats = ['Foreign Languages', 'English', 'Communications and Journalism', 'Art and Performance', 'Social Sciences and History']
other_cats = ['Health Professions', 'Public Administration', 'Education', 'Agriculture','Business', 'Architecture']

# Set figure width to 16 inches and height to 20.
fig = plt.figure(figsize=(16, 20))


## Generate second column of line charts. Liberal arts degrees.
    for sp in range(1,16,3):
        # sp        1,4,  7,  10, 13, 16
        # cat_index 0,1/3,6/3, 3,  4, 5 
        cat_index = int((sp-1)/3)
        
        #plots have arguments (6, 3) 
        #denoting that the second column has only two rows, 
        #the position parameters move row wise
        
       
        ax = fig.add_subplot(6,3,sp+1)
        ax.plot(women_degrees['Year'], women_degrees[lib_arts_cats[cat_index]],
                c=cb_dark_blue, label='Women', linewidth=3)

Could you please explain the lib arts for loop?
It seems you would get decimals here, and I do not understand what that is doing.
Is the third argument a position parameter giving the number of the plot?

I have seen add_subplot(6,3,sp+1) explained like this:

The third number in each call indicates which axis object to return, starting from 1 at the top left, increasing to the right .

Also, in the loop before this for STEM some positions were filled, and I don’t understand why these plots would not be overwritten, unless there is no overlap?

#The first for loop 
    ## Generate first column of line charts. STEM degrees.
    #range(start, stop[, step])
    for sp in range(0,18,3):

#The second for loop 
for sp in range(1,16,3):
    # sp        1,4,  7,  10, 13, 16
    # cat_index 0,1/3,6/3, 3,  4, 5 
    cat_index = int((sp-1)/3)

hi @jamesberentsen

the float values will get converted to integers by the int() method.

let’s demo that using this simple code:

# range(1, 16, 3) stops before 16, so sp takes five values
sp_values = [1, 4, 7, 10, 13]

for sp in sp_values:
    # converting the float to int by wrapping the division in int()
    div_by_3_as_int = int((sp - 1) / 3)

    # not using int() for conversion
    div_by_3 = (sp - 1) / 3

    # printing the two results side by side
    print(div_by_3_as_int, "|", div_by_3)

This would result in the following:
0 | 0.0
1 | 1.0
2 | 2.0
3 | 3.0
4 | 4.0
so the indices into the liberal arts list (cat_index) will run from 0 to 4, one for each of its five categories.
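As an aside, the division is applied to sp - 1, not to sp, so the result is always a whole number before int() touches it. Floor division (//) would give the same indices without creating a float at all; this is an alternative spelling, not what the project solution uses.

```python
# Same indices as int((sp - 1) / 3), written with floor division.
indices = [(sp - 1) // 3 for sp in range(1, 16, 3)]
print(indices)  # [0, 1, 2, 3, 4]

# Check that both spellings agree for every sp in the loop.
for sp in range(1, 16, 3):
    assert int((sp - 1) / 3) == (sp - 1) // 3
```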

1 Like

Hi @jamesberentsen

Oops, my bad on that part. I mixed it up with some other query at hand. I have removed it from the previous post so that it does not create further confusion.

I am really not sure where exactly your confusion is though. You marked @jenil2452000’s answer as the solution in the other post you created, and he explained the same thing. There is no overlap because each loop starts sp at a different offset (0, 1, or 2) and then increases it by 3. So:

  • STEM category takes the values 1, 4, 7, 10, 13, 16 as subplot positions,
  • Liberal_Arts category takes the values 2, 5, 8, 11, 14 (position 17 stays empty because there are only five liberal arts categories) and
  • Other category takes the values 3, 6, 9, 12, 15, 18

Since sp starts at 0, 1, or 2 and is increased by 3 each time, sp+1 starts at 1, 2, or 3, is increased by 3, and never reaches 19.
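The lack of overlap can be checked without drawing anything. This sketch reuses the three range calls from the project solution; the variable names are only for illustration. Note that range(1, 16, 3) produces five values, so position 17 is never used.

```python
stem = [sp + 1 for sp in range(0, 18, 3)]      # first column
lib_arts = [sp + 1 for sp in range(1, 16, 3)]  # second column
other = [sp + 1 for sp in range(2, 20, 3)]     # third column

print(stem)      # [1, 4, 7, 10, 13, 16]
print(lib_arts)  # [2, 5, 8, 11, 14]
print(other)     # [3, 6, 9, 12, 15, 18]

used = set(stem) | set(lib_arts) | set(other)
# No position is used twice: the set is as large as the three lists combined.
print(len(used) == len(stem) + len(lib_arts) + len(other))  # True

unused = sorted(set(range(1, 19)) - used)
print(unused)  # [17]
```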

We are seriously going in a recursive loop about this project. Are you by chance Vik Paruchuri in disguise and testing all of us!? :thinking: :open_mouth:

1 Like

Thank you for clarifying the answers. I had trouble with cat_index: I was using stem_cat[i] in the code and got a list-index-out-of-range error because my add_subplot call was (6,3,1+i). At first I did it differently, increasing both the range and the add_subplot() position and putting all the majors in one list, but then I learned the project's way and changed it.

1 Like

I did it differently by using indexing. Seemed more logical to me.
import pandas as pd
import matplotlib.pyplot as plt

women_degrees = pd.read_csv('percent-bachelors-degrees-women-usa.csv')
cb_dark_blue = (0/255,107/255,164/255)
cb_orange = (255/255, 128/255, 14/255)

stem_cats = ['Psychology', 'Biology', 'Math and Statistics', 'Physical Sciences', 'Computer Science', 'Engineering']
lib_arts_cats = ['Foreign Languages', 'English', 'Communications and Journalism', 'Art and Performance', 'Social Sciences and History']
other_cats = ['Health Professions', 'Public Administration', 'Education', 'Agriculture','Business', 'Architecture']
fig,ax = plt.subplots(6,3,figsize=(10,16))

#for sp in range(0,6):
#ax = fig.add_subplot(6,3,18)
for sp in range(0,6):
    ax[sp][0].plot(women_degrees['Year'], women_degrees[stem_cats[sp]],
        c=cb_dark_blue, label='Women', linewidth=3)
    ax[sp][0].plot(women_degrees['Year'], 100-women_degrees[stem_cats[sp]],
        c=cb_orange, label='Men', linewidth=3)
    ax[sp][0].spines["right"].set_visible(False)
    ax[sp][0].spines["left"].set_visible(False)
    ax[sp][0].spines["top"].set_visible(False)
    ax[sp][0].spines["bottom"].set_visible(False)
    ax[sp][0].set_xlim(1968, 2011)
    ax[sp][0].set_ylim(0,100)
    ax[sp][0].set_title(stem_cats[sp])
    ax[sp][0].tick_params(bottom=False, top=False, left=False, right=False)

for lib in range(0,5):

    ax[lib][1].plot(women_degrees['Year'], 
            women_degrees[lib_arts_cats[lib]],
        c=cb_dark_blue, label='Women', linewidth=3)
    ax[lib][1].plot(women_degrees['Year'],
            100-women_degrees[lib_arts_cats[lib]], 
        c=cb_orange, label='Men', linewidth=3)
    ax[lib][1].spines["right"].set_visible(False)    
    ax[lib][1].spines["left"].set_visible(False)
    ax[lib][1].spines["top"].set_visible(False)    
    ax[lib][1].spines["bottom"].set_visible(False)
    ax[lib][1].set_xlim(1968, 2011)
    ax[lib][1].set_ylim(0,100)
    ax[lib][1].set_title(lib_arts_cats[lib])
    ax[lib][1].tick_params(bottom=False, top=False, left=False, right=False)

# The sixth cell in this column has no category; hide its spines and ticks once.
ax[5][1].spines['top'].set_visible(False)
ax[5][1].spines['bottom'].set_visible(False)
ax[5][1].spines['right'].set_visible(False)
ax[5][1].spines['left'].set_visible(False)
ax[5][1].tick_params(bottom=False, top=False, left=False, right=False)

for oth in range(0,6):

    ax[oth][2].plot(women_degrees['Year'], women_degrees[other_cats[oth]],
        c=cb_dark_blue, label='Women', linewidth=3)
    ax[oth][2].plot(women_degrees['Year'], 100-women_degrees[other_cats[oth]], 
        c=cb_orange, label='Men', linewidth=3)
    ax[oth][2].spines["right"].set_visible(False)    
    ax[oth][2].spines["left"].set_visible(False)
    ax[oth][2].spines["top"].set_visible(False)    
    ax[oth][2].spines["bottom"].set_visible(False)
    ax[oth][2].set_xlim(1968, 2011)
    ax[oth][2].set_ylim(0,100)
    ax[oth][2].set_title(other_cats[oth])
    ax[oth][2].tick_params(bottom=False, top=False, left=False, right=False)

ax[0][0].text(2005, 87, 'Men')
ax[0][0].text(2002, 8, 'Women')

ax[5][2].text(2005, 62, 'Men')
ax[5][2].text(2001, 35, 'Women')

plt.show()

I was also struggling with grouping the degrees into columns, and after spending more than an hour I looked at the solution. Thank you Rucha for the explanation. However, I totally agree with burnsdillion on this. How are we supposed to get to this solution if this was not covered in the course? Unless I am missing something. It’s just a bit frustrating.

The same situation. I couldn’t make a grid, for 3 hours, looked at previous materials