Help - Coding Question - Guided Project: Visualizing the Gender Gap in College Degrees

Code:

for sp in range(0,18,3):
    cat_index = int(sp/3)
for sp in range(1,16,3):
    cat_index = int((sp-1)/3)
for sp in range(2,20,3):
    cat_index = int((sp-2)/3)

How did people even think of using these ranges? I understand why they work after reading an explanation in someone’s post, but I’m still confused about how anyone came up with them.
Can someone please walk me through the step-by-step thought process behind filling the 6-row by 3-column grid of subplots for each category?
Thank you!

It’s just working backwards from the desired output.

We are trying to add plots per column. That’s why we use add_subplot().

However, add_subplot() adds a plot based on the location we specify using an index. And what position does an index correspond to?

Well, it’s like a grid: positions are numbered left to right along each row, and the rows run from top to bottom.

  • Psychology is at index/location/position 1.
  • Foreign Languages is at index/location/position 2.
  • Health Professions is at index/location/position 3.
  • Biology is at index/location/position 4, and so on.

So, if we want to add plots per column, we start with the first column. The first plot will be at location 1, and the second would be at location 4.
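As a rough sketch of that numbering (the helper name `position_to_row_col` is made up for illustration and is not part of matplotlib), a 1-based position in a 3-column grid can be turned into a 0-based row and column with `divmod`:

```python
# Convert a 1-based add_subplot() position into a 0-based (row, column) pair
# for a grid numbered left to right along each row, then top to bottom.
NCOLS = 3

def position_to_row_col(position, ncols=NCOLS):
    # Hypothetical helper for illustration only.
    return divmod(position - 1, ncols)

# Every position in the 6x3 grid whose column is 0, i.e. the first column.
first_column = [pos for pos in range(1, 19) if position_to_row_col(pos)[1] == 0]
print(first_column)  # [1, 4, 7, 10, 13, 16]
```

So the first column holds positions 1, 4, 7, 10, 13 and 16, which is where the step of 3 comes from.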

And that’s what we do -

for sp in range(0,18,3):
    cat_index = int(sp/3)
    ax = fig.add_subplot(6,3,sp+1)

range(0,18,3) gives us the values -

0, 3, 6, 9, 12, 15

fig.add_subplot(6,3,sp+1) places a plot in a 6-by-3 grid (6 rows, 3 columns). Each column will have 6 plots, and range(0,18,3) gives us exactly 6 values.

But, from the example I shared above, the location of the plots starts with 1.

That’s why we use sp+1.

The above specifies where the plots will be created in the grid.

But what plots are being created per column? We get that information thanks to the stem_cats list for the first column.

Since we are already inside a for loop, how can we try to utilize it to our benefit so that we can simply iterate over the stem_cats list?

Well, sp increases by 3 every iteration. So, we can simply divide it by 3 to give us the indices -

0, 1, 2, 3, 4, 5

And we can then use those indices to access elements from stem_cats. That’s what the following -

cat_index = int(sp/3)

helps us with. We then use cat_index in women_degrees[stem_cats[cat_index]] for example.
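To see the two numbers side by side, here is a small sketch that only prints the bookkeeping and draws nothing. The order of `stem_cats` is taken from the list quoted later in this thread, and `placements` is just an illustrative name:

```python
stem_cats = ['Psychology', 'Biology', 'Math and Statistics',
             'Physical Sciences', 'Computer Science', 'Engineering']

placements = []
for sp in range(0, 18, 3):
    cat_index = int(sp / 3)   # 0, 1, 2, 3, 4, 5
    position = sp + 1         # 1, 4, 7, 10, 13, 16 (add_subplot is 1-based)
    placements.append((position, cat_index, stem_cats[cat_index]))

for position, cat_index, name in placements:
    print(position, cat_index, name)
```

The first line printed is `1 0 Psychology` and the last is `16 5 Engineering`, so each STEM category lands one row further down the first column.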

Note:

The above is not necessarily the most straightforward or the best way. You might be able to come up with something better or more efficient. But the process does start with thinking about the end result and working backwards to come up with some basic math to get that end result.

Hi, @laurenk9304. I also struggled a bit with this part because I initially started with just generating the subplots by subplot index (0 to 17) but realized that I was mistaken because the categories were supposed to be plotted by column.

Let me share my own thought process for creating the for loops that generate the subplots. I started by recognizing the structure of the grid and the index of each subplot, which is recreated below:

col 1   col 2   col 3
0       1       2
3       4       5
6       7       8
9       10      11
12      13      14
15      16      17

I then looked at the indices of the items within the category lists, which can be seen below:

['Psychology', 'Biology', 'Math and Statistics', 'Physical Sciences', 'Computer Science', 'Engineering']
['Foreign Languages', 'English', 'Communications and Journalism', 'Art and Performance', 'Social Sciences and History']
['Health Professions', 'Public Administration', 'Education', 'Agriculture', 'Business', 'Architecture']

For STEM:

List index   Degree
0            Psychology
1            Biology
2            Math and Statistics
3            Physical Sciences
4            Computer Science
5            Engineering

For Liberal Arts:

List index   Degree
0            Foreign Languages
1            English
2            Communications and Journalism
3            Art and Performance
4            Social Sciences and History

For Other:

List index   Degree
0            Health Professions
1            Public Administration
2            Education
3            Agriculture
4            Business
5            Architecture

STEM (First Column)

For the STEM category, I needed to place the subplots at indices 0, 3, 6, 9, 12, and 15.
Here’s how I was supposed to match them:

List index   Subplot index
0            0
1            3
2            6
3            9
4            12
5            15

Hence, I needed to loop over the numbers 0 to 15 in steps of three. Since the end-point of a range is non-inclusive, the range I needed was:

range(0, 16, 3)

The range function tells the computer to start at zero (0) and add 3 on each iteration, stopping at 15 (the last value before 16). It cycles through the numbers 0, 3, 6, 9, 12, and 15, which are exactly the subplot indices I want to draw my STEM plots in.

Next, I had to find a way to access the category list indices correctly. For STEM, since this was the first column, it was fairly straightforward: I just needed to divide the subplot index by three. That means that as I looped through range(0, 16, 3) I would also get the numbers [0, 1, 2, 3, 4, 5], which are the indices for the degrees in our STEM list.

Liberal Arts (Second Column)

For the liberal arts, this was the mapping for the list index and the subplot index:

List index   Subplot index
0            1
1            4
2            7
3            10
4            13

This means that I needed to iterate through the subplot indices [1, 4, 7, 10, 13], so I used:

range(1, 14, 3)

Again, the loop will start at one (1) and add 3 on each iteration until it reaches thirteen (13), which is the last value before our end-point of fourteen (14).

For the category list index, I needed to figure out how to transform [1, 4, 7, 10, 13] into [0, 1, 2, 3, 4], and I just used the following formula:

(subplot_index - 1) / 3

Other (Third Column)

For the other degrees, I used the same thought process:

List index   Subplot index
0            2
1            5
2            8
3            11
4            14
5            17

This means that I needed to iterate through the subplot indices [2, 5, 8, 11, 14, 17] in order to draw the subplots at the correct and intended positions. The range function and its corresponding arguments are as follows:

range(2, 18, 3)

The loop will start at two (2) and add 3 on each iteration until it reaches seventeen (17), which is the last value before our non-inclusive end-point of eighteen (18).

For the category list index, I needed to figure out how to transform the subplot indices [2, 5, 8, 11, 14, 17] into the category list indices [0, 1, 2, 3, 4, 5], and I just used the following formula:

(subplot_index - 2) / 3
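The three formulas follow one pattern: subtract the 0-based column number, then divide by 3. Here is a minimal sketch of that pattern. The function `column_pairs` and its parameters are names I made up for illustration, and it uses `//` (floor division) so the result is already an integer:

```python
def column_pairs(col, n_cats, ncols=3):
    """Return (subplot_index, list_index) pairs for one column, both 0-based."""
    pairs = []
    # Start at the column number and step by the number of columns.
    for sp in range(col, col + n_cats * ncols, ncols):
        pairs.append((sp, (sp - col) // ncols))
    return pairs

print(column_pairs(0, 6))  # STEM: 6 categories in column 0
print(column_pairs(1, 5))  # Liberal Arts: 5 categories in column 1
print(column_pairs(2, 6))  # Other: 6 categories in column 2
```

With these arguments the end-points work out to 18, 16 and 20, the same ones used in the ranges from the original question.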

Other Remarks

The computations for transforming the subplot indices to category list indices will return floats, which is why we wrap those with the int() function.
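A quick check of that point, assuming Python 3 (the variable names are only for illustration). A float cannot be used as a list index, and floor division with `//` is an alternative to wrapping the result in int():

```python
sp = 7  # a 0-based subplot index from the second column

as_float = (sp - 1) / 3        # true division always returns a float
as_int = int((sp - 1) / 3)     # the approach used in this thread
floor_div = (sp - 1) // 3      # floor division keeps it an int

cats = ['a', 'b', 'c']
try:
    cats[as_float]             # list indices must be integers
    float_index_ok = True
except TypeError:
    float_index_ok = False

print(as_float, as_int, floor_div, float_index_ok)  # 2.0 2 2 False
```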

Thinking about it, there are indeed other ways to build the loops. I can imagine looping through the lists themselves. For example, I could loop through the STEM category list like this (I haven’t tested this code, but I think it should work):

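# assumes fig, women_degrees and cb_dark_blue are already defined as in the project
for i, cat in enumerate(stem_cats):
    subplot_index = i * 3  # 0-based grid index: 0, 3, 6, 9, 12, 15
    # add_subplot() positions are 1-based, so add 1
    ax = fig.add_subplot(6, 3, subplot_index + 1)
    ax.plot(women_degrees['Year'], women_degrees[cat], c=cb_dark_blue, label='Women', linewidth=3)
    # more visualization code here
    ax.set_title(cat)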

OR for the Liberal Arts category:

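# assumes fig, women_degrees and cb_dark_blue are already defined as in the project
for i, cat in enumerate(lib_arts_cats):
    subplot_index = (i * 3) + 1  # 0-based grid index: 1, 4, 7, 10, 13
    # add_subplot() positions are 1-based, so add 1
    ax = fig.add_subplot(6, 3, subplot_index + 1)
    ax.plot(women_degrees['Year'], women_degrees[cat], c=cb_dark_blue, label='Women', linewidth=3)
    # more visualization code here
    ax.set_title(cat)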
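The same idea can cover all three columns in one pass. This is only a sketch of the index arithmetic with no plotting; `layout` is an illustrative name, and the category lists are copied from earlier in this post:

```python
stem_cats = ['Psychology', 'Biology', 'Math and Statistics',
             'Physical Sciences', 'Computer Science', 'Engineering']
lib_arts_cats = ['Foreign Languages', 'English', 'Communications and Journalism',
                 'Art and Performance', 'Social Sciences and History']
other_cats = ['Health Professions', 'Public Administration', 'Education',
              'Agriculture', 'Business', 'Architecture']

layout = {}  # maps 1-based add_subplot() position -> category name
for col, cats in enumerate([stem_cats, lib_arts_cats, other_cats]):
    for row, cat in enumerate(cats):
        position = row * 3 + col + 1  # row-major numbering, 1-based
        layout[position] = cat

for position in sorted(layout):
    print(position, layout[position])
```

Position 17 never appears in `layout` because the Liberal Arts list has only five categories, so the bottom cell of the middle column stays empty.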

To wrap it all up, there are many ways to go about it, but we should try to understand the structure of the grid of subplots in terms of indices and positions and work our way backwards. For this particular task, we needed to understand that there were two positional values (list indices and subplot indices) that we had access to and needed to keep track of. Hope this helps!

Thank you so much for the breakdown! It’s so helpful.

Thank you for the breakdown! It’s so helpful.