285-10 Need Help To Understand Below Code

Screen Link: https://app.dataquest.io/m/285/frequency-distributions/10/readability-for-grouped-frequency-tables

Code: gr_freq_table_10 = pd.Series([0 for _ in range(10)], index = intervals)

What I expected to happen: The first class interval starts at 0 (not included).

I need an explanation of this code. How does it ensure that 0 is not included in the first class interval?

The inclusion/exclusion of 0 in the first interval isn’t defined in the line of code that you posted. It is defined in the definition of intervals:

intervals = pd.interval_range(start = 0, end = 600, freq = 100)

The function pandas.interval_range returns an IntervalIndex, which is a list-like object (not a Python list) of intervals.

The documentation for pandas.interval_range shows that the closed parameter is, by default, 'right', which means that the intervals are open at the left endpoint and closed at the right endpoint. That’s why 0 is not included: it is the left endpoint of the first interval, (0, 100].
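If you want to check this yourself, here is a minimal sketch. It reuses the intervals definition from above; the names first and left_closed are only illustrative and are not part of the lesson.

```python
import pandas as pd

# Same call as above; closed is left at its default value, "right".
intervals = pd.interval_range(start=0, end=600, freq=100)

first = intervals[0]       # the first interval, (0, 100]
print(intervals.closed)    # right
print(0 in first)          # False: the left endpoint is excluded
print(100 in first)        # True: the right endpoint is included

# Illustrative variation: closing the intervals on the left instead
# makes 0 part of the first interval, [0, 100).
left_closed = pd.interval_range(start=0, end=600, freq=100, closed="left")
print(0 in left_closed[0])  # True
```

So the inclusion or exclusion of 0 depends only on the closed argument, not on the pd.Series call.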

I hope this helps.

Got it. Thanks.
But then I need to know what the _ means in the code above. Is it like the i in
for i in range(10)?

Exactly! The variable name _ is used when you don’t care about the actual variable.

In this example we want to create a list with ten elements, all of which are zero. For this, it doesn’t matter whether we name the variable i or _: both [0 for i in range(10)] and [0 for _ in range(10)] work, because the expression 0 doesn’t use the variable at all.

In these situations, by convention, we usually use _. Technically it is not necessary; it’s just an agreement in the Python community to use _ when we don’t care about the variable.
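As a small sketch of this point (the variable names zeros_with_i and zeros_with_underscore are just for illustration), both spellings build the same list:

```python
# The loop variable is never used in the expression 0,
# so its name has no effect on the result.
zeros_with_i = [0 for i in range(10)]
zeros_with_underscore = [0 for _ in range(10)]

print(zeros_with_i == zeros_with_underscore)  # True
print(len(zeros_with_underscore))             # 10
```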

Thanks for the clarification @Bruno.

@Bruno, why do we need to use .loc here for the series within the for loop? I thought we could use the shorthand of dropping the .loc when referring to a single item from a series, but I run into an error when I drop it.

for value in points:
    for interval in intervals:
        if value in interval:
            new_series.loc[interval] += 1
            break

Hey, Alex.

Please ask this in a new post; your question is completely different from the one asked here.