# What does the i mean in the loop?

Hi all! I’m a little confused here; the more I learn, the more confused I get. We use a stratified sampling strategy to take samples proportionally, then calculate the proportional sampling means to compare with the sample means. `for i in range(100)` means that we take 10 samples 100 times, correct? So `i` is how many times we take the sample. But what does `random_state=i` mean here?

Thank you!

My Code:

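```python
import pandas as pd
import matplotlib.pyplot as plt

# wnba is assumed to be already loaded as a DataFrame

# Share of players (in %) in each of three equal-width 'Games Played' bins
wnba['Games Played'].value_counts(bins=3, normalize=True) * 100

# Split the data into three strata by games played
groupover23 = wnba[wnba['Games Played'] > 22]
group13_22 = wnba[(wnba['Games Played'] > 12) & (wnba['Games Played'] <= 22)]
groupless12 = wnba[wnba['Games Played'] <= 12]

proportional_sampling_means = []
for i in range(100):
    # Draw 7 + 2 + 1 = 10 values per iteration, seeded with the loop counter i
    sample_over_23 = groupover23['PTS'].sample(7, random_state=i)
    sample_btw_13_22 = group13_22['PTS'].sample(2, random_state=i)
    sample_under_12 = groupless12['PTS'].sample(1, random_state=i)
    final_sample = pd.concat([sample_under_12, sample_over_23, sample_btw_13_22])
    proportional_sampling_means.append(final_sample.mean())

# Compare the 100 sample means with the mean of the whole data set
plt.scatter(range(1, 101), proportional_sampling_means)
plt.axhline(wnba['PTS'].mean())
```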

You have asked a similar question before - When to use random_state=0

To which you got the following responses (I added the highlights to point out the core of the response) -

We can use `random_state` to reproduce the same output every time. Here, in the context of sampling, we are using `pd.DataFrame.sample`, which returns a random sample of items from a data frame. (Without setting `random_state`, it will return a different sample every time.)

But if we want to reproduce the same output each time, say for testing purposes, then we make use of `random_state`.

and

With `random_state=123` (we can set `random_state` to any integer).

@DishinGoyani also provided you with code examples on what happens with and without `random_state`.
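As a rough illustration of that behaviour (the `points` Series below is made-up data, not the `wnba` data set), a fixed integer reproduces the same sample, while leaving `random_state` unset draws a fresh sample on each call:

```python
import pandas as pd

# Hypothetical stand-in for a column such as wnba['PTS']
points = pd.Series(range(100, 200))

# Same integer seed: the two samples are identical
first = points.sample(5, random_state=123)
second = points.sample(5, random_state=123)
print(first.equals(second))  # True

# No seed (random_state=None is the default): the draw changes from call to call
print(points.sample(5).tolist())
print(points.sample(5).tolist())
```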

Based on the above, try to think through the following -

• For each iteration of your `for` loop, what would `i` be.
• What exactly happens when you set `random_state` to a specific value.

Based on that we can continue forward with where you might be getting stuck conceptually. But first, try to think of the above two points and share your response.


From my understanding, what matters for `random_state` is whether it is `None` or an integer. If we use `None`, the sample we generate is different every time. If we use the same integer every time, we will get the same sample every time. It does not matter which integer we chose at the very beginning.

For example, we need to generate samples 100 times. If we do `random_state=456`, then we have to use `456` all 100 times to generate the samples, and the samples will be exactly the same. We can’t use `random_state=1` the first 10 times and then `random_state=456` the 11th time. Correct?

• For each iteration of your `for` loop, what would `i` be. Answer: `i` is the number of times we take a sample from 100?
• What exactly happens when you set `random_state` to a specific value. Answer: if we set a specific value every time, the sample will be exactly the same as last time, correct?

On `None` versus an integer: that’s absolutely correct.

On having to use `456` all 100 times: not quite correct. You don’t have to use one specific number throughout those 100 times. You can choose a different one each time. It depends on what you are trying to do with those samples, and that’s what this exercise is doing.

On what `i` means: no. `i` is the loop counter. It takes the values 0 through 99, one per iteration, so we are sampling our data 100 times. We are not taking a sample from 100; we are taking samples from our data set 100 times.

On setting a specific value every time: yes, that’s correct.

Now, for each iteration of our `for` loop, we set `random_state` to `i`. So, for each iteration of our loop, that is, each time we sample from our data set, the three `.sample()` calls in your code (for `sample_over_23`, `sample_btw_13_22` and `sample_under_12`) all receive `random_state=i`.

Those calls get us samples from our data. And because the value of `random_state` is different for each iteration, each value of `i` gives us different samples.

If we run our code twice, the samples chosen in, let’s say, the 15th iteration would be the same in both runs, because the `random_state` for the 15th iteration is the same and therefore generates the same samples.

So, if I were to take your code and run it myself, I would get the exact same results (within the DQ platform). This helps with reproducibility of results (and helps DQ check our work through their grader).
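A small sketch of both points, using made-up numbers in place of the `wnba` data (the `scores` Series and the `run_loop` helper are only for illustration):

```python
import pandas as pd

# Hypothetical data standing in for one stratum's 'PTS' column
scores = pd.Series(range(1, 51))

def run_loop(n_iterations=100):
    """Sample 3 values per iteration, seeding each draw with the loop counter i."""
    samples = []
    for i in range(n_iterations):  # i takes the values 0, 1, ..., n_iterations - 1
        samples.append(scores.sample(3, random_state=i).tolist())
    return samples

run_a = run_loop()
run_b = run_loop()

# Running the whole loop twice gives the same sample at every iteration
print(run_a == run_b)  # True

# Within one run, count how many distinct samples the different seeds produced
print(len({tuple(s) for s in run_a}))
```

Seeding with `i` rather than a single fixed number keeps each iteration repeatable while still letting the iterations differ from one another.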

Hopefully, now it starts to make better sense.


Completely got it now! Thank you!!

Hello @the_doctor,

If I didn’t misunderstand your idea, then with this code:

```python
under_12 = wnba[wnba['Games Played'] <= 12]
for i in range(100):
    sample_under_12 = under_12['PTS'].sample(1, random_state=i)
```

I expected to see 100 numbers (because we sample 100 times), but when I ran it, I received only 1 value.

Can you please explain this for me?

You are sampling 100 times, but you are storing the sample for each iteration in the same variable. So, after each iteration, the value stored in the variable `sample_under_12` gets replaced with the new sample, and only the last one is left when the loop ends.
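One way to keep all 100 samples, sketched here with a made-up `under_12` DataFrame rather than the real `wnba` subset, is to append each one to a list created before the loop:

```python
import pandas as pd

# Hypothetical stand-in for wnba[wnba['Games Played'] <= 12]
under_12 = pd.DataFrame({'PTS': [10, 25, 31, 47, 52, 68, 74, 89]})

samples_under_12 = []  # created once, before the loop
for i in range(100):
    sample = under_12['PTS'].sample(1, random_state=i)
    # sample is a one-element Series; .iloc[0] pulls out the value itself
    samples_under_12.append(sample.iloc[0])

print(len(samples_under_12))  # 100
```

This is the same pattern your first post already uses with `proportional_sampling_means.append(...)`.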