# Code mistake and question on randomizing

My Code:
``````
# Brought along the changes we made to the `dc_listings` Dataframe.
stripped_commas = dc_listings['price'].str.replace(',', '')
stripped_dollars = stripped_commas.str.replace('\$', '')
dc_listings['price'] = stripped_dollars.astype('float')
dc_listings = dc_listings.loc[np.random.permutation(len(dc_listings))]

def predict_price(new_listing):
    temp_df = dc_listings.copy()
    ## Complete the function.
    temp_df["distance"]=temp_df["accommodates"].apply(lambda x:np.abs(x-new_listing))
    temp_df=temp_df.sort_values("distance")
    predicted_price=nearest_neighbors.mean()
    return(predict_price)

acc_one = predict_price(1)
acc_two = predict_price(2)
acc_four = predict_price(4)

print(acc_one)
print(acc_two)
print(acc_four)
``````

What I expected to happen:

What actually happened:
It says `acc_two` is not defined. Same for the others.

Also, why do we use `loc` in
`dc_listings = dc_listings.loc[np.random.permutation(len(dc_listings))]`?
What would happen if we didn't use `loc`? It seems that `iloc` also works. What's the difference, and why does that also work?

What was the point of `np.random.seed(1)`? Was it ever used? Could we have used 3 instead of 1?


Your code is correct, but it seems like the platform expects the mean price to be assigned to a variable named `mean_price`. You can check the variable inspector and see that it doesn't list the `acc_one`, `acc_two`, `acc_four` variables. Hence the error. So, use `mean_price` instead of `predicted_price` in your code and check whether it works.

We use `loc[]` to return a new DataFrame whose rows are in the shuffled order.
`loc[]` is a label-based selection method, which means we pass the labels (names) of the rows or columns we want to select.
`iloc[]` is an integer-position-based selection method, which means we pass integer positions to select specific rows or columns.

We can use whichever of the two suits our requirement. Both methods work here because the index holds integer labels running from 0 to `len(dc_listings) - 1`, so every value in the permutation is both a valid row label and a valid row position.
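To make the difference concrete, here is a small sketch on a made-up four-row frame (the `toy` frame and its prices are hypothetical, not the real `dc_listings` data). It assumes a recent pandas version, where `loc[]` raises a `KeyError` for labels that are missing from the index.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for dc_listings (default index 0..3).
toy = pd.DataFrame({"price": [100.0, 80.0, 120.0, 60.0]})

np.random.seed(1)
order = np.random.permutation(len(toy))

# With the default index, labels and positions coincide, so both selections agree.
by_label = toy.loc[order]
by_position = toy.iloc[order]
same_on_default_index = by_label.equals(by_position)

# Re-label the rows as 10..13: positions are unchanged, but the labels 0..3 are gone.
relabelled = toy.set_index(pd.Index([10, 11, 12, 13]))
iloc_still_works = (
    relabelled.iloc[order]["price"].tolist() == by_position["price"].tolist()
)
try:
    relabelled.loc[order]
    loc_raised = False
except KeyError:
    # .loc looked for the labels 0..3 and could not find them.
    loc_raised = True

print(same_on_default_index, iloc_still_works, loc_raised)
```

So `iloc[]` is the safer choice when you mean "row number", and `loc[]` when you mean "row label"; on a freshly loaded frame with the default index the two happen to line up.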

The `seed()` function is used to initialize the random number generator.
The random number generator we use here needs a number to start with (a seed value) before it can generate random numbers.
The DQ platform expects us to pass 1 for answer-checking purposes.
Calling `seed(1)` here makes `np.random.permutation(len(dc_listings))` generate the same array of shuffled numbers as the one used by DQ. Again, this is to validate the answer.

Hope it's clear now.
Thanks.

Ok.
So on the same page they have `print(dc_listings[dc_listings["distance"] == 0]["accommodates"])`.
When I try `print(dc_listings.loc[dc_listings["distance"] == 0]["accommodates"])` I get the same thing. Why is that? Could it be that a Series is just a trivial DataFrame?
However, `dc_listings = dc_listings.loc[np.random.permutation(len(dc_listings))]` needs the `loc`.

For the seed method, what would a number besides 1 mean?

`dc_listings` doesn't have a column named `distance` in it.

`np.random.permutation(len(dc_listings))` will return a permuted range. Each time you run this piece of code you'll get a different sequence.
`seed()` will seed the generator. When you call `seed()` before using `permutation()`, the permuted sequence will be the same every time you run the code below:

``````
np.random.seed(3)
np.random.permutation(len(dc_listings))
``````

You can pass any number to `seed()`. The number you pass will be the initial value used by the pseudorandom number generator.
When you use `1` in the `seed()` method, the sequence generated is the same as the one used by DQ.
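If you want to see this for yourself, here is a quick sketch. It uses a plain range of 10 instead of `len(dc_listings)` so it runs without the dataset; that substitution is only for illustration.

```python
import numpy as np

np.random.seed(1)
first = np.random.permutation(10)

np.random.seed(1)
second = np.random.permutation(10)   # same seed, so the same order as `first`

np.random.seed(3)
third = np.random.permutation(10)    # a valid shuffle too, generally in another order

print(np.array_equal(first, second))   # the two seed(1) runs match
print(sorted(third.tolist()))          # still the numbers 0..9, just reordered
```

Any seed gives a reproducible shuffle; the only thing special about 1 is that it reproduces the shuffle the answer checker was built with.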

Your return statement says `predict_price` instead of `predicted_price`. Can you check and confirm?
I cannot pinpoint any other error in the code.
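For reference, here is a rough sketch of how the function could look with the return name fixed. Two things in it are my assumptions rather than something taken from your post: the posted code uses `nearest_neighbors` without defining it, so I am guessing it should be the `price` values of the five closest rows (`k=5` is a hypothetical parameter), and `listings` is a tiny made-up frame standing in for `dc_listings`.

```python
import pandas as pd

# Hypothetical stand-in for dc_listings; the real data has many more rows and columns.
listings = pd.DataFrame({
    "accommodates": [1, 2, 2, 4, 4, 6],
    "price": [50.0, 80.0, 90.0, 150.0, 170.0, 300.0],
})

def predict_price(new_listing, k=5):
    temp_df = listings.copy()
    # Distance between each listing's capacity and the one we want to price.
    temp_df["distance"] = (temp_df["accommodates"] - new_listing).abs()
    temp_df = temp_df.sort_values("distance")
    # Assumption: average the prices of the k closest listings.
    nearest_neighbors = temp_df.iloc[:k]["price"]
    predicted_price = nearest_neighbors.mean()
    # Return the computed value, not the function object `predict_price`.
    return predicted_price

acc_one = predict_price(1)
acc_four = predict_price(4)
print(acc_one)
print(acc_four)
```

With `return(predict_price)` the function hands back itself instead of a number, so it is worth fixing even if the platform's error message points somewhere else.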