Exploring eBay Car Sales Data

Hi all,
I am attempting the 3rd project and have been getting puzzled and frustrated. For some reason my command:

autos['price'] = autos[autos['price'].between(10, 350000)]

which is meant to remove outliers from the price column, is copying the first column of data (dates) into the price column instead of doing what I want it to. This command is the first time I have assigned anything new to the price column (apart from removing the $'s and ,'s from the prices, which works fine), and my later odometer alterations all go ahead fine.

I am genuinely baffled. Am I being incredibly stupid? Please help!

Hi @thejacksmall,

Welcome to the Community!

Don’t worry, it’s OK to be confused by code sometimes. Please try this code instead:

autos = autos.loc[autos['price'].between(10, 350000), :]

Life saver :slight_smile:
I am obviously brand new to coding. If you have the time, could I get a brief explanation of why your code works whereas mine doesn't?
Thanks!

Sure :slightly_smiling_face: But before that, I think you can write the same piece of code that I sent you in an even simpler way:

autos = autos[autos['price'].between(10, 350000)]

It does the same as my code above, except that here we avoid loc (a boolean mask inside plain square brackets already selects rows, so loc is not needed in this case).
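If you want to convince yourself that the two versions really give the same result, here is a small sketch. The tiny dataframe is made up for illustration (the column names and values are my assumption, not your real data):

```python
import pandas as pd

# Hypothetical stand-in for the real autos dataframe.
autos = pd.DataFrame({
    "date_crawled": ["2016-03-26", "2016-04-04", "2016-03-12", "2016-04-01"],
    "price": [5000, 0, 999999, 1200],
})

# True for rows whose price lies in the accepted range, False otherwise.
in_range = autos["price"].between(10, 350000)

with_loc = autos.loc[in_range, :]   # explicit: these rows, all columns
without_loc = autos[in_range]       # shorthand: a boolean mask selects rows

# Both selections contain the same rows and columns.
print(with_loc.equals(without_loc))
```

Both forms keep only the rows where the mask is True; loc mainly becomes useful when you also want to pick specific columns at the same time.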

Now about this code and what exactly it does. Your task here is to re-assign to your dataframe the result of selecting only those rows where the price is between 10 and 350000. That is what between() is for: it returns True for every value inside the range and False for every value outside it. Look, for example, at this short article, in particular Example #1 (just ignore the argument inclusive = True, since the bounds are already inclusive by default). They divided their code into 2 steps: creating a mask (i.e. what exactly they are going to filter) and applying this mask to the dataframe. In my piece of code, these 2 steps go together. Also, in my code we re-assign the result of applying the mask back to our dataframe, i.e. to autos. Your original line, by contrast, assigned that whole filtered dataframe (all of its columns) to the single price column, which is why it did not do what you wanted.
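Here are the 2 steps written out separately, as a rough sketch on a made-up dataframe (the column names and values are assumptions for illustration only). It also shows that the filtered result is a whole dataframe rather than a single column. As far as I understand, assigning a whole dataframe to autos['price'] is what goes wrong in your line: some pandas versions seem to take just the first column, which would match what you saw, while newer ones may raise an error instead.

```python
import pandas as pd

# Hypothetical stand-in for the real autos dataframe.
autos = pd.DataFrame({
    "date_crawled": ["2016-03-26", "2016-04-04", "2016-03-12", "2016-04-01"],
    "price": [5000, 0, 999999, 10],
})

# Step 1: create the mask, one True/False value per row.
# Both bounds are included, so a price of exactly 10 is kept.
mask = autos["price"].between(10, 350000)

# Step 2: apply the mask; this keeps the matching rows with ALL their columns.
filtered = autos[mask]

print(type(mask).__name__)       # a single column of booleans
print(type(filtered).__name__)   # a full dataframe, not just the price column

# Re-assign the filtered rows back to the dataframe itself.
autos = filtered
```

Because the filtered result still has every column, it belongs on the left-hand side as autos, not as autos['price'].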