Question about assigning back to the dataframe

I apologize if this is a basic question, but I’ve been working through Guided Project: Exploring Ebay Car Sales Data and have a general question about when to use .loc when assigning a filtered result back to the dataframe.

Example:

`autos = autos[autos['price'].between(0, 150000)]`

vs.

`variable = (autos['registration_year'] > 1900) & (autos['registration_year'] < 2016)`

`autos = autos.loc[variable]`

`autos['registration_year'].describe()`

In the 2nd example, my first attempt was to assign it back the conventional way: autos = autos[variable]. I quickly learned that this turned the whole dataframe into a boolean argument.

My question is: how do you know when to use .loc and when to use the ‘other’ way (I don’t know the correct term)?

Thank you!


variable is a boolean Series, so as far as I can tell autos = autos[variable] should give the same result as autos = autos.loc[variable].

Which one to use usually depends on what you want to achieve. For example, if you only want some of the columns for the rows that satisfy the condition in variable, you can pass variable to .loc to select those rows and, in the same call, name the columns you want. I’ve included a link to the pandas documentation on data selection below if you want to learn more.

Indexing and selecting data — pandas 1.1.1 documentation
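To make the row-plus-column idea above concrete, here is a minimal sketch. The small dataframe is a made-up stand-in for the project's autos data (the values and the `name` column are assumptions for illustration only); the mask is the same one from the original question.

```python
import pandas as pd

# Hypothetical stand-in for the project's autos dataframe (invented values)
autos = pd.DataFrame({
    "name": ["golf", "corsa", "polo", "astra"],
    "price": [5000, 1200, 300000, 8000],
    "registration_year": [2004, 1800, 2010, 2015],
})

# Boolean Series: True for rows whose registration year is in the valid range
variable = (autos["registration_year"] > 1900) & (autos["registration_year"] < 2016)

# .loc accepts a row selector and a column selector in a single call
subset = autos.loc[variable, ["name", "price"]]
print(subset)
```

Plain brackets like autos[variable] only filter rows, so .loc is the convenient choice when you want to restrict rows and columns together.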


Don’t apologize for asking questions.

Even though your question is general, it’s still better to include a link to the Mission/Mission Step so others can refer to it.

What exactly was the output of the following?

autos = autos[variable]

I ask because the above and autos = autos.loc[variable] should produce the same output.
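A quick way to check this yourself is sketched below. The dataframe is an invented stand-in (not the project's actual data), so treat it as an illustration of the comparison rather than a reproduction of the Mission.

```python
import pandas as pd

# Hypothetical stand-in for the project's autos dataframe (invented values)
autos = pd.DataFrame({
    "price": [5000, 1200, 300000, 8000],
    "registration_year": [2004, 1800, 2010, 2015],
})

# The mask itself is a boolean Series, one True/False per row
variable = (autos["registration_year"] > 1900) & (autos["registration_year"] < 2016)

# Filter rows both ways
with_brackets = autos[variable]
with_loc = autos.loc[variable]

# DataFrame.equals compares shape, labels, and values
same = with_brackets.equals(with_loc)
print(same)
```

If printing variable on its own shows a column of True/False values, that is the mask rather than the filtered dataframe. Where .loc tends to matter more is when setting values on a subset, for example autos.loc[variable, 'price'] = some_value, since the pandas docs recommend .loc there over chained indexing.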


@info.victoromondi That cheat sheet is awesome! Thank you for sharing - will keep that handy.

@the_doctor It was less about the specific Mission (I was just using that code as an example). I think I’m a little clearer on it now, and that cheat sheet will help a lot.

Thank you both!
