eBay Car Sales Data guided project: `SettingWithCopyWarning` when mapping column values from German to English

Screen Link:

My Code:

1. autos.loc[:,"seller"].replace({"privat":"private"}, inplace = True)

2. autos.loc[:,"seller"] = autos.loc[:,"seller"].map({"privat":"private"})

What I expected to happen:

I am trying to change the column `seller` in the 'Ebay Car Sales Data'. The column `seller` has rows with the German value 'privat', and I want to change it to the English 'private'.

What actually happened:

It changes the row values from 'privat' to 'private' in the column `seller`. However, it raises a warning:

/dataquest/system/env/python3/lib/python3.8/site-packages/pandas/core/indexing.py:966: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  self.obj[item] = s

I tried both 1. Series.replace() and 2. Series.map(), and I got the warning either way.
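For reference, the two approaches differ in how they treat values that are missing from the mapping. A minimal sketch, assuming a small made-up `seller` column (the name `autos_demo` and the sample values are illustrative, not taken from the project data):

```python
import pandas as pd

# Hypothetical sample data standing in for the real `seller` column.
autos_demo = pd.DataFrame({"seller": ["privat", "gewerblich", "privat"]})
mapping = {"privat": "private"}

# Series.replace leaves values that are not in the mapping untouched.
replaced = autos_demo["seller"].replace(mapping)

# Series.map turns values that are not in the mapping into NaN.
mapped = autos_demo["seller"].map(mapping)

print(replaced.tolist())
print(mapped.isna().tolist())

# Assigning the result back to the column avoids relying on
# inplace=True applied to a .loc selection.
autos_demo["seller"] = replaced
```

If the column holds values other than 'privat', `replace` is probably the safer of the two unless the dictionary passed to `map` covers every value.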

Thank You for your Guidance
Asim


Hi @asim.choudhury,

You can try adding .copy() to the end of your second line of code, e.g. autos.loc[:,"seller"] = autos.loc[:,"seller"].map({"privat":"private"}).copy().

Also consider giving the following a read:

The above explains in detail what the warning is about, why your code can still run even though you have a warning (in contrast to an error), and how to fix it.

If .copy() doesn’t work, share your current project in your post or a reply, and I or someone else in the forum can take a closer look. Here’s a guide on how to do so:

–––

On another note, you can also give your post a more specific subcategory. Under Q&A, you can assign the DQ Courses subcategory which fits your question. Here’s how to recategorize a post:

autos.loc[:,"seller"] = autos.loc[:,"seller"].map({"privat":"private"}).copy().

I tried the method by putting .copy() at the end but I am still getting the warning.

I have now created a new post sharing my project under the Projects → Share category.

Thank You for all the guidance.


I see.

It’s probably some code earlier in the project that causes the warning. I’ll have a look at your project and I’ll see if it’s something I can help with.


@asim.choudhury,

I had a look at your code and the following is the likely culprit:

#Removing rows with price less than $100.00 and 
#more than $10,000,000 .
autos = autos[autos["price"].between(101, 9999999)]
autos["price"].value_counts().sort_index(ascending = False)

As mentioned in the guide I linked to earlier, when you assign the result of a single get operation (e.g. data frame indexing) to a variable, you’ll need to use .copy(); otherwise there’s hidden chaining when you later assign values to it. In this case, internally, autos is not just autos: pandas still tracks it as derived from autos[autos["price"].between(101, 9999999)], so the following code, for example:

autos.loc[:,"seller"] = autos.loc[:,"seller"].map(map_seller)

is actually something like this:

autos[autos["price"].between(101, 9999999)].loc[:,"seller"] = autos.loc[:,"seller"].map(map_seller)

The above is an instance of chaining: the first operation is autos[...] and the second is .loc[...]. The result of an indexing operation can be either a copy or a view of the original data, and pandas can’t always tell which. pandas is cautious about values being set on a copy, because the change might silently fail to reach the original, so it warns when you assign to a DataFrame that was derived from another one by indexing. If you actually intended to modify the original, the code needs a different change, which I’m not going to cover here. Otherwise, making an explicit .copy() at the point where autos was assigned from autos[autos["price"].between(101, 9999999)] tells pandas that the new autos is an independent DataFrame.

An example modification:

#Removing rows with price less than $100.00 and 
#more than $10,000,000 .

# MODIFIED
autos = autos[autos["price"].between(101, 9999999)].copy()
autos["price"].value_counts().sort_index(ascending = False)

Try modifying your code as I’ve shown above and see if it fixes the problem.
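The same fix can be sketched end to end on a toy DataFrame. This assumes made-up prices and a hypothetical `map_seller` dictionary (the 'gewerblich' entry is an illustrative guess at a second value). Whether the warning appears without `.copy()` depends on the pandas version, so the sketch only shows the pattern with the copy in place:

```python
import pandas as pd

# Toy data; the values are illustrative only.
raw = pd.DataFrame({
    "price": [50, 500, 12000, 99999999],
    "seller": ["privat", "privat", "gewerblich", "privat"],
})

# Filter, then take an explicit copy so the result is an independent DataFrame.
autos = raw[raw["price"].between(101, 9999999)].copy()

# Hypothetical mapping; both values are listed so map() does not produce NaN.
map_seller = {"privat": "private", "gewerblich": "commercial"}
autos["seller"] = autos["seller"].map(map_seller)

print(autos)
```

Because `autos` is a copy, the assignment changes only the filtered frame and leaves `raw` as it was.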


It did solve the problem after the modification you suggested.

But I don’t fully comprehend what the modification did. So I will read the page that you shared earlier, and any other help I can find online, to understand it.

Thank You so much for your kind help :slight_smile:


Yeah, it’s a tricky one to understand. I don’t understand it that well either, to be honest. And from reading about it online, some experienced programmers also don’t understand it that well (some people just silence the warning and move on) and would prefer that pandas handled the warning a different way.

To be safe, you just need to be aware of when you’re using a pattern like this:

df = df[…], or any method chaining with . such as .loc or .map.

Unless you know what you’re doing, when the warning appears, add a .copy() where the above code pattern first appears, not where the warning starts appearing. (Optionally, skip it if you don’t intend to do any value assignments later; if that’s too much hassle, just use .copy() in all cases.)
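One way to check that the copy is doing its job is to record warnings around the assignment. A sketch, assuming a toy DataFrame with hypothetical names (`df`, `subset`); the warning is matched by class name because where that class lives, and whether it is emitted at all, varies between pandas versions:

```python
import warnings

import pandas as pd

# Toy data; the values are illustrative only.
df = pd.DataFrame({"price": [50, 500, 12000], "seller": ["privat"] * 3})

# Copy at the point where the filtered frame is first created.
subset = df[df["price"] > 100].copy()

# Record every warning raised while assigning to the filtered frame.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    subset["seller"] = subset["seller"].replace({"privat": "private"})

# Keep only warnings whose class is named SettingWithCopyWarning.
copy_warnings = [
    w for w in caught if w.category.__name__ == "SettingWithCopyWarning"
]
print(len(copy_warnings))
```

With the `.copy()` in place, the list of matching warnings should be empty.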


Thanks a ton.

I have just completed the guided project along with all the ‘Next Steps’ and I am going to upload it next for any suggestions or guidance.

Regards
