Why do we need the .copy() method here? Does it have an actual effect on the code?

import re

hn_sql = hn[hn["title"].str.contains(r"\w+SQL", flags=re.I)].copy()
hn_sql["flavor"] = hn_sql["title"].str.extract(r"(\w+SQL)", flags=re.I)
hn_sql["flavor"] = hn_sql["flavor"].str.lower()
sql_pivot = hn_sql.pivot_table(index="flavor", values="num_comments", aggfunc="mean")

hn is the original DataFrame, and it isn't affected by hn_sql even when I leave out .copy() and change some of the values in hn_sql.


Hello @hqiantan06, welcome back to the Community.

copy() makes a copy of an object's indices and data. We copy the data to hn_sql so that any modification to hn_sql will not modify the original DataFrame (hn).
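Here is a minimal sketch of that idea. It uses a small made-up DataFrame as a stand-in for hn (the rows are invented for illustration, not taken from the real dataset):

```python
import re

import pandas as pd

# Hypothetical stand-in for the hn DataFrame; these rows are invented.
hn = pd.DataFrame({
    "title": ["Why NoSQL matters", "Learning MySQL", "Python tips"],
    "num_comments": [10, 4, 7],
})

# Select the rows whose title mentions some flavor of SQL, as an explicit copy.
mask = hn["title"].str.contains(r"\w+SQL", flags=re.I)
hn_sql = hn[mask].copy()

# Modify the copy: add a column and overwrite one value.
hn_sql["flavor"] = (
    hn_sql["title"].str.extract(r"(\w+SQL)", flags=re.I, expand=False).str.lower()
)
hn_sql.loc[hn_sql.index[0], "num_comments"] = 999

# The original DataFrame has no "flavor" column and keeps its values.
print(list(hn.columns))
print(hn["num_comments"].tolist())
```

Note that expand=False is used here so that str.extract returns a Series; that is a choice made for this sketch, not something from the original code.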



In addition to @info.victoromondi’s answer, .copy() can help avoid the SettingWithCopyWarning.
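As a rough illustration, the sketch below records any warnings raised while building hn_sql with .copy() and adding a column to it. The DataFrame is again a made-up stand-in for hn, and the check matches on the warning's class name, which is an assumption made so the snippet also runs on pandas versions that no longer ship this warning:

```python
import re
import warnings

import pandas as pd

# Hypothetical stand-in for the hn DataFrame; these rows are invented.
hn = pd.DataFrame({
    "title": ["Why NoSQL matters", "Learning MySQL", "Python tips"],
    "num_comments": [10, 4, 7],
})
mask = hn["title"].str.contains(r"\w+SQL", flags=re.I)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Explicit copy first, then assign a new column to the copy.
    hn_sql = hn[mask].copy()
    hn_sql["flavor"] = hn_sql["title"].str.extract(
        r"(\w+SQL)", flags=re.I, expand=False
    )

# Keep only warnings whose class name mentions SettingWithCopy.
copy_warnings = [w for w in caught if "SettingWithCopy" in w.category.__name__]
print(len(copy_warnings))
```

If you drop the .copy() call, older pandas versions will typically emit the warning on the assignment line, because pandas cannot tell whether you meant to change hn or only the filtered subset. Whether you actually see it depends on your pandas version and settings.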

This was covered in the content in some detail, but this blog post is also pretty helpful - https://www.dataquest.io/blog/settingwithcopywarning/

Also, for any future questions, please make sure to also include the link to the Mission/Mission Step you are referring to.
