Communicating Results: Removing Duplicates

Screen Link: https://app.dataquest.io/m/467/communicating-results/4/removing-duplicates

My Code:

paid.sort_values(by="Reviews", ascending=False, inplace=True)
paid.drop_duplicates(subset="App", keep="first", inplace=True, ignore_index=True)
print(paid.duplicated("App").sum())

What I expected to happen:
I expected this to sort paid by the Reviews column, drop the duplicate apps while resetting the index, and then print the number of apps still flagged as duplicates.

What actually happened:

TypeError: drop_duplicates() got an unexpected keyword argument 'ignore_index'

When I looked at the documentation, it indicated that ignore_index=True resets the index. However, drop_duplicates() apparently doesn't recognize that keyword here. I know that pandas 1.0.0 was released at the end of January. Is the ignore_index keyword new?

Edit:
Just checked the January 29, 2020 Release Notes for 1.0.0. Turns out that ignore_index is indeed a new addition. Glad to know I’m not crazy.
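For anyone hitting the same TypeError on a pandas version older than 1.0.0, here is a minimal sketch of one workaround: leave out ignore_index and chain reset_index(drop=True) instead, which should give the same 0..n-1 index. The small paid DataFrame below is made-up sample data standing in for the exercise's dataset, so treat the values as an assumption.

```python
import pandas as pd

# Hypothetical stand-in for the exercise's `paid` DataFrame (values are made up).
paid = pd.DataFrame({
    "App": ["A", "B", "A", "C", "B"],
    "Reviews": [10, 50, 30, 5, 20],
})

# Sort so the most-reviewed row of each app comes first.
paid = paid.sort_values(by="Reviews", ascending=False)

# Drop duplicate apps without ignore_index, then reset the index in a separate step.
# reset_index(drop=True) discards the old index instead of keeping it as a column.
paid = paid.drop_duplicates(subset="App", keep="first").reset_index(drop=True)

print(pd.__version__)                # shows which pandas version is installed
print(paid.duplicated("App").sum())  # number of apps still flagged as duplicates
```

Printing pd.__version__ is just a quick way to confirm whether the installed pandas is 1.0.0 or later, where passing ignore_index=True directly to drop_duplicates() is accepted.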


Hi @ChrisMatsuoka,

Thanks for this question. It made me realize the importance of instruction 4, resetting the index, for this exercise.
I tried checking Stack Overflow. I only looked at a few posts, but they were also different.

You are not crazy! Perhaps Nerdy!!
