Hi @princefame1
You are getting this error because the string '3.0M' can’t be converted to a float. To resolve the problem, I removed the row containing this review value when I was cleaning the data, because that row has missing data.
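To make the cause concrete, here is a minimal sketch of the failing conversion. The sample values and the helper name `to_float_or_none` are made up for illustration and are not from the project's solution code; the sketch only shows that `float()` raises `ValueError` on a string such as '3.0M'.

```python
# Hypothetical sample values standing in for the 'Reviews' column.
reviews = ['159', '967', '3.0M']

def to_float_or_none(value):
    """Return float(value), or None when the string is not numeric."""
    try:
        return float(value)
    except ValueError:
        # float('3.0M') ends up here: the trailing 'M' is not part of a number.
        return None

converted = [to_float_or_none(v) for v in reviews]
print(converted)
```

Any row that comes back as None is one that needs to be cleaned or removed before the real conversion.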
Hi,
I’m coming across the same issue. I did delete the row as instructed in slide 3, and my code matches the solution code posted on GitHub. However, the same error still appears.
I’ve refreshed my screen and restarted and re-run the kernel, but I’m still coming across the same issue. I’ve also re-run the code in a fresh Python notebook to make sure that it’s not caused by an error in the earlier lines of code, but I still get the same error.
I am trying to delete the row that has bad data, i.e. row number 10472. However, even after I drop that row, I still see the same record. Am I missing something here?
How do I delete this row from the data set? Thanks for your help.
Hi @vkalyanraman. Try either data = data.drop(data.index[10472]) or data.drop(data.index[10472], inplace=True) so that the results are saved back to the dataframe.
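Here is a small sketch of the difference, assuming pandas is installed. The three-row frame is a hypothetical stand-in for the real dataset, so the index used is 1 rather than 10472.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the real dataset.
data = pd.DataFrame({'App': ['a', 'b', 'c'],
                     'Reviews': ['159', '3.0M', '967']})

# The result is discarded here, so data itself is unchanged.
data.drop(data.index[1])
rows_after_discard = len(data)

# Saving the result back to the same variable makes the change stick.
data = data.drop(data.index[1])
rows_after_assign = len(data)

print(rows_after_discard, rows_after_assign)
```

The first call is the situation where the record still shows up after the drop.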
Why do you have to assign the result back to the dataframe variable? I thought that when you call a method on an object, you are actually changing the object. I don’t understand why the reference needs to be updated. Is this how it works in Python? Say you delete an element from a list: do you still have to assign the result to a reference for it to take effect?
DataFrame.drop() by default (inplace=False) returns a new copy with the operation performed. It doesn’t change the original object unless we set inplace=True; otherwise we have to save the returned copy back to the same variable. I found this article to be an interesting read that goes more in depth:
This thread on Stack Overflow had some discussion about whether or not to use the inplace parameter in case you were interested in that as well:
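On the list question specifically, here is a plain-Python sketch (the variable names are just for illustration). Deleting from a list does change the list itself, so no reassignment is needed there. The need to reassign comes from calls that return a new object, which is what DataFrame.drop() does by default.

```python
numbers = [3, 1, 2]

# Mutating operations change the list itself; nothing needs to be reassigned.
numbers.sort()
del numbers[0]

# sorted() builds a new list and leaves the original alone, much like
# DataFrame.drop() with inplace=False returns a new DataFrame.
original = [3, 1, 2]
new_list = sorted(original)

print(numbers, original, new_list)
```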
Also, the reference material suggested using “del data[10472]”, and it gave an error in my code. I tried del data[10472:10473] and that did not work either. Is this failing for the same reason?
I think it has to do with the nature of the pandas library. In the original project we read the dataset as a list of lists (not as dataframe objects), and del works just fine on lists and dictionaries (here are some examples). del can be used to delete a column in a dataframe, but I haven’t seen it used for rows. I found this thread on Stack Overflow that explained it this way:
You can’t remove a row with del as rows returned by .loc or .iloc are copies of the DataFrame, so deleting them would have no effect to your actual data.
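A short sketch of the three cases side by side, assuming pandas is installed. The rows and the 'Extra' column are made up for the example.

```python
import pandas as pd

# List of lists, as in the original project (hypothetical rows).
rows = [['App', 'Reviews'], ['a', '159'], ['b', '3.0M']]
del rows[2]  # removes the row from the list in place

# DataFrame: del removes a column, while rows are removed with drop().
df = pd.DataFrame({'App': ['a', 'b'],
                   'Reviews': ['159', '3.0M'],
                   'Extra': [0, 1]})
del df['Extra']
df = df.drop(df.index[1])

print(rows)
print(df)
```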
I know this has been discussed, but can I just keep a del line in my code, so that the row is deleted every time the code is executed while it stays present in the original file?
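As far as I can tell, that works when the data is read into a list of lists, because del only touches the copy in memory. Here is a minimal sketch that writes its own small CSV to a temporary folder so it can be run anywhere; the file name and the helper `read_rows` are hypothetical, not part of the project.

```python
import csv
import os
import tempfile

# Write a small hypothetical CSV so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), 'sample.csv')
with open(path, 'w', newline='') as f:
    csv.writer(f).writerows([['App', 'Reviews'],
                             ['a', '159'],
                             ['b', '3.0M']])

def read_rows(csv_path):
    """Read the whole CSV file into a list of lists."""
    with open(csv_path, newline='') as f:
        return list(csv.reader(f))

rows = read_rows(path)
del rows[2]  # only the in-memory list changes

rows_in_memory = len(rows)
rows_on_disk = len(read_rows(path))
print(rows_in_memory, rows_on_disk)
```

One thing to watch in a notebook: if the cell with del is re-run without re-reading the file first, it deletes whatever row now sits at that index, so it is safer to keep the read and the del in the same cell.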
Hi @Prem, I am getting the same error as above. I have deleted the android data set row ‘10473’ which contains the value ‘3M’ under the ‘Reviews’ column. And after that, added the code to create a new dictionary as given in the solution. Could you please let me know why I am running into this problem? I have attached the notebook below. Thanks, much appreciated! Data_Analysis_Project.ipynb (11.1 KB)