I am finding it hard to convert the review column in the guided project to a floating-point number.
Screen Link: https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/5/removing-duplicate-entries-part-two
for app in google_data[1:]:
    name = app
    n_reviews = float(app)
I don’t know what is happening. Python is not able to convert my review to a floating-point number. I need help.
Below is the error message:
ValueError                                Traceback (most recent call last)
      4 for app in google_data[1:]:
      5     name = app
----> 6     n_reviews = float(app)
ValueError: could not convert string to float: '3.0M'
You are getting this error because the string '3.0M' can’t be converted to a float. To resolve the problem, I removed the row containing this review value when I was cleaning the data, because the row is missing data:
['Life Made WI-Fi Touchscreen Photo Frame',
'February 11, 2018',
'4.0 and up']
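A minimal sketch of that cleaning step on a list of lists. The miniature data set and the column positions (name at index 0, reviews at index 2) are made up for illustration and are not the real project columns. It finds rows whose review value float() cannot parse, deletes them, and then converts the rest.

```python
# Hypothetical miniature of the Google Play data: a header row plus three apps.
# Column positions (name at 0, reviews at 2) are assumptions for this sketch.
google_data = [
    ["App", "Rating", "Reviews"],
    ["Photo Editor", "4.1", "159"],
    ["Broken Row", "19", "3.0M"],   # a missing field shifted the values left
    ["Coloring Book", "3.9", "967"],
]


def is_number(text):
    """Return True if float() can parse the string."""
    try:
        float(text)
        return True
    except ValueError:
        return False


# Find rows (skipping the header) whose review count cannot be parsed.
bad_rows = [
    i for i, row in enumerate(google_data)
    if i > 0 and not is_number(row[2])
]

# Delete from the end so the earlier indexes stay valid.
for i in reversed(bad_rows):
    del google_data[i]

# With the bad row gone, the conversion succeeds for every remaining app.
reviews = {row[0]: float(row[2]) for row in google_data[1:]}
```

Printing bad_rows before deleting is a quick way to confirm that the row you are about to remove really is the malformed one.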
Welcome to our Dataquest Community.
You are getting this error because you missed the instructions on slide 3 of Guided Project: Profitable App Profiles for the App Store and Google Play Markets.
@bahmed21 is correct about the instructions in slide 3, “Deleting Wrong Data”.
Thanks, it has been solved. I really appreciate it.
I’m coming across the same issue. I did delete the row as instructed in slide 3. My code also matches the solution code posted on GitHub. However, the same error still appears.
Do one thing: restart the kernel, run all cells, and then check.
I’ve refreshed my screen, restarted the kernel and re-run everything, but I’m still coming across the same issue. I’ve also re-run the code in a fresh Python notebook to make sure that it’s not some error in the earlier lines of code, but I still get the same error.
Can you upload the notebook?
Take a look at the code that deletes the row and check that you are deleting the correct row.
I think you will find your mistake.
I am trying to delete the row that has bad data, i.e. row number 10472. However, even after I drop that row, I still see the same record. Am I missing something here?
How do I delete this row from the data set? Thanks for your help.
Hi @vkalyanraman. Try either
data = data.drop(data.index[10472]) or
data.drop(data.index[10472], inplace=True) so that the results are saved back to the dataframe.
Thanks April! It worked.
I still have a question.
Why do you have to assign the result back to the dataframe variable? I thought that when you call a method on an object, you are actually changing the object. I don’t understand why the reference needs to be updated. Is this how it works in Python? Say you delete an element in a list: do you still have to assign it to a reference for it to take effect?
Thanks for the help and clarification!
DataFrame.drop() by default (inplace=False) returns a new copy with the operation performed. It doesn’t change the object unless we change the inplace parameter or save the copy back to the same variable. I found this article to be an interesting read that goes more in depth:
This thread on Stack Overflow had some discussion about whether or not to use the inplace parameter, in case you are interested in that as well:
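To make the difference concrete, here is a small sketch with a hypothetical three-row frame (the column names and values are invented for illustration). It shows that a plain drop() leaves the original untouched, that reassigning or using inplace=True changes it, and that a list really is modified directly by del.

```python
import pandas as pd

# Hypothetical three-row frame standing in for the app data.
data = pd.DataFrame({"App": ["A", "B", "C"], "Reviews": ["159", "3.0M", "967"]})

# drop() returns a new DataFrame; `data` itself is untouched.
dropped = data.drop(data.index[1])
rows_after_plain_drop = len(data)      # still 3

# Either reassign the result to the same name...
data = data.drop(data.index[1])
rows_after_reassign = len(data)        # now 2

# ...or ask pandas to modify the object in place (the call then returns None).
other = pd.DataFrame({"App": ["A", "B", "C"], "Reviews": ["159", "3.0M", "967"]})
result = other.drop(other.index[1], inplace=True)

# A plain Python list, by contrast, is changed directly by del.
items = [1, 2, 3]
del items[1]
```

So the behaviour depends on the method, not on Python as a whole: some operations mutate the object, others hand back a new one and leave the original alone.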
Thanks a lot April. The article is very helpful.
Also, the reference material suggested using “del data”, and it gave an error in my code. I tried del data[10472:10473] and that did not work either. Is it failing for the same reason?
I think it has to do with the nature of the pandas library. In the original project we read the dataset as a list of lists (not dataframe objects), and del works just fine on lists and dictionaries (here are some examples). del can be used to delete a column in a dataframe, but I haven’t seen that for rows. I found this thread on Stack Overflow that explained it this way:
You can’t remove a row with del as rows returned by .iloc are copies of the DataFrame, so deleting them would have no effect to your actual data.
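For the list-of-lists approach from the original project, del behaves roughly as follows. This is a sketch with made-up rows, not the real data set.

```python
# Hypothetical rows; the first inner list is the header.
rows = [["App", "Reviews"], ["A", "159"], ["B", "3.0M"], ["C", "967"]]

del rows[2]          # remove one row by position; the list itself changes
remaining = [r[0] for r in rows[1:]]

del rows[1:2]        # slices work too: this removes only the row at index 1
after_slice = [r[0] for r in rows[1:]]

counts = {"A": 159, "B": 3, "C": 967}
del counts["B"]      # remove a key from a dictionary
```

Each del changes the list or dictionary directly, so no reassignment is needed here.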
I know this has been discussed, but can I just keep a del line in the notebook so that the row gets deleted every time the code is executed, while it stays present in the original file?
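One way to sketch this is shown below, assuming the data is read from a CSV file into a list of lists. The file name, the BAD_INDEX constant and the drop_bad_row helper are hypothetical names invented for this example. Deleting from the in-memory list does not touch the file on disk. The check before del guards against removing a second, valid row if the cell is run again without reloading the data.

```python
import csv
import os
import tempfile

# Write a tiny hypothetical CSV so the sketch is self-contained.
path = os.path.join(tempfile.mkdtemp(), "apps.csv")
with open(path, "w", newline="", encoding="utf-8") as f:
    csv.writer(f).writerows(
        [["App", "Reviews"], ["A", "159"], ["B", "3.0M"], ["C", "967"]]
    )


def load(path):
    """Read the CSV file into a list of lists."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.reader(f))


data = load(path)

BAD_INDEX = 2  # assumed position of the malformed row in this toy file


def drop_bad_row(rows):
    """Delete the row at BAD_INDEX only if it still holds the bad value."""
    if len(rows) > BAD_INDEX and rows[BAD_INDEX][1] == "3.0M":
        del rows[BAD_INDEX]


drop_bad_row(data)
drop_bad_row(data)    # running the cell again is harmless thanks to the check

on_disk = load(path)  # the file itself still contains every row
```

Without the check, a second run of a bare del would silently remove whichever row had moved into that position.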
Hi @Prem, I am getting the same error as above. I have deleted the android data set row ‘10473’ which contains the value ‘3M’ under the ‘Reviews’ column. And after that, added the code to create a new dictionary as given in the solution. Could you please let me know why I am running into this problem? I have attached the notebook below. Thanks, much appreciated! Data_Analysis_Project.ipynb (11.1 KB)
Did you check the notebook after restarting and running all the cells?