Data cleaning GoogleStore

Hello guys!
I am cleaning the data for the GoogleStore dataset, and while writing a piece of code I got an unexpected result:

My Code:

apps_and_reviewsmax = {}
android_clean = []
already_added = []

for app in apps_android:
    name = app[0]
    n_reviews = float(app[2])
    # Track the highest value seen so far for each app name
    if name in apps_and_reviewsmax and apps_and_reviewsmax[name] < n_reviews:
        apps_and_reviewsmax[name] = n_reviews
    if name not in apps_and_reviewsmax:
        apps_and_reviewsmax[name] = n_reviews
    # Keep one row per app name
    if n_reviews == apps_and_reviewsmax[name] and name not in already_added:
        android_clean.append(app)
        already_added.append(name)

I expected apps_and_reviewsmax and android_clean to have the same length: 9659.
Instead, apps_and_reviewsmax has 9659 entries
and android_clean has 8196.

I replaced the comparison operator '==' with the identity operator 'is' and it worked.
I would like to know why, since as far as I know == means equal?

So what happened in the loop with "=="? Why was it deleting more rows?

thank you!!

Hello @greta.meroni,

You are getting different values because the n_reviews variable is read from the wrong column.

The line n_reviews = float(app[2]) should be n_reviews = float(app[3])

since the 4th column (index 3) is where the Reviews values are.
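The fix can be sketched on a few made-up rows. This is only an illustration under assumptions: the sample rows and their column layout (App, Category, Rating, Reviews) are hypothetical, and the work is split into two passes (first find the maximum per app, then pick the rows), which is a slightly different structure from the single loop in the question.

```python
# Hypothetical sample rows: [App, Category, Rating, Reviews]
apps_android = [
    ["Instagram", "SOCIAL", "4.5", "66577313"],
    ["Instagram", "SOCIAL", "4.5", "66577446"],
    ["Notes", "TOOLS", "NaN", "12"],
    ["Notes", "TOOLS", "NaN", "12"],
]

# Pass 1: highest review count per app name
apps_and_reviewsmax = {}
for app in apps_android:
    name = app[0]
    n_reviews = float(app[3])  # Reviews column, index 3
    if name not in apps_and_reviewsmax or apps_and_reviewsmax[name] < n_reviews:
        apps_and_reviewsmax[name] = n_reviews

# Pass 2: keep exactly one row per app, the one with the highest review count
android_clean = []
already_added = set()
for app in apps_android:
    name = app[0]
    n_reviews = float(app[3])
    if n_reviews == apps_and_reviewsmax[name] and name not in already_added:
        android_clean.append(app)
        already_added.add(name)

print(len(apps_and_reviewsmax), len(android_clean))
```

With the Reviews column, both collections end up with one entry per unique app name, so their lengths match.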

Lastly, you're right to use the comparison operator ==: it checks for value equality, as you mentioned.

Let me know if this helps.


There you go, thanks! Stupid mistake.
Was it just a coincidence, then, that using 'is' instead of == gave me the expected number?

How interesting!

thanks a lot!

Yes, it is a coincidence: 'is' happened to satisfy the condition in this particular code, but it is not the right operator for comparing values.

Note that the == operator compares the values of its two operands, whereas the 'is' operator checks whether both operands refer to the same object.
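A small sketch of the difference, including one case that may explain the numbers in this thread. The assumption here, not confirmed anywhere in the thread, is that the column at index 2 contains "NaN" entries: a NaN float is never equal to anything, itself included, so == fails for those rows, while 'is' succeeds because the dictionary holds the very same object.

```python
# Two floats with the same value, created separately
a = float("1234")
b = float("1234")
print(a == b)  # True: the values are equal
print(a is b)  # typically False in CPython: two distinct objects

# Assumed case: the parsed column contains "NaN"
nan = float("NaN")
stored = {"app": nan}          # the dict keeps a reference to the same object
print(nan == stored["app"])    # False: NaN is not equal to itself
print(nan is stored["app"])    # True: both names refer to one object
```

If that assumption holds, every app whose value at index 2 is NaN would be skipped by the == check, which would make android_clean shorter than the dictionary.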

Also, kindly mark the reply as a solution if it helped answer your question.

Happy learning @greta.meroni!