I'm Getting Incorrect Counts for the Number of Apps

My Code:

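final_apple = []
final_google = []

# Keep only the free Apple apps (price is at index 4)
for app in apple_english:
    price = app[4]
    if price == '0.0':
        final_apple.append(app)

# Keep only the free Google apps (price is at index 7)
for app in google_english:
    price = app[7]
    if price == '0':
        final_google.append(app)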

What I expected to happen:
My numbers did not match the solution key. I was expecting 3222 apps in the final Apple list and 8864 apps in the final Android list.

What actually happened:


I checked my numbers to ensure I wasn’t performing analysis on inaccurate numbers. I’m not sure why there was such a drastic difference in my numbers and the answer key numbers. Any advice helps. Thank you!

There are probably two things to check that could explain why your results differ:

  1. First, check how many cleaned apps (that is, after removing duplicate entries, inaccurate data and non-English text) are in your variable apple_english, and whether that count matches the solution workbook.
  2. If the counts do not match, it is possible that the Apple and Google datasets in our Jupyter notebooks differ slightly from the ones used in the solution workbook.

I'll leave it to the experts to confirm this. I could be wrong here 🙂
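To make the first check concrete, here is a minimal sketch that prints the row count after each cleaning stage so you can see where your numbers first diverge from the solution workbook. The three lists and the stage_counts helper are hypothetical stand-ins, not names from the guided project; in your notebook you would pass in your own intermediate lists.

```python
# Hypothetical stand-ins for the lists produced at each cleaning stage.
# In the notebook, replace them with your real intermediate variables.
raw = [["App A", "0"], ["App A", "0"], ["App B", "1.99"], ["アプリ", "0"]]
deduplicated = [["App A", "0"], ["App B", "1.99"], ["アプリ", "0"]]
english = [["App A", "0"], ["App B", "1.99"]]


def stage_counts(stages):
    """Return (name, row count, rows dropped since the previous stage)."""
    report = []
    previous = None
    for name, rows in stages:
        dropped = 0 if previous is None else previous - len(rows)
        report.append((name, len(rows), dropped))
        previous = len(rows)
    return report


counts = stage_counts([
    ("raw", raw),
    ("deduplicated", deduplicated),
    ("english", english),
])

for name, size, dropped in counts:
    print(f"{name}: {size} rows ({dropped} dropped)")
```

The first stage whose count differs from the solution workbook is the one worth inspecting.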

Thank you for the advice. I did look at the previous numbers after reading this. Basically, all of my numbers differ starting from the point where you count the English-language apps (my list of the Google data after removing the duplicate entries is definitely correct).


Hi @dheavennn,

If you are still getting this output, please share your Jupyter notebook file (.ipynb) with us.

Hi, I’m getting a mismatch in the number of free apps for Android. The solution notebook returns 8,864 free apps, but in my notebook I’m getting 9,614, which is the same as the number of English apps. I checked the logic of my for loop that filters the free apps and it matches the solution notebook, but the numbers still don’t match.

Could it be that the dataset has been updated? Thanks.
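When the free count equals the English count, every row is passing the filter, so it can help to look at what is actually in the price column. The sketch below tallies the distinct price values and then filters on one of them. The sample rows, the price_counts and free_apps helpers, and the '0' free value are assumptions for illustration; index 7 is the Google price column used earlier in this thread.

```python
from collections import Counter

# Hypothetical sample rows; only the price column (index 7) matters here.
sample = [
    ["App A", "", "", "", "", "", "Free", "0"],
    ["App B", "", "", "", "", "", "Paid", "$4.99"],
    ["App C", "", "", "", "", "", "Free", "0"],
]


def price_counts(rows, price_index):
    """Tally how often each distinct price string appears."""
    return Counter(row[price_index] for row in rows)


def free_apps(rows, price_index, free_value):
    """Keep only the rows whose price string equals free_value."""
    return [row for row in rows if row[price_index] == free_value]


counts = price_counts(sample, 7)
free = free_apps(sample, 7, "0")

print(counts.most_common())
print(len(free), "free apps out of", len(sample))
```

If the tally shows paid prices but the filtered list is still as long as the input, the problem is more likely in the loop itself (for example, an append that sits outside the if) than in the dataset.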

For the Android apps I’m not certain why your number is off, but for the Apple data I can tell you exactly what is wrong. I got the same number of Apple apps (4056), and it was because I didn’t double-check which index holds the app name. For the Android set you would write:

for app in android_apps:
    name = app[0]

However, if you look at the column headers of the Apple data, the app ID is in the first column (index 0) and the app name is in the second (index 1), so the code should be:

for app in ios_apps:
    name = app[1]

That fixes the error, and hopefully it helps anyone else trying to work out what happened. This is what careless copying and pasting of your own code can do, so be careful. I learned the hard way!
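One way to avoid this kind of mistake is to look the index up from the header row instead of hard-coding it. This is only a sketch: the ios_header list, the sample rows and the 'track_name' column name are assumptions for illustration, so check them against your own header row.

```python
# Hypothetical header and rows shaped like the Apple data described above:
# the app ID comes first and the app name second.
ios_header = ["id", "track_name", "size_bytes", "currency", "price"]
ios_apps = [
    ["1001", "Example One", "1000", "USD", "0.0"],
    ["1002", "Example Two", "2000", "USD", "1.99"],
]

# Find the column position by name rather than assuming it is 0.
name_index = ios_header.index("track_name")

names = [app[name_index] for app in ios_apps]
print(name_index, names)
```

If the column name is misspelled, list.index raises a ValueError, which is easier to notice than a silently wrong count.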

