[error/typo?] Guided Project: App Store & Google Play markets

It doesn’t seem like the ticketing system is working for me to submit a typo through the website (I keep getting "Unable to process your request").

Issue location

  • Lecture: Introduction to Python
  • Lesson: Guided Project: Profitable App Profiles for the App Store and Google Play Markets
  • Slide: 6

In the first paragraph, it mentions not needing to remove duplicates in the iOS dataset, but in my searching I found two duplicates.

The paragraph:
In the previous step, we managed to remove the duplicate app entries in the Google Play data set. We don’t need to do the same for the App Store data because there are no duplicates — you can check that for yourself using the id column (not the track_name column).

data header:
['id', 'track_name', 'size_bytes', 'currency', 'price', 'rating_count_tot', 'rating_count_ver', 'user_rating', 'user_rating_ver', 'ver', 'cont_rating', 'prime_genre', 'sup_devices.num', 'ipadSc_urls.num', 'lang.num', 'vpp_lic']

The data I’m seeing, where I’m looking at column index 1 for “track_name”:

  1. INDEX: 2948 :: ['1173990889', 'Mannequin Challenge', '109705216', 'USD', '0.0', '668', '87', '3.0', '3.0', '1.4', '9+', 'Games', '37', '4', '1', '1']
  2. INDEX: 4463 :: ['1178454060', 'Mannequin Challenge', '59572224', 'USD', '0.0', '105', '58', '4.0', '4.5', '1.0.1', '4+', 'Games', '38', '5', '1', '1']

  1. INDEX: 4442 :: ['952877179', 'VR Roller Coaster', '169523200', 'USD', '0.0', '107', '102', '3.5', '3.5', '2.0.0', '4+', 'Games', '37', '5', '1', '1']
  2. INDEX: 4831 :: ['1089824278', 'VR Roller Coaster', '240964608', 'USD', '0.0', '67', '44', '3.5', '4.0', '0.81', '4+', 'Games', '38', '0', '1', '1']

The code I’m using to find the duplicates:

def find_dups(data_set, header, data_name, col_index):
    print("Processing data from: " + data_name)
    dups = {}   # value -> row indexes of every occurrence (2 or more)
    found = {}  # value -> row index of its first occurrence
    for row_index in range(len(data_set)):
        col_value = data_set[row_index][col_index]

        if col_value in found:
            # Keep track of dups, including the first occurrence
            # Will determine how to weed out dups later
            if col_value in dups:
                dups[col_value].append(row_index)
            else:
                dups[col_value] = [found[col_value], row_index]
        else:
            # Keep track of first occurrences only
            found[col_value] = row_index
    if dups:
        return [col_index, dups]

ios_dups = find_dups(ios_data, ios_header, "ios", 1)
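
For comparison, here is a minimal sketch of the check the lesson paragraph describes: counting repeated values in the id column (index 0) versus the track_name column (index 1). The names `sample_rows` and `repeated_values` are hypothetical, and the sample contains only the four rows quoted above, trimmed to their first two columns, so it says nothing about the rest of the dataset.

```python
from collections import Counter

# Sample rows copied from the post above, trimmed to the first two
# columns (id, track_name); the full rows have 16 columns.
sample_rows = [
    ['1173990889', 'Mannequin Challenge'],
    ['1178454060', 'Mannequin Challenge'],
    ['952877179', 'VR Roller Coaster'],
    ['1089824278', 'VR Roller Coaster'],
]

def repeated_values(data_set, col_index):
    """Return {value: count} for values that occur more than once in a column."""
    counts = Counter(row[col_index] for row in data_set)
    return {value: n for value, n in counts.items() if n > 1}

print(repeated_values(sample_rows, 0))  # id column: no repeats in this sample
print(repeated_values(sample_rows, 1))  # track_name column: both names repeat
```

On these four rows the names repeat but the ids do not, which is the difference between checking column index 1 and column index 0.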

Hi @Thomas.Dawsey

Welcome to Dataquest Community!

This question is the same as, or similar to, the topics listed below. I don’t remember the details of this project, but I guess these entries differ in app ID and version details. Perhaps that’s why they are considered different apps.

Hope that helps.

I think this discussion on the source of the dataset might be relevant here - https://www.kaggle.com/ramamet4/app-store-apple-data-set-10k-apps/discussion/90409

Those apps might not really be duplicates.

But I will also let someone from Dataquest provide an official answer.


I didn’t think of looking into Kaggle despite it being pointed to numerous times. I also wasn’t comfortable enough yet to navigate the Community confidently.

Thank you both for the replies! @Rucha and @the_doctor
This does answer 99% of my questions. I will push forward.

Thank you @the_doctor for finding that discussion. The apps are not duplicates because their publishers are different.

While it may have been possible in the past to publish an app under a name that was already taken, it is no longer possible.

No, the name of your app as it appears in the App Store must be unique. When creating your app, you will receive the following error if you try to use a name that is already taken:

The Application Name that you provided has already been used. Please provide a unique Application Name.

But the name on the icon can be whatever you want - it can be the same as another app’s icon name.
Can my iOS app have the same name?