Profitable App Profiles - header row

In the exercise here:

https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/5/removing-duplicate-entries-part-two

it was stated:

Loop through the Google Play data set (make sure you don’t include the header row).

However, I do not see where this is achieved in the loop below; could somebody advise, please? ‘name = app[0]’ looks to me like it is reading from the header row, so I am a bit confused.

duplicate_apps = []
unique_apps = []

for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
    
print('Number of duplicate apps:', len(duplicate_apps))
print('\n')
print('Examples of duplicate apps:', duplicate_apps[:15])

I have a similar question about this function:

def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]    
    for row in dataset_slice:
        print(row)
        print('\n') # adds a new (empty) line after each row

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))

It was stated:

dataset shouldn’t have a header row, otherwise the function will print the wrong number of rows (one more row compared to the actual length).

‘dataset’ is an input parameter to the function, but I do not see how the header is being removed from it here. In other exercises I had seen it done by slicing with [1:] (that is, [start:end] with a start other than index 0), but I do not see how it happens by default here. Does it mean that you should always pass 1 as the ‘start’ argument to explore_data(dataset, start, end, rows_and_columns=False)?

Thanks
JB

You removed the header row in the beginning of the project, with this:

from csv import reader

### The Google Play data set ###
opened_file = open('googleplaystore.csv')
read_file = reader(opened_file)
android = list(read_file)
android_header = android[0]
android = android[1:]

### The App Store data set ###
opened_file = open('AppleStore.csv')
read_file = reader(opened_file)
ios = list(read_file)
ios_header = ios[0]
ios = ios[1:]

Specifically, these two lines remove the header of each dataset:

android = android[1:]
ios = ios[1:]

Everything you do from then on with the ios and android variables already excludes the headers.
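To make that concrete, here is a minimal sketch using a tiny made-up data set in place of the real CSV (the three app rows and the column names are invented for illustration; explore_data is copied from your post, minus the blank-line print). It should show that after the slice, app[0] is always an app name rather than a header cell, and that start can simply be 0:

```python
# Toy stand-in for list(reader(opened_file)); these rows are invented,
# not taken from googleplaystore.csv.
android = [
    ['App', 'Category', 'Rating'],            # header row
    ['Instagram', 'SOCIAL', '4.5'],
    ['Instagram', 'SOCIAL', '4.5'],
    ['Sketch Pad', 'ART_AND_DESIGN', '4.1'],
]

android_header = android[0]   # keep the header in its own variable
android = android[1:]         # rebind android to the data rows only


def explore_data(dataset, start, end, rows_and_columns=False):
    dataset_slice = dataset[start:end]
    for row in dataset_slice:
        print(row)

    if rows_and_columns:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))


# The header is already gone, so index 0 is the first real app.
explore_data(android, 0, 2, True)

# Same duplicate check as in the exercise; the loop never sees the header.
duplicate_apps = []
unique_apps = []
for app in android:
    name = app[0]
    if name in unique_apps:
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)

print('Number of duplicate apps:', len(duplicate_apps))
```

So there is no need to pass 1 as start. The warning in the exercise only means that the list you hand to explore_data should already have had its header sliced off, which is the case for android and ios here.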

I see! Many thanks Otavios.


You’re welcome. If it solved your problem, please mark it as the solution.

Hi, okay. I have marked it now.