Code comprehension (for loops)

Hi, I am struggling a bit to understand how this works.

duplicate_apps = [] # here we create an empty list to store duplicate apps
unique_apps = [] # here we create an empty list to store all the apps - without duplicates

for app in android: # so this for loop creates a variable called "app" and loops over the entire android dataset
    name = app[0] # name holds the first element (index 0) of the current app row, i.e. the app's name
    if name in unique_apps: # so when in the code has name gone into unique_apps? does this if statement put the data set into
        # the unique_apps?
        duplicate_apps.append(name)
    else:
        unique_apps.append(name)
    
print('Number of duplicate apps:', len(duplicate_apps))
print('\n')
print('Examples of duplicate apps:', duplicate_apps[:15])

I actually understand the whole code and what it does; it's more this bit:

if name in unique_apps

When I read this, it took about 5 minutes to understand how it's all linked; I had the same issue when I was learning about for loops. It's saying if the name is in unique_apps, but when did we even put it in? Does the if statement put all the data in the unique_apps list?

I know I'm new and I'm sure it's just me, but in my mind, code something like this makes sense:

put name in unique_apps
if the name is already in unique_apps
append name to duplicate_apps

I know it's basic stuff, but sometimes it's like reading a riddle. It's like this in my brain:

If i is in p, then p is b, put p into i, then i into d, then append p into i, and loop over d.

I think I'm having an off day here.

Hi @Frankie,

Another way to think of the logic for that section is in terms of the question “Have we seen this app before?”.

You wrote “but when did we even put it in?” and, interestingly, that is exactly how it works. unique_apps is like a newborn baby who starts out not knowing anything, gets bored easily, and is only interested in new things. In that sense, if name in unique_apps is effectively filtering out any app that it has seen before:

unique_apps = []
duplicate_apps = []

# Say this is the first time Facebook appears
#
# Have we seen Facebook before?
if "Facebook" in unique_apps:
    duplicate_apps.append("Facebook")
else:
    # No, we don't have Facebook in unique_apps -> store in unique_apps
    unique_apps.append("Facebook")

# Say another Facebook app appears
#
# Have we seen Facebook before?
if "Facebook" in unique_apps:
    # Yes, it's already in unique_apps -> store in duplicate_apps
    duplicate_apps.append("Facebook")
else:
    unique_apps.append("Facebook")

So the answer to “does the if statement put all the data in the unique_apps list?” is no, not all data will end up in unique_apps. The if statement itself never stores anything; it only checks. The first time an app is seen, the check fails and the else branch appends it to unique_apps. Any time that app is seen a second, third, or fourth time, if name in unique_apps will say “Yeah, that app is already in unique_apps, so please store it in duplicate_apps instead”.
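To watch the two lists fill up one step at a time, here is a small sketch that prints both lists after every pass through the loop. The names list is made up for illustration; it is not the real android dataset.

```python
# Hypothetical sample data, not the real android dataset
names = ["Facebook", "Instagram", "Facebook", "Facebook"]

unique_apps = []
duplicate_apps = []

for name in names:
    if name in unique_apps:
        # Seen before -> goes to duplicate_apps
        duplicate_apps.append(name)
    else:
        # First time -> goes to unique_apps
        unique_apps.append(name)
    # Show the state of both lists after handling this name
    print(name, "->", unique_apps, duplicate_apps)
```

On the first pass unique_apps is still empty, so the check fails and "Facebook" is stored there. Only on later passes can the check succeed.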

The pseudocode you proposed (put the name in unique_apps first, then check) will probably not restrict unique_apps to only unique apps, because every name gets appended no matter what. If you have two “Facebook” apps for example, you’ll have ["Facebook", "Facebook"] in unique_apps and ["Facebook"] in duplicate_apps.
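One way to read that pseudocode as real code is sketched below. This is only my interpretation of it (always append, and separately remember whether the name was already there), and the two-item names list is made up for the example.

```python
# Hypothetical sample data with one repeated name
names = ["Facebook", "Facebook"]

unique_apps = []
duplicate_apps = []

for name in names:
    # Remember whether the name was there before we add it
    already_seen = name in unique_apps
    # The name is appended unconditionally, so repeats get in too
    unique_apps.append(name)
    if already_seen:
        duplicate_apps.append(name)

print(unique_apps)     # ['Facebook', 'Facebook']
print(duplicate_apps)  # ['Facebook']
```

Putting the append inside the else branch is what keeps repeats out of unique_apps.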

unique_apps is in a way discriminatory and works on an “only first-timers are allowed” basis. duplicate_apps is instead a refuge for those rejected by unique_apps, i.e. second-, third-, and fourth-timers, etc.

Hi Wan.

That explanation is amazing. And yeah that clicked. It’s actually pretty simple really. I’m going to spend some time playing around with this when I get home.

I’m heading towards the pandas section, so I just want to make sure I have a solid understanding of how things work, so this is valuable!

Thanks Wan.

No worries @Frankie.

Have fun in the pandas section. Cheers.

Hey Wan - OK, I got home, read over what you wrote again, had a play around, and it's sunk in :)

It's quite cool. I see you could do it with a second dataset. I guess that code could be turned into a function as well and be used across all projects. Quite exciting!

dataset = ["Facebook", "Instagram", "TikTok", "Reddit", "Discord", "Discord", "Instagram"]
print(type(dataset))

unique_apps = []
duplicate_apps = []

# Each item in this dataset is already a name string, so there is no row to index into
for theapp in dataset:
    if theapp in unique_apps:
        duplicate_apps.append(theapp)
    else:
        unique_apps.append(theapp)

print (unique_apps)

# Output: ['Facebook', 'Instagram', 'TikTok', 'Reddit', 'Discord']

print (duplicate_apps)

# Output: ['Discord', 'Instagram']

dataset2 = ["duck", "bird", "cat", "dog", "horse", "bird", "duck"]

# unique_apps and duplicate_apps are not reset here, so the results from dataset carry over
for theapp in dataset2:
    if theapp in unique_apps:
        duplicate_apps.append(theapp)
    else:
        unique_apps.append(theapp)

print (unique_apps)

# Output: ['Facebook', 'Instagram', 'TikTok', 'Reddit', 'Discord', 'duck', 'bird', 'cat', 'dog', 'horse']

print (duplicate_apps)

# Output: ['Discord', 'Instagram', 'bird', 'duck']




Thanks again Wan!

Yeah, that’s actually a common thing programmers do. Other than creating functions, you can also take sections of code or logic that are frequently used in many projects, put them into a separate script file, and then just bring that script into any future project.

Similar to the first guided project, you can create your own helper.py (or any other name) script that you can import in your other projects, i.e. from script_name import *.
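As a rough sketch of what such a helper could look like, here is your loop wrapped in a function. The function name find_duplicates is just something I made up for this example; call it whatever you like.

```python
# Could live in helper.py (hypothetical file and function names)
def find_duplicates(names):
    """Split names into first-time entries and repeats, keeping their order."""
    unique = []
    duplicates = []
    for name in names:
        if name in unique:
            # Seen before -> record as a duplicate
            duplicates.append(name)
        else:
            # First appearance -> record as unique
            unique.append(name)
    return unique, duplicates


# From another project, assuming the file above is saved as helper.py:
# from helper import find_duplicates
dataset = ["Facebook", "Instagram", "TikTok", "Reddit", "Discord", "Discord", "Instagram"]
unique_apps, duplicate_apps = find_duplicates(dataset)
print(unique_apps)     # ['Facebook', 'Instagram', 'TikTok', 'Reddit', 'Discord']
print(duplicate_apps)  # ['Discord', 'Instagram']
```

Because the function builds fresh lists on every call, results from one dataset do not spill over into the next one.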
