Profitable App Profiles

Hi everyone,

Finally, my first Dataquest project (and the second one I've posted here) is ready to be submitted :relaxed:.

I found this project very interesting but also quite challenging, and my insights are open for discussion (they differ from those in the solution notebook, and I still have doubts about both :thinking:).

Please take a look at my code and findings and let me know what you think. Any comments will be valuable: code improvements, styling, storytelling, or any errors you spot.

Thanks a lot in advance!

Link:
https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/14/next-steps

Jupyter:
Profitable_App_Profiles.ipynb (82.5 KB)



3 Likes

Hey, Elena!

Reads like a book, nice work :+1:

Some minor details I would add:

  • code cell #2. The same 5 lines of code are repeated twice, with the data set name as the only difference. You might consider putting the code into a function such as read_data() to avoid the repetition.
  • code cell #3. Try to avoid magic numbers in function calls like explore_data(android, 0, 3, True). For readability, it's better to use named arguments and indicate explicitly what 0, 3, and True stand for.
  • code cell #25. Same goes here: what is 7? Why 7 in android and 4 in ios? What if someone adds columns to this data? All these magic numbers will shift and no longer work. You might consider creating variables/constants with clear names and putting them somewhere at the start of your project. If something changes, you only need to change it in one place.
    for app in android_cleaned_filtered:    
        if app[7] == '0':
            android_final.append(app)
            
    for app in ios_filtered:
        if app[4] == '0.0':
            ios_final.append(app)
  • code cell #22. You can drop the == True part of if check == True: and simply write if check:
  • code cell #27. Very nice :wink:
  • code cell #34. You can use method chaining, number_installs = number_installs.replace('+', '').replace(',', ''), to avoid repetition, or use regular expressions, which will be covered later.
  • code cells #36, #38, #40, #42, #44, #46, #48, etc. look very similar and repetitive. Is there a way to create a function for this?
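
To make a few of the points above concrete, here is a minimal sketch rather than your actual notebook code. I'm assuming the repeated lines in cell #2 open a CSV file and split the header from the rows. The names ANDROID_PRICE_COL, IOS_PRICE_COL, filter_free and clean_installs are made up for illustration, while the indices 7 and 4 and the values '0' and '0.0' come from cell #25.

```python
from csv import reader

# Named constants instead of magic numbers (names are hypothetical;
# the indices are the ones used in cell #25).
ANDROID_PRICE_COL = 7
IOS_PRICE_COL = 4


def read_data(path, encoding="utf8"):
    """Read a CSV file and return (header, rows).

    Assumption: this is roughly what the repeated lines in cell #2 do.
    """
    with open(path, encoding=encoding, newline="") as f:
        rows = list(reader(f))
    return rows[0], rows[1:]


def filter_free(apps, price_col, free_value):
    """Keep only the rows whose price column equals free_value."""
    return [app for app in apps if app[price_col] == free_value]


def clean_installs(number_installs):
    """Strip '+' and ',' from an install count using method chaining."""
    return number_installs.replace("+", "").replace(",", "")
```

With helpers like these, the two loops from cell #25 could become calls such as filter_free(android_cleaned_filtered, price_col=ANDROID_PRICE_COL, free_value='0') and filter_free(ios_filtered, price_col=IOS_PRICE_COL, free_value='0.0'), where the keyword arguments spell out what each value means.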

Great project!

3 Likes

Hi @lostmachine,

Thank you very much for your helpful comments and for the attention to my work. You helped me a lot! I agree with everything and will make all these improvements. My project is already quite long, so there is no need to lengthen it with repetitive pieces of code :smile:

2 Likes