Hi everyone,
Finally, my first project done on Dataquest (and the second one I've posted here) is ready to be submitted.
I found this project very interesting but also quite challenging, and my insights are open for discussion (they differ from those in the solution notebook, and I still have doubts about both).
Please take a look at my code and findings and let me know what you think about them. Any comments will be very valuable: code improvements, styling, storytelling, any errors I may have made.
Thanks a lot in advance!
Link:
https://app.dataquest.io/m/350/guided-project%3A-profitable-app-profiles-for-the-app-store-and-google-play-markets/14/next-steps
Jupyter:
Profitable_App_Profiles.ipynb (82.5 KB)
3 Likes
Hey, Elena!
Reads like a book, nice work!
Some minor details I would add:
- code cell #2. The same 5 lines of code are repeated twice; the only difference is the data set name. You might consider putting the code into a function such as read_data() to avoid the repetition.
- code cell #3. Try to avoid magic numbers in function calls like explore_data(android, 0, 3, True). For readability, it’s better to use named arguments and explicitly indicate what 0, 3, and True each mean.
- code cell #25. The same goes here: what is 7? Why 7 in android and 4 in ios? What if someone adds columns to this data? All these magic numbers will shift and no longer work. You might consider creating variables/constants with clear names and putting them somewhere at the start of your project. If something changes, you need to change it in only one place.
for app in android_cleaned_filtered:
    if app[7] == '0':
        android_final.append(app)
for app in ios_filtered:
    if app[4] == '0.0':
        ios_final.append(app)
- code cell #22. You can skip the part
if check == True:
and simply write
if check:
- code cell #27. Very nice.
- code cell #34. You can use method chaining
number_installs = number_installs.replace('+', '').replace(',', '')
to avoid repetition, or use regular expressions, which will be covered later.
- code cells #36, #38, #40, #42, #44, #46, #48, etc. look very similar and repetitive. Is there a way to create a function for this?
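Here is a minimal sketch of how a few of the suggestions above might look together. It is only an illustration: the names read_data, keep_free_apps, clean_installs, ANDROID_PRICE_COL, and IOS_PRICE_COL are hypothetical, and the explore_data parameter names (start, end, rows_and_columns) are assumed, since the notebook's own definition isn't shown in this thread.

```python
from csv import reader

# Hypothetical constants: name the column positions once, at the top of the
# project, so a change in the data layout is fixed in a single place.
ANDROID_PRICE_COL = 7
IOS_PRICE_COL = 4


def read_data(path, encoding='utf8'):
    """Open a CSV file and return (header, rows), replacing the repeated cell."""
    with open(path, encoding=encoding) as opened_file:
        rows = list(reader(opened_file))
    return rows[0], rows[1:]


def explore_data(dataset, start, end, rows_and_columns=False):
    """Print a slice of rows; parameter names here are assumed."""
    for row in dataset[start:end]:
        print(row)
    # "if flag:" is enough; comparing with "== True" is redundant.
    if rows_and_columns and dataset:
        print('Number of rows:', len(dataset))
        print('Number of columns:', len(dataset[0]))


def keep_free_apps(dataset, price_col, free_value):
    """Return only the rows whose price column equals the 'free' marker."""
    return [app for app in dataset if app[price_col] == free_value]


def clean_installs(number_installs):
    """Strip '+' and ',' with chained replace() calls, then convert to int."""
    return int(number_installs.replace('+', '').replace(',', ''))


# Tiny made-up rows, just to show the calls (not real data from the project).
android_sample = [
    ['App A'] + [''] * 6 + ['0'],
    ['App B'] + [''] * 6 + ['$4.99'],
]
ios_sample = [
    ['1', 'App C', '100', 'USD', '0.0'],
    ['2', 'App D', '200', 'USD', '2.99'],
]

# Named arguments make it clear what each value means.
explore_data(android_sample, start=0, end=2, rows_and_columns=True)

android_final = keep_free_apps(android_sample, ANDROID_PRICE_COL, '0')
ios_final = keep_free_apps(ios_sample, IOS_PRICE_COL, '0.0')
print(len(android_final), len(ios_final))
print(clean_installs('1,000,000+'))
```

With something like this, each file would be loaded with a single read_data(...) call, and the repeated analysis cells could be wrapped in one function in the same way and called with different arguments.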
Great project!
3 Likes
Hi @lostmachine,
Thank you very much for your valuable comments and attention to my work; you helped me a lot! I agree with everything and will introduce all these improvements. My project is already quite long, so there's no need to prolong it with repetitive pieces of code.
2 Likes