Hello everyone,
As the title suggests, this is an analysis project on my DataQuest journey. I’m really excited to have finished this project just in time! What better way to send off 2020 than a thorough look back at my focus of the year?
It has been a year of grief for the world, but here in the DataQuest community, I see people from all over the world trying their best to learn and make progress every day. My motivation for this project is not only to revisit my own learning journey but, more importantly, to encourage beginners by giving them a more thorough picture of what’s ahead.
This project is inspired by the people in this community, especially @otavios.s’s amazing project, “I hope this is not a problem, but I scraped the Community”. I was introduced to Selenium and ChromeDriver thanks to his project. Yes, I also scraped the DQ website to get the full Data Scientist curriculum, and I also hope it’s okay… It was a lot of fun with automation powered by ChromeDriver. I also tried out parsing email content for the first time to collect data.
Enough talk, here’s a peek at my project:
My DataQuest Learning Curve
The main motivation for this project is to help beginners and potential learners get a better idea of how much time and work the Data Scientist path in Python on DataQuest involves. Keep in mind, though, that the time and effort it takes to finish an online course depends heavily on personal circumstances.
Here are the questions that get answered in this project:
 How many days did it take for me to finish this path? (timespan, including intervals I didn’t spend on studying)
 175 days. From June 19th, 2020 to December 11th, 2020.
 What’s my best learning streak and average learning streak?
 My best learning streak was 20 days, and my average streak was 6.6875 days. From my personal experience, it’s important to get into the groove and keep going. I took a weeklong break in October, and it took another week to get back to the same learning efficiency as before.
 How much time was spent in total?
 I spent 306.4 hours in total finishing the path. This means that if I had studied 24/7, the path could have been finished in roughly 13 days. Instead, it took me 175 days. I’m sure the robots are laughing at us humans.
 How many hours did I spend on average in weeks I studied?
 Assuming I studied 5 days a week on average, the 24 weeks I did study add up to 120 study days. That works out to about 2.6 hours a day (306.4 hours / 120 days), so close to 3 hours a day on DataQuest. That sounds about right, but note that it’s a rough estimate. Plus, I spent quite some time in the community, which isn’t counted in this project.
 What’s the average time spent to finish a mission?
 111.43 minutes, in other words close to 2 hours. That looks like a dauntingly long time to finish a mission, but it also includes time spent on guided projects, which are definitely more time-consuming than regular learning missions. It’s not uncommon to spend days on a guided project.
 What are the speed bumps in the curriculum?
 Steps 2, 4, 5, and 6 took more weeks than the others to finish. Among them, Steps 2 and 6 have the most missions, and Step 2 also has the most guided projects, so their size explains the extra weeks. That leaves Steps 4 and 5 as the most time-consuming steps relative to their size, with Step 4 taking longer than Step 5. This matches my memory pretty well: in Step 4 the time-consuming part was SQL, and in Step 5 it was the courses on probability.
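For anyone who wants to double-check the numbers above, here is a small sketch that recomputes them from the figures quoted in this post. The variable names are just for illustration, and the 5 study days per week is the same rough assumption as above, not something measured.

```python
from datetime import date

# Figures quoted in this post.
start, end = date(2020, 6, 19), date(2020, 12, 11)
total_hours = 306.4
missions = 165              # total missions, including the 22 guided projects
weeks_studied = 24
study_days_per_week = 5     # rough assumption, not measured

# Calendar days between the first and last day of the path.
timespan_days = (end - start).days

# How long the path would take if studied around the clock.
nonstop_days = total_hours / 24

# Average hours per study day under the 5-days-a-week assumption.
hours_per_study_day = total_hours / (weeks_studied * study_days_per_week)

# Average minutes per mission (guided projects included).
minutes_per_mission = total_hours * 60 / missions

print(timespan_days)                   # 175
print(round(nonstop_days, 1))          # 12.8
print(round(hours_per_study_day, 2))   # 2.55
print(round(minutes_per_mission, 1))   # 111.4
```

The per-mission figure differs from 111.43 only because the total hours above are rounded to one decimal.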
This project is my personal learning progress analysis from finishing the Data Scientist path in Python on DataQuest. The path consists of 165 missions in total, including 22 guided projects. [1]
A little context about my personal learning situation:
 I started the Data Scientist path in Python on June 19th, 2020, and finished it on December 11th, 2020. I didn’t spend a lot of time on the path in the last two weeks; what time I did spend went mostly to finishing the last two guided projects (which count as 2 missions) and extracurricular projects. That’s probably why I didn’t get any learning progress emails after the end of November.
 I used to be a digital marketing account manager and had almost no coding experience. I learned Python fundamentals from a data scientist course on Udemy for a couple of weeks right before I decided to switch to DataQuest.
 I finished Andrew Ng’s Machine Learning course on Coursera a few weeks before starting the path. I learned basic Octave during the course.
 I’m currently unemployed so I have a lot of spare time for learning.
The progress data in this project comes from the weekly accomplishment email I get from DataQuest on Mondays if I made progress the previous week. It consists of:

 date: date the email was received
 missions_completed: number of missions completed
 missions_increase_pct: percentage increase/decrease in missions completed compared to the previous week
 minutes_spent: minutes spent on learning
 minutes_increase_pct: percentage increase/decrease in minutes spent compared to the previous week
 learning_streak(days): number of consecutive days spent on learning
 best_streak: best learning streak
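To make the shape of this data concrete, here is a minimal sketch of how one weekly email could be represented and summed up in Python. This is not the code from my notebook: the class name, the `summarize` helper, and the two example rows are made up for illustration, and `learning_streak(days)` is renamed because parentheses are not allowed in a Python identifier.

```python
import datetime
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WeeklyProgress:
    """One weekly accomplishment email, using the fields listed above."""
    date: datetime.date                     # date the email was received
    missions_completed: int
    missions_increase_pct: Optional[float]  # None when there is no previous week
    minutes_spent: int
    minutes_increase_pct: Optional[float]   # None when there is no previous week
    learning_streak_days: int               # "learning_streak(days)" in the email
    best_streak: int


def summarize(rows: List[WeeklyProgress]) -> dict:
    """Aggregate weekly rows into totals (hypothetical helper)."""
    total_minutes = sum(r.minutes_spent for r in rows)
    total_missions = sum(r.missions_completed for r in rows)
    return {
        "total_hours": total_minutes / 60,
        "minutes_per_mission": total_minutes / total_missions,
        "best_streak": max(r.best_streak for r in rows),
    }


# Made-up placeholder rows, NOT my real numbers.
rows = [
    WeeklyProgress(datetime.date(2020, 6, 22), 5, None, 300, None, 3, 3),
    WeeklyProgress(datetime.date(2020, 6, 29), 10, 100.0, 900, 200.0, 7, 7),
]
summary = summarize(rows)
print(summary)
```

Keeping the two percentage fields optional is a design choice for the first week, where there is nothing to compare against.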
The curriculum data in this project comes from the DataQuest dashboard for the Data Scientist path. It consists of 8 Steps, 32 courses, and 165 missions in hierarchical order.
[1] Although the dashboard shows 149 missions and 31 projects, scraping the dashboard page actually turned up 165 missions, including 22 guided projects.
GitHub and nbviewer messed up some of the formatting and have trouble showing the Plotly plots, so here are the visualizations in this project:
 My learning curve
 Hours spent weekly and the corresponding number of missions completed and the steps they belong to
 Number of missions and guided projects in each learning Step
 Full curriculum table of the Data Scientist in Python path on DataQuest
Apart from answering all the questions at the beginning of this project, I also want to add, for the beginners of this course: what I’ve done in this project is mostly data collection, data cleaning, and imputation, which you will learn in the first 4 Steps. That means you will be equipped to do all of this halfway through the course!
@nityesh Again, I hope the scraping won’t be a problem, but I will leave that part out if it is. Speaking of which, the number of missions and projects shown in the dashboard is different from the scraping results. I wonder why? Also, if it’s not inappropriate to ask, I’m wondering how the learning progress is tracked, and how the change rates of missions completed and minutes spent in the weekly accomplishment email are calculated.
P.S. If anyone has more questions regarding this project or the DQ Data Scientist path, feel free to ask me in the comments or reach me at veratsien@gmail.com. I will try my best to provide an answer.
Click here to view the project. (Note that GitHub and nbviewer messed up some formatting and the plots are not showing. Please let me know if you know a workaround. )
Oh, and happy new year, everyone! Good riddance!
Update:
I managed to show the Plotly plots on GitHub with a simple fig.show('svg'), in case anyone finds it useful. You can also define the desired width and height of the output SVG. So the link above to the project should show the plots just fine now.
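In case it helps, here is a small sketch of that workaround wrapped in a helper. The function name and the default sizes are my own picks for illustration; `fig` is assumed to be a Plotly figure, and as far as I know the static SVG renderer also needs an image export package such as kaleido installed.

```python
def show_static_svg(fig, width=900, height=500):
    """Render a Plotly figure as a static SVG so GitHub can display it.

    `fig` is assumed to be a plotly.graph_objects.Figure (or anything
    exposing the same `show` method). `width` and `height` are in pixels
    and are passed through to the SVG renderer.
    """
    fig.show(renderer="svg", width=width, height=height)
```

With an existing figure, calling show_static_svg(fig) or show_static_svg(fig, width=640, height=360) in the notebook cell replaces the usual fig.show().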