Some welcome tips and tricks for those starting their data science journey
If you’re like me, you’ve probably spent a fair amount of time on Data Quest progressing through the material, and you’re on the cusp of finishing one of the various pathways (if you haven’t already). Congrats! Good job! Mazel Tov! You’re probably about to start applying for some data-related positions. Now is a good time to take the next step and build out your portfolio with a few projects. As someone with a few under my belt already, I figured I could share how to go about this, from selection to completion.
"Wait a minute. Couldn’t I just use what I have done so far on DQ for my portfolio?"
Sure, you could do that, maybe with a bit of sprucing up along the way. However, those projects probably aren’t going to get you very far. While there are a few reasons for this, it largely comes down to the fact that they’re guided projects. This means that YOU really didn’t do any of the heavy lifting in terms of planning, designing the project, or troubleshooting at each step. All of this has already been figured out for you, and it’s just a matter of following some laid-out instructions to a relatively laid-out end goal. What people want to see is all of these other qualities, not just your data analysis + coding skills.
“Like, you can’t say you’re a really good cook if the most you’ve ever done is follow the recipes from meal-kit subscriptions like Blue Apron or HelloFresh. All you did was follow the instructions and not burn your home to the ground.”
"OK, I get what you’re saying. So, how do I go about it then?"
Before figuring out what you want to do, it’s a good idea to set your expectations, which depend on where you are in this process. If this is your first ever unguided project, it PROBABLY won’t be super technical or earth-shattering. It’s going to be fairly simple, which is fine! It’s all about getting your feet wet with a few easy wins and then progressing from there. So, hold off on those really complex ideas for now.
I mean, is your first time doing anything going to be that amazing? No, right?
However, for those who are a few projects deep, it should be about filling out your portfolio so that it has at least one project showcasing each of the following: (1) visualization, (2) descriptive and/or inferential analysis, (3) predictive modeling, (4) machine learning application, and (5) a combination of these. If you do have a particular area you would like to focus on, then your portfolio will probably be skewed towards that specialization.
With that in mind, choosing your next unguided project also comes down to finding the right balance between project complexity and your current skill set. This is the time to be honest with yourself, as it can mean the difference between getting your project done or not. That’s not to say that you won’t be able to do it if you’re off the mark, but the key is to make this as painless a process as possible and ensure that the project actually gets completed.
Look, you will inevitably hit a snag somewhere down the road where the answer won’t be clear at all. In fact, while searching for a way to solve your problem, you’ll likely go off track and fall into some rabbit hole, learning something that is completely unrelated to your original goal. That is a waste of time and effort that we want to avoid if at all possible.
Now, you might think this won’t happen. But trust me, it will!
"Alright, I know what sort of project to do, and it’s feasible for me. What’s next?"
Great. The next step is the fun part: finding a topic that interests you. Now, this can be something related to your field of work or something that is of personal interest to you. More often than not, this will be a sticking point for many, since there is so much data available out there to work on that it can get overwhelming quickly. While it’s tempting to want to do many things at once, I would highly suggest avoiding this, as you’ll be more likely to end up with a pile of half-finished work because you’ll be pulled in too many different directions. So, nail down a single choice.
There are many ways to do this (e.g., use one of those paper fortune tellers, spin a virtual wheel, do a 10-hour gaming session fuelled by nothing but energy drinks and make a decision with a manic disposition that’s akin to being on cocaine), but I would suggest you choose a topic you can tell a story with. This might sound weird, but bear with me for a sec. As we’re using the projects in our portfolio to sell our skill sets, we want to provide one that is engaging or salient enough to connect with the audience. I’ve always found that an easy way to do so is through great storytelling. So, if you can find something that you’re passionate about and can spin an engaging story out of it, it will definitely be worthwhile to pursue.
Think about all those times you had to make small talk or have “get to know you” chats: people are usually receptive whenever an engaging story is being shared. Or if you have a majestic beard.
"Cool, I’ve got my idea for a project. How do I go from idea to execution?"
Alright, it’s time to get down to brass tacks. Knowing what your project will be, I would start by planning out what needs to be done, working out the approach needed to get each task completed, and budgeting your time for each step. Since most folks are visual learners (myself included), it always helps to have a guide to visualize these thoughts based on the scope of the project. Here’s a good visual that helped me out with some of my projects:
Using a tier system like this one gives you a good idea of how to tackle a given project based on its scope, specifically in terms of allocating resources and time. Obviously, your million-dollar question would be “What constitutes a tier-1, tier-2, or tier-3 project?” Well, that depends on a few factors, but it mainly has to do with (1) your vision for the project and (2) your current skill set. Here’s how I would define it:
TIER 1 – FOUNDATIONAL: These are tasks that are pretty universal across any sort of data-related work (e.g., data wrangling, web scraping, exploratory analysis, simple inferential analysis [like chi-square tests or t-tests], and documentation). Think of something similar to those guided projects in the early stages of a DQ pathway. Usually, reserve somewhere between a few hours and (if you like to procrastinate like me) 1 to 2 days to complete one. It’s a great choice for those early in their data science path.
TIER 2 – INTERMEDIATE: We’re looking at a notable upgrade in terms of your task, with things like more advanced visualization techniques, or more detailed reporting of data using inferential analyses that may involve some more classical machine learning techniques (like regression models). This would take a few days to a week of solid time to get done, and it would make up the bulk of your portfolio since it’s complex enough to show more advanced skills + is not too intensive to complete.
TIER 3 – ADVANCED: This is where those cool, awesome-looking project ideas that got you into data science live. We’re talking about chatbots, automating certain tasks, predictive model implementations. Basically, these will take longer than a week to get done and would essentially serve as the highlight of your portfolio, where you show off those advanced skills you’ve developed. This will probably be the most intensive thing you do in your data science journey.
Now, this example might not exactly match your current skill set: those who are further along may see it as very rudimentary, whilst those who have just started out may be overwhelmed by this being the standard for judging projects. In either case, that’s totally fine! All this is meant to do is help you visualize and manage your expectations. So, if you know that certain things will just not be feasible due to how intensive a task will be, it may be worthwhile to reframe the scope of the project to something more manageable and in line with the prospective timeline. Or, if you’re adamant about the project, just outsource some stuff by having others collaborate with you on it. Like, if every Fortune 500 company can do this, why shouldn’t you?
OK, now here’s the most dreaded part of all of this: setting deadlines. It’s always something that gave me a mini heart attack since it makes things “official”. Knowing this, it’s been my policy to keep a loose “deadline” with a ton of padded time built in. While it’s important to keep an end goal in mind, unexpected things always come up, and the need to pivot will inevitably arise since things never go right the first time around. Plus, it’s always refreshing to be in a situation where you can “under-sell and over-deliver”, even if that’s only to yourself.
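If you like seeing this sort of thing in code, here’s a minimal Python sketch of the padded-deadline idea. Everything in it is an assumption for illustration: the `TIER_ESTIMATE_DAYS` numbers are my rough upper-end readings of the tiers above (the tier 3 figure in particular is a placeholder, since “longer than a week” has no upper bound), the 50% default padding is arbitrary, and `padded_deadline` is just a made-up helper name.

```python
import math
from datetime import date, timedelta

# Rough upper-end estimates in days for each tier (illustrative assumptions).
# Tier 3 is a placeholder: "longer than a week" could mean much more.
TIER_ESTIMATE_DAYS = {1: 2, 2: 7, 3: 14}


def padded_deadline(start, tier, padding=0.5):
    """Return a loose deadline: the tier estimate plus a padding fraction.

    padding=0.5 means "add 50% extra time on top of the estimate".
    """
    base_days = TIER_ESTIMATE_DAYS[tier]
    # Round up so a fractional day of padding still counts as a full day.
    total_days = math.ceil(base_days * (1 + padding))
    return start + timedelta(days=total_days)


if __name__ == "__main__":
    kickoff = date(2024, 1, 1)
    for tier in sorted(TIER_ESTIMATE_DAYS):
        print(f"Tier {tier}: aim to finish by {padded_deadline(kickoff, tier)}")
```

Swap in your own estimates and padding. The point is only that the buffer gets decided up front instead of being discovered halfway through the project.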
"OK, I’m ready to go. Are there any more tips you can give?"
Sure, there are a few that I can share that helped me get through a few of these:
• “BE A PRODUCTIVE PROCRASTINATOR”: I’ll have periods where I’m just not focused on the given task. So instead of just staring at a screen doing nothing for hours, I’ll do something else that is beneficial in other facets of my life. This can be taking a walk to regroup, doing some outside reading that is tangential to my learning, etc. As long as there is a benefit to you in some way, that’s a plus in my book.
• GAMIFY THE BORING STUFF: This is particularly notable during the data wrangling stage of any project where it’s basically just a matter of doing grunt work with reshaping the data or reformatting variables. I’ll set a personal quota to achieve and try to accomplish it for some kind of intrinsic or extrinsic reward for myself. However, DO NOT DO THIS FOR COMPLEX STUFF! Rushing it here = more time wasted to debug things.
• “PROGRESS IS PROGRESS”: It’s very easy to look at the lack of progress that you’ve made if you compare productivity on a per hour or per day basis. This will be a total bummer since there’s just too much fluctuation that can occur. However, with a long-game mindset, you can appreciate that any sort of progress (no matter how small) is meaningful and will help re-frame your thought process to be okay with some lull days.
• “BE OKAY WITH CHANGE”: Things rarely go as originally planned in life, so we should probably expect the same with these projects. So instead of being super hung up on a particular aspect that is preventing you from progressing, I suggest that you be more open to certain changes in order to get things completed instead of just having them sit there. Remember, progress and action rely on momentum. If it stops, so does everything else.
OK, that’s all there is to it. If you’ve read this far and can follow some of the advice that I’ve written down, you’ll be in good shape to build out your portfolio. Now, I know it’s easy to talk the talk, but what about walking the walk? Well, keep an eye out over the coming week, when I’ll be putting everything that I’ve outlined here into practice. So, make sure to check that out!