What hardware/setup do you use for Data Analysis projects?

Hi everyone,

I have been reading a lot of posts about people’s own data science projects, and I must say it really sparked my interest in data analysis and the stuff you can learn from your dataset!
I was wondering what hardware setup you guys use to run your analyses? More specifically, I had some ideas for the TLC yellow cab dataset and wanted to start my own little project on it, but just opening part of the dataset (approx. 500 MB) in pandas took forever…

I am sorry if this is a noob question, I am relatively new to data science and have zero background in IT etc.

Thanks in advance for your feedback,

Hi @marcus_wegmann:

I would say it depends on your dataset and the kind of algorithm you are using. If the task is computationally expensive, like training a convolutional neural network, then a decent CPU (at least 4 cores), a discrete graphics card, and a decent amount of RAM (minimum 8 GB) are advised. For simpler tasks like EDA, visualizations, or database queries, 8 GB of RAM should be sufficient. There is also the option of using cloud services: Google Colab for simple workloads, or Google Cloud Platform and AWS for more intensive workloads if you can afford to pay for them.

That being said, the kind and amount of hardware you need depends on your intentions: how long you will be coding each day, whether you need to run additional high- or low-intensity workloads, whether you require video/photo editing software for other purposes, how often you will be at home or at work, whether you prefer portability (laptop), a dedicated physical desktop or server, or convenience wherever you go (cloud), and your budget.
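Before changing hardware, it may also be worth looking at how the file is loaded. A 500 MB CSV can be slow in pandas if every column is read into memory at once. Below is a minimal sketch, not a definitive recipe, that streams the file in chunks and reads only the columns it needs. The column names `passenger_count` and `fare_amount`, the file name in the comment, and the helper `mean_fare_by_passengers` are assumptions for illustration, so check them against the header of your own file.

```python
import io

import pandas as pd

# Assumed column names for illustration; adjust to match your file's header.
USECOLS = ["passenger_count", "fare_amount"]
DTYPES = {"passenger_count": "float32"}  # smaller dtype to save memory


def mean_fare_by_passengers(csv_source, chunksize=100_000):
    """Stream a CSV in chunks and return the mean fare per passenger count."""
    partials = []
    for chunk in pd.read_csv(
        csv_source, usecols=USECOLS, dtype=DTYPES, chunksize=chunksize
    ):
        # Keep only per-chunk sums and counts, so the full file is never in memory.
        partials.append(
            chunk.groupby("passenger_count")["fare_amount"].agg(["sum", "count"])
        )
    # Combine the per-chunk partial results into overall totals.
    totals = pd.concat(partials).groupby(level=0).sum()
    return totals["sum"] / totals["count"]


# Tiny in-memory stand-in for a real path such as "yellow_tripdata.csv" (hypothetical).
sample = io.StringIO(
    "passenger_count,trip_distance,fare_amount\n"
    "1,1.0,10.0\n"
    "2,2.5,20.0\n"
    "1,3.0,30.0\n"
)
result = mean_fare_by_passengers(sample, chunksize=2)
print(result)
```

With a real file you would pass the path instead of the `io.StringIO` object. Whether this is fast enough on your machine depends on the file and the analysis, but it usually keeps memory use well below what loading everything at once would need.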

Here are some additional articles you may find useful:

Physical Workstation/Laptop


For me, I use a combination of Google Colab and Anaconda/Visual Studio Code on my Windows machine (i7, 16 GB RAM, 256 GB SSD, 1 TB HDD, Nvidia GTX 1050). I bought it mainly for intensive cyber security and forensic work, not really for data science per se, but I found that model training time is reduced compared to lower-end hardware.

Hope these help!

Hey @masterryan.prof,

thanks for your suggestions. I checked out Google Colab yesterday and I think it’s gonna be perfectly okay for a start!

Thanks again for your help,

No worries and enjoy your learning journey @marcus_wegmann!