Start with Missing Data or Exploratory Data Analysis?

Hi

When a dataset has missing values, which is more advisable: handling the missing data first, or starting with the exploratory data analysis (EDA)?

I used to start with the missing data, but I read this post, where the author starts with the EDA:
https://www.kaggle.com/pmarcelino/comprehensive-data-exploration-with-python/

Hi @arredocana:

Personally, I would do EDA on the columns first, including checking which columns have null values, before dropping any columns that contain missing data. EDA lets me evaluate whether I should drop a column with NaN values or replace those values with the mean, if that column is critical for what I am trying to accomplish in the DS project.
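To make that concrete, here is a rough sketch of the workflow in pandas. The toy DataFrame, its column names, and the 50% drop threshold are all assumptions made up for illustration; in a real project, the EDA should tell you which columns matter and what cutoff makes sense.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are invented for this example.
df = pd.DataFrame({
    "price": [200.0, 150.0, np.nan, 320.0, 275.0],
    "area": [80.0, np.nan, 95.0, 120.0, 110.0],
    "pool_quality": [np.nan, np.nan, np.nan, "Good", np.nan],
})

# Step 1 (EDA): look at the fraction of missing values in each column.
missing_ratio = df.isna().mean().sort_values(ascending=False)
print(missing_ratio)

# Step 2: drop columns that are mostly empty.
# The 0.5 threshold is an arbitrary assumption, not a rule.
DROP_THRESHOLD = 0.5
to_drop = missing_ratio[missing_ratio > DROP_THRESHOLD].index.tolist()
cleaned = df.drop(columns=to_drop)

# Step 3: for the numeric columns that were kept, fill gaps with the column mean.
numeric_cols = cleaned.select_dtypes(include="number").columns
cleaned[numeric_cols] = cleaned[numeric_cols].fillna(cleaned[numeric_cols].mean())

print(cleaned)
```

Mean imputation is only one option. Depending on what the EDA shows (skewed distributions, outliers), the median or another strategy may be a better fit.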

Hope this helps!