Prevent Data Loss while Dropping - Imputation

URL:
Losing data while dropping

In the Learn section it's written that:
The analysis we did on the previous screen indicates that there are roughly 4,500 missing values across the 10 columns. The easiest option for handling these would be to drop the rows with missing values. This would mean losing almost 10% of the total data, which is something we ideally want to avoid.

Can someone explain to me mathematically how we are losing 10% of the data?

null_total = vc_null_df[["vehicle_missing", "cause_missing"]].sum().sum()

Output:

4573

URL: correlations missing data


I think the author could have been more precise, but I also find it sufficiently reasonable to say roughly 10% of the rows have missing values:

print(round(vc_null_df.sum().sum() / mvc.shape[0], 1) * 100)
10.0
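
One nuance worth noting: this calculation divides the count of missing values by the number of rows, which only approximates the share of rows that would actually be dropped, since a single row can contain more than one missing cell. Below is a minimal, self-contained sketch of the more direct check; the DataFrame and its column names are made up here as a stand-in, since the real mvc data isn't reproduced in this thread:

import pandas as pd
import numpy as np

# Hypothetical stand-in for the crash data: 10 rows, two columns
# with a few missing values sprinkled in.
mvc = pd.DataFrame({
    "vehicle_1": ["sedan", np.nan, "suv", "truck", np.nan,
                  "sedan", "suv", "sedan", np.nan, "truck"],
    "cause_vehicle_1": ["speeding", np.nan, np.nan, "distraction", "speeding",
                        np.nan, "speeding", "distraction", "speeding", "speeding"],
})

# Total number of missing cells (what the 4,573 figure counts).
total_missing = mvc.isnull().sum().sum()

# Share of rows that contain at least one missing value,
# i.e. the fraction of rows that dropna() would remove.
rows_lost = mvc.isnull().any(axis=1).mean()

print(total_missing)              # 6 missing cells
print(round(rows_lost * 100, 1))  # 50.0 -> percent of rows dropped

How closely the two approaches agree on the real data depends on how the missing values cluster by row, but either way the lesson's "almost 10%" is meant as a rough order-of-magnitude figure rather than an exact one.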