My code:
DF = DF.fillna(houses.mode)
DF.isnull().sum()
Solution code:
num_missing = df.select_dtypes(include=['int', 'float']).isnull().sum()
fixable_numeric_cols = num_missing[(num_missing < len(df)/20) & (num_missing > 0)].sort_values()
replacement_values_dict = df[fixable_numeric_cols.index].mode().to_dict(orient='records')[0]
df = df.fillna(replacement_values_dict)
I feel as though I achieved everything in the solution code with my simpler line of code. Prior to this step, I had already removed all text columns with missing values from the read-in DF, as well as all numeric columns with more than 5% of their values missing. All that should be left, then, are the numeric columns with 5% or less of their values missing. So, in principle, I should be able to fill the missing values with their respective column modes, as in my code above.
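To make my reasoning concrete, here is a minimal sketch of what I mean by filling with column modes, run on a toy DataFrame (the column names `rooms` and `garage` are made up and are not from the real dataset). One thing I am assuming here: `DataFrame.mode()` returns a DataFrame rather than a single row, so the sketch takes the first row with `.iloc[0]` before passing it to `fillna`. That also means `mode` has to be called with parentheses, otherwise the method itself is passed instead of its result.

```python
import numpy as np
import pandas as pd

# Toy data; the column names are hypothetical, for illustration only
df = pd.DataFrame({
    "rooms": [3, 3, np.nan, 4, 3],
    "garage": [1.0, 2.0, 2.0, np.nan, 2.0],
})

# DataFrame.mode() returns a DataFrame (ties would add extra rows),
# so take the first row to get exactly one value per column
modes = df.mode().iloc[0]

# fillna with a Series fills each column using the value at its label
filled = df.fillna(modes)

# Total number of missing values left after filling
remaining = int(filled.isnull().sum().sum())
print(remaining)
```

If this is right, it would do the same job as the dictionary in the solution code, just with a Series instead of a dict.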
There is, however, a good chance that I have done something wrong or am missing something in my approach. I would appreciate any advice or suggestions, and some clarity on whether my code is sufficient.
As a side note, I was hoping someone could explain why we sort the values for "fixable_numeric_cols". I see this done a lot and am at a loss as to why these columns would need to be sorted.
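For what it's worth, a quick check on a toy DataFrame (hypothetical columns `a`, `b`, `c`, and with the 5% threshold left out because the data is so small) suggests the sort does not change the filled result, which is part of why I'm puzzled. This is only a sketch of my understanding, not a claim about why the solution does it.

```python
import numpy as np
import pandas as pd

# Toy data; column names are hypothetical
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 1.0],   # 2 missing values
    "b": [5.0, 5.0, np.nan, 6.0],      # 1 missing value
    "c": [7.0, 7.0, 7.0, 7.0],         # no missing values
})

num_missing = df.isnull().sum()
# Keep only columns that have at least one missing value
cols = num_missing[num_missing > 0]

# Fill using the columns in their original order...
unsorted_fill = df.fillna(df[cols.index].mode().iloc[0])
# ...and again using the columns ordered by missing count
sorted_fill = df.fillna(df[cols.sort_values().index].mode().iloc[0])

# fillna matches values to columns by label, not by position
same = unsorted_fill.equals(sorted_fill)
print(same)
```

So, as far as I can tell, sorting only changes the order in which the columns are listed when you print `fixable_numeric_cols`.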
All help, comments, suggestions, and honest criticism would be greatly appreciated.