How does normalizing a dataframe work under the hood?

Hello everybody,

I’m working on the Predicting Car Prices project. The solution uses this code to normalize the dataframe:

numeric_cars = (numeric_cars - numeric_cars.min()) / (numeric_cars.max() - numeric_cars.min())

Why does this work? Don’t DataFrame.min() and DataFrame.max() return the lowest and highest values in the whole dataframe?

Thank you in advance for the explanation!

Hi Nicola,

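df.min() returns a Series holding the minimum values along the specified axis of the dataframe. By default the axis is the index axis (axis=0), so the minimum is taken down the rows of each column and you get one value per column; pass axis=1 to get one value per row instead. df.max() works the same way. When you then subtract that Series from the dataframe, pandas aligns the Series labels with the column names, so every column is shifted and scaled by its own min and max rather than by one global value. Look: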

import pandas as pd

df = pd.DataFrame({'column_1': [1, 50, 9, 28, 31],
                   'column_2': [809, 0, 76, 90, 102],
                   'column_3': [6, 42, 81, 5, 10]})
print(df)
print('\n')
print(df.max())  # one maximum per column, not one for the whole dataframe

Output:

   column_1  column_2  column_3
0         1       809         6
1        50         0        42
2         9        76        81
3        28        90         5
4        31       102        10


column_1     50
column_2    809
column_3     81
dtype: int64
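
To connect this back to the normalization line from the project, here is a minimal sketch on the same toy dataframe. The names col_min, col_max and normalized are just mine for illustration; the project does the same thing in a single expression on numeric_cars.

```python
import pandas as pd

df = pd.DataFrame({'column_1': [1, 50, 9, 28, 31],
                   'column_2': [809, 0, 76, 90, 102],
                   'column_3': [6, 42, 81, 5, 10]})

# Each of these is a Series indexed by column name (one value per column).
col_min = df.min()
col_max = df.max()

# DataFrame - Series lines the Series labels up with the column names,
# so every column is rescaled with its own minimum and range.
normalized = (df - col_min) / (col_max - col_min)

print(normalized.min())  # per-column minimum after rescaling
print(normalized.max())  # per-column maximum after rescaling

# For a single value over the whole dataframe, reduce a second time.
global_min = df.min().min()
global_max = df.max().max()
print(global_min, global_max)
```

After the rescaling, every column runs from 0 to 1 independently of the others, which is what min-max normalization is meant to do.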

Merry Christmas! :christmas_tree:

Thanks! Happy holidays :partying_face:
