Screen Link: https://app.dataquest.io/m/293/data-cleaning-basics/10/dropping-missing-values
df.dropna(axis=0) would remove rows containing NaN. This behavior of axes is quite the opposite when axes are used with methods like
I have tried searching Google, but didn't find any satisfactory answer.
Here is one video, which demonstrates that the behavior of axes varies with the method being used.
Does this help clarify your doubt?
As the article said, you can use the alternative syntax to reduce your confusion.
For added clarity, one may choose to specify axis='index' (instead of axis=0) or axis='columns' (instead of axis=1).
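A quick way to check that the named and numeric forms are interchangeable is to compare their results. This is only a sketch: the frame `df` and its values are made up for illustration and are not from the lesson.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration: one NaN in column "a", row 1.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# axis=0 and axis="index" are two spellings of the same argument.
rows_dropped_int = df.dropna(axis=0)
rows_dropped_name = df.dropna(axis="index")

# Likewise for axis=1 and axis="columns".
cols_dropped_int = df.dropna(axis=1)
cols_dropped_name = df.dropna(axis="columns")

print(rows_dropped_int.equals(rows_dropped_name))  # True
print(cols_dropped_int.equals(cols_dropped_name))  # True
```

The named form changes nothing about the behavior; it only makes the call easier to read.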
Even if I use named parameters, the roles of the axes seem to be reversed, don't they?
What do you understand by the roles of the axes?
Are you referring to an x-y plot?
Consider the following dataframe:
oo = pd.DataFrame([ [1, 2, 1], [3, 4, 3], [5, 6, 5], [7, 8, 7] ])
When axis=0 is passed to oo.sum(axis=0), it works like this:
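As a minimal sketch of that call, assuming pandas is imported as pd and reusing the oo frame from above (the names col_totals and row_totals are just for illustration):

```python
import pandas as pd

oo = pd.DataFrame([[1, 2, 1], [3, 4, 3], [5, 6, 5], [7, 8, 7]])

# axis=0: add the values down each column, giving one total per column.
col_totals = oo.sum(axis=0)
print(col_totals.tolist())  # [16, 20, 16]

# axis=1: add the values across each row, giving one total per row.
row_totals = oo.sum(axis=1)
print(row_totals.tolist())  # [4, 10, 16, 22]
```

So with axis=0 the four rows are collapsed and the three column labels survive in the result.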
But when axis=0 is used with the dropna() method, it removes rows, that is:
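Since oo has no missing values, here is a small sketch on a variant of it. The frame nn and the position of its NaN are assumptions made purely for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical variant of oo with one missing value in row 1, column 2.
nn = pd.DataFrame([[1, 2, 1], [3, 4, np.nan], [5, 6, 5], [7, 8, 7]])

# axis=0: drop every row that contains a missing value.
dropped_rows = nn.dropna(axis=0)
print(dropped_rows.index.tolist())  # [0, 2, 3]
print(dropped_rows.shape)           # (3, 3)

# axis=1: drop every column that contains a missing value.
dropped_cols = nn.dropna(axis=1)
print(dropped_cols.columns.tolist())  # [0, 1]
print(dropped_cols.shape)             # (4, 2)
```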
These are two different directions. That’s what confuses me.
Hello @prateek, when it comes to removing null values, the axis parameter refers to the axis we want to drop from:
- 0, or ‘index’ : Drop rows which contain missing values.
- 1, or ‘columns’ : Drop columns which contain missing values.
Actually, it depends on the method being used. For most computations (mean, median, etc.) it means:
- Axis 0 will act on all the ROWS in each COLUMN
- Axis 1 will act on all the COLUMNS in each ROW
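One way to reconcile the two readings (this is my own framing, not something the lesson states) is that in both cases axis=0 names the row labels as the thing being affected: a computation collapses them, and dropna() removes some of them. The sketch below uses a made-up frame df with labeled rows and columns so the surviving labels are easy to see.

```python
import numpy as np
import pandas as pd

# Hypothetical frame for illustration, with explicit row and column labels.
df = pd.DataFrame(
    [[1.0, 2.0], [3.0, np.nan], [5.0, 6.0]],
    index=["r0", "r1", "r2"],
    columns=["a", "b"],
)

# Computation with axis=0: the row labels are collapsed, one value per column.
# mean() skips NaN by default.
col_means = df.mean(axis=0)
print(col_means.index.tolist())  # ['a', 'b']
print(col_means.tolist())        # [3.0, 4.0]

# Computation with axis=1: the column labels are collapsed, one value per row.
row_means = df.mean(axis=1)
print(row_means.index.tolist())  # ['r0', 'r1', 'r2']
print(row_means.tolist())        # [1.5, 3.0, 5.5]

# dropna with axis=0: row labels are again the ones affected (r1 is removed).
print(df.dropna(axis=0).index.tolist())  # ['r0', 'r2']
```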
Your reply was so immaculate <3