Why 'mode()[0]', Not Just 'mode()'?

In 7/9 of the stats module on the mode, the output of mode is: 0 6. Is it true that 0 refers to the index and 6 is the mode value?

My original code, houses['Mo Sold'].mode(), did not work.
The error says ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Only after I changed the code to the solution's mode()[0] did it work.


I did notice that someone asked the same question in our FAQ discussion forum, but the answer didn't clear up my confusion. Do we always have to use mode()[0]? I checked GeeksforGeeks and other documentation, and none of it suggested mode()[0].


Reading the documentation

import pandas as pd
mode(self, dropna=True)
    Return the mode(s) of the dataset.

    Always returns Series even if only one value is returned.

    Parameters
    ----------
    dropna : bool, default True
        Don't consider counts of NaN/NaT.

        .. versionadded:: 0.24.0

    Returns
    -------
    Series
        Modes of the Series in sorted order.
x = houses["Mo Sold"].mode()

.mode() returns a Series regardless of whether there is one mode or several.

The series for houses["Mo Sold"].mode():

\begin{array}{|c|c|} \hline - & 0 \\ \hline 0 & 6\\ \hline \end{array}

houses["Mo Sold"].mode()[0] returns the single value in the series.
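A minimal sketch of the difference, using a small made-up Series in place of houses["Mo Sold"] (the values below are hypothetical, not the real dataset):

```python
import pandas as pd

# Hypothetical stand-in for houses["Mo Sold"]; not the real dataset.
mo_sold = pd.Series([6, 6, 7, 5, 6, 3], name="Mo Sold")

modes = mo_sold.mode()       # a Series, even though there is only one mode
single_mode = modes[0]       # label 0 holds the first (here, only) mode
also_single = modes.iloc[0]  # position-based equivalent

print(type(modes).__name__)  # Series
print(single_mode)           # 6
```

Both modes[0] and modes.iloc[0] give the same scalar here because the Series returned by mode() has a default 0-based index.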


There can be multiple modes.

x = [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3]

Then there are three modes (1, 2, and 3), each with a frequency of 4.

>>> import pandas as pd

Sample Series data:

>>> s = pd.Series([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3])
>>> s
0     1
1     1
2     1
3     1
4     2
5     2
6     2
7     2
8     3
9     3
10    3
11    3
dtype: int64

Using .mode:

>>> x = s.mode()
>>> x
0    1
1    2
2    3
dtype: int64

Retrieving multiple modes:

>>> x[0]
>>> x[1]
>>> x[2]
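A short sketch of why the index matters when there are several modes. It rebuilds the same s as above; the name first_only is just for illustration:

```python
import pandas as pd

s = pd.Series([1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3])
x = s.mode()

print(len(x))      # 3, one entry per mode
print(x.tolist())  # [1, 2, 3]

# Taking x[0] alone keeps only the first mode and ignores the other two.
first_only = x[0]
```

So mode()[0] is only a safe shortcut when you know, or do not care whether, the data has a single mode.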

@alvinctk Thank you for your explanation. It does explain this question very well. But here comes a new question.
So what does the output (in my picture) of print(mode), 0 and 6, mean? Are there two modes, or is 0 the index and 6 the mode?


I am confused by the picture you showed me. Is it a Series or an array? Does the 0 at the top right mean the column index and the 0 at the bottom left mean the row index?

I did notice in the stats course that I have to be careful to differentiate a list/array from a Series, because the same operation is written differently for each, e.g. series.sum() versus sum(array).

The 0s just indicate the indexes. It is a Series.
You can see it in the Dataquest variable checker.
x = houses["Mo Sold"].mode()
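Outside the variable checker, the split between index and value can also be checked directly. This is a sketch with made-up data standing in for houses["Mo Sold"], chosen so that the single mode is 6:

```python
import pandas as pd

# Hypothetical stand-in for houses["Mo Sold"]; the only mode is 6.
x = pd.Series([6, 6, 7, 5]).mode()

print(x.index.tolist())  # [0] -> the row label
print(x.tolist())        # [6] -> the mode value
print(len(x))            # 1   -> one mode, not two
```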


Pandas follows the NumPy convention of raising an error when you try to convert something like a Series to a bool. This happens in an if statement or when using the boolean operations and, or, and not. It is not clear what the truth value of an expression like this should be:


5 == pd.Series([12,2,5,10])

The result is a Series of booleans, equal in size to the pd.Series on the right-hand side of the expression. Used directly where a single True or False is expected, such as in an if statement, it raises the error. The problem is that comparing a pd.Series with a single value produces several True and False values, as in the case above. That is ambiguous, since the condition as a whole is neither True nor False. You need to aggregate the result so that the operation yields a single boolean. For that, use either any or all, depending on whether at least one value (any) or every value (all) must satisfy the condition.

(5 == pd.Series([12,2,5,10])).all()
# False


(5 == pd.Series([12,2,5,10])).any()
# True
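To make the failure itself visible, here is a small sketch that triggers the error on purpose and then avoids it. The names comparison and message are illustrative only:

```python
import pandas as pd

comparison = 5 == pd.Series([12, 2, 5, 10])  # Series of four booleans

try:
    if comparison:  # asks for one truth value from four booleans
        pass
except ValueError as err:
    message = str(err)
    print(message)

# Reducing to a single bool first avoids the error.
if comparison.any():
    print("at least one element equals 5")
```

The same reasoning would explain the original error if the Series returned by mode() ended up in a comparison that needed a single True or False. Taking mode()[0] yields a scalar, so such a comparison produces one bool.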