Why double brackets?

Screen Link: https://app.dataquest.io/m/20/logistic-regression/7/predict-labels

Quick question: why does admissions[["gpa"]] need double brackets when the admissions["admit"] column only needs single brackets?


The difference comes down to the data type each of those two expressions returns, and what fit() expects as input.


The type of admissions[["gpa"]] is -

<class 'pandas.core.frame.DataFrame'>

It’s a DataFrame.



admissions["admit"], on the other hand, has a type of -

<class 'pandas.core.series.Series'>

This is important because of the kind of shape they have as well -


Printing admissions["admit"].shape gives (644,)



Printing admissions[["gpa"]].shape gives (644, 1)

Notice that 1 there.
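
If you want to reproduce the type and shape checks yourself, here is a minimal sketch. The four-row admissions DataFrame is made up for illustration (it is not the lesson's 644-row dataset), and the names features and labels are just placeholders, so the row count in the shapes will differ from the numbers above.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's admissions data (made-up values).
admissions = pd.DataFrame({
    "gpa": [3.2, 3.8, 2.9, 3.5],
    "admit": [0, 1, 0, 1],
})

features = admissions[["gpa"]]  # a list of column names returns a DataFrame
labels = admissions["admit"]    # a single column name returns a Series

print(isinstance(features, pd.DataFrame))  # True
print(isinstance(labels, pd.Series))       # True

print(features.shape)  # (4, 1)
print(labels.shape)    # (4,)
```

The outer pair of brackets is the indexing operation and the inner pair is an ordinary Python list, which is why the result stays two-dimensional even with one column.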

Now, here is why that is relevant. If you check the documentation for fit(), this is what it expects as input -

X: {array-like, sparse matrix} of shape (n_samples, n_features)

y: array-like of shape (n_samples,)

Notice the shapes for both. X is expected to be (n_samples, n_features). In our particular case we only have 1 feature. That’s the same 1 we saw above. If we were using more than one column as our features, it would be more than 1.
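
To see the second number in the shape track the number of feature columns, here is a small sketch. The data is made up, and the "gre" column is only assumed here as a second feature for illustration.

```python
import pandas as pd

# Hypothetical data; "gre" is an assumed second feature column.
admissions = pd.DataFrame({
    "gpa": [3.2, 3.8, 2.9, 3.5],
    "gre": [600, 720, 560, 680],
    "admit": [0, 1, 0, 1],
})

one_feature = admissions[["gpa"]]
two_features = admissions[["gpa", "gre"]]

print(one_feature.shape)   # (4, 1)
print(two_features.shape)  # (4, 2)
```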

The double brackets essentially let us select the column as a DataFrame, and a DataFrame always has a two-dimensional shape of (n_rows, n_cols), even when there is only one column.

Single brackets return a Series, which has a shape of (n_rows,). That’s what fit() expects for y, and that’s exactly what admissions["admit"] returns.
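
Putting it together, here is a rough sketch of what happens in each case. It assumes scikit-learn's LogisticRegression as the model (going by the lesson name) and uses a made-up six-row DataFrame, so treat it as an illustration rather than the lesson's exact code.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Hypothetical data with both classes present so the model can be fit.
admissions = pd.DataFrame({
    "gpa": [2.5, 2.9, 3.1, 3.4, 3.7, 3.9],
    "admit": [0, 0, 0, 1, 1, 1],
})

# 2-D X (DataFrame) and 1-D y (Series): this is what fit() expects.
model = LogisticRegression()
model.fit(admissions[["gpa"]], admissions["admit"])

# 1-D X (Series): scikit-learn raises a ValueError because X must be 2-D.
try:
    LogisticRegression().fit(admissions["gpa"], admissions["admit"])
    one_d_rejected = False
except ValueError as err:
    one_d_rejected = True
    print("fit() rejected the 1-D X:", type(err).__name__)
```

So the single-bracket version is not wrong pandas, it just hands fit() a 1-D object where a 2-D one is required.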


Ah, ok… Thanks so much for your thorough explanation! It wasn’t immediately apparent. That makes sense now.


Great question, I always had the same question in mind…

Thanks for answering this very clearly. I was tearing my hair out trying to figure out why it was necessary for the X component, but not the y component. I guess this is why we read the documentation. :sweat_smile: