The double brackets return the output as a two-dimensional object, in this case a dataframe, as opposed to a series.
So while the outputs of split_one["accommodates"] and split_one[["accommodates"]] probably look very similar, they are actually different object types!
You can verify this yourself!
When a single bracket is used to isolate that column, you get back a series object of shape (1862,). This is not a 2D object: the shape only specifies the number of rows, and there is no column dimension at all.
This is different when you use double brackets, because you’re now returning that column not as a series, but as a single-column dataframe.
You see that both dimensions of the dataframe object are now specified, because it actually is a dataframe object, and not a series, even though it might look really similar to its series counterpart.
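To make the difference concrete, here is a minimal sketch of the comparison. The small dataframe below is a hypothetical stand-in for split_one (the real one has 1862 rows), and the "price" column is an assumption added only so the frame has more than one column.

```python
import pandas as pd

# Hypothetical stand-in for split_one; values are made up for illustration.
split_one = pd.DataFrame({
    "accommodates": [2, 4, 1, 6, 3],
    "price": [80.0, 150.0, 45.0, 220.0, 110.0],  # assumed column, not from the original data
})

single = split_one["accommodates"]    # single brackets: a series
double = split_one[["accommodates"]]  # double brackets: a one-column dataframe

print(type(single).__name__, single.shape)  # Series (5,)
print(type(double).__name__, double.shape)  # DataFrame (5, 1)
```

On the real data, the same two lines would report shapes with 1862 rows instead of 5.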
As to why that matters: in any generic model.fit(X, y) call using scikit-learn, the predictor data X is expected to be 2-dimensional, with one row per sample and one column per feature!
Incidentally, this is also why the “X” in model.fit(X, y) is written in upper case: an upper-case variable is conventionally used to denote a matrix (i.e. 2D) object. The y variable is lower case because it doesn’t have to be 2-dimensional.
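As a rough illustration of the point about X, the sketch below fits a model twice: once with a double-bracket X and once with a single-bracket X. LinearRegression is just one example of a scikit-learn model, and the tiny dataframe and its "price" target are hypothetical, not taken from the original data.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for split_one; values are made up for illustration.
split_one = pd.DataFrame({
    "accommodates": [2, 4, 1, 6, 3],
    "price": [80.0, 150.0, 45.0, 220.0, 110.0],  # assumed target column
})

X = split_one[["accommodates"]]  # 2D: shape (5, 1)
y = split_one["price"]           # 1D is fine for the target: shape (5,)

model = LinearRegression()
model.fit(X, y)  # works, because X is 2-dimensional

# Passing the single-bracket series as X is rejected by scikit-learn.
try:
    LinearRegression().fit(split_one["accommodates"], y)
    raised = False
except ValueError as err:
    raised = True
    print("fit failed with a 1D X:", type(err).__name__)
```

The error message that comes with the ValueError also suggests reshaping the data, but selecting the column with double brackets avoids the problem in the first place.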