kf.split(features)

Hello, everyone!
I have a question: what does kf.split(features) do in this for loop?

for train_index, test_index in kf.split(features):
    # Training and test sets.
    X_train, X_test = features.iloc[train_index], features.iloc[test_index]
    y_train, y_test = target.iloc[train_index], target.iloc[test_index]

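kf.split(features) yields the indices of the training and test sets for each fold, one (train_index, test_index) pair per iteration of the loop. The indices are integer row positions, which is why the loop uses .iloc to select the rows.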

See the docs for KFold.split.
Here is the example from the docs:

>>> import numpy as np
>>> from sklearn.model_selection import KFold
>>> X = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
>>> y = np.array([1, 2, 3, 4])
>>> kf = KFold(n_splits=2)

>>> for train_index, test_index in kf.split(X):
...     print("TRAIN:", train_index, "TEST:", test_index)
TRAIN: [2 3] TEST: [0 1]
TRAIN: [0 1] TEST: [2 3]

In this example n_splits=2, so the data is divided into two folds and kf.split yields two pairs of index arrays. Each fold is used as the test set exactly once, while the remaining fold is used for training.
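
To connect this back to the loop in the question, here is a minimal sketch using a pandas DataFrame. The toy `features` and `target` below are made up for illustration (they are not the original poster's data), and n_splits=3 is an arbitrary choice.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import KFold

# Hypothetical toy data standing in for the question's `features` and `target`.
features = pd.DataFrame({"a": np.arange(6), "b": np.arange(6) * 10})
target = pd.Series([0, 1, 0, 1, 0, 1])

kf = KFold(n_splits=3)  # shuffle defaults to False, so folds are contiguous blocks

folds = []
for train_index, test_index in kf.split(features):
    # The indices are integer positions, so select rows with .iloc, not .loc.
    X_train, X_test = features.iloc[train_index], features.iloc[test_index]
    y_train, y_test = target.iloc[train_index], target.iloc[test_index]
    folds.append((train_index, test_index))
    print("TRAIN:", train_index, "TEST:", test_index)
# TRAIN: [2 3 4 5] TEST: [0 1]
# TRAIN: [0 1 4 5] TEST: [2 3]
# TRAIN: [0 1 2 3] TEST: [4 5]
```

Every row lands in the test set exactly once across the three iterations. Note that kf.split only looks at the number of rows in `features`; it does not read the values or the DataFrame's index labels.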
