splitting

Methods for performing train / test splits.

class hsr4hci.splitting.AlternatingSplit

Alternating split cross-validator.

Provides train / test indices to split data into train / test sets.

The split is performed in an “alternating” way. For example, with n_splits=3 the samples / data points are labeled cyclically: A B C A B C A B C … In the first split, all points labeled A or B form the training set, and the points labeled C form the test (or hold-out) set. In the second split, A and C are used for training and B is held out. In the final split, B and C are used for training and A is held out.
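The labeling scheme described above can be sketched in a few lines of NumPy. This is only an illustration of the idea, not the hsr4hci implementation: the helper name `alternating_split_indices` is hypothetical, and the order in which labels are held out (last label first) is an assumption taken from the description.

```python
import numpy as np


def alternating_split_indices(n_samples, n_splits):
    """Yield (train_idx, test_idx) pairs for an alternating split.

    Hypothetical helper for illustration; not part of hsr4hci.
    """
    indices = np.arange(n_samples)
    # Cyclic labels: 0 = A, 1 = B, 2 = C, then repeating.
    labels = indices % n_splits
    # Assumed order: hold out the last label first (C, then B, then A).
    for held_out in reversed(range(n_splits)):
        test_idx = indices[labels == held_out]
        train_idx = indices[labels != held_out]
        yield train_idx, test_idx


for train_idx, test_idx in alternating_split_indices(n_samples=9, n_splits=3):
    print("train:", train_idx, "test:", test_idx)
```

With nine samples, the first split holds out indices 2, 5 and 8 (the points labeled C) and trains on the remaining six.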

This splitting scheme is useful for HCI / ADI data because the effective field rotation is the same in all splits. By contrast, standard \(k\)-fold splitting with \(k=2\) would cut the field rotation in the training data in half.
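The effect on field rotation can be checked with a small numerical sketch. The angle values below are made up for illustration (100 frames spread evenly over 90 degrees), and the contiguous split is written out by hand rather than taken from any library.

```python
import numpy as np

# Hypothetical data: 100 frames covering 90 degrees of field rotation.
angles = np.linspace(0.0, 90.0, 100)
indices = np.arange(len(angles))


def rotation(idx):
    """Field rotation (in degrees) spanned by the frames in idx."""
    return float(angles[idx].max() - angles[idx].min())


# Alternating split with n_splits=2: train on every second frame.
alternating_train = indices[indices % 2 == 0]

# Contiguous 2-fold split: hold out the first half, train on the second.
contiguous_train = indices[len(indices) // 2:]

print(f"alternating: {rotation(alternating_train):.1f} deg")
print(f"contiguous:  {rotation(contiguous_train):.1f} deg")
```

Under these assumptions, the alternating training set spans nearly the full 90 degrees, while the contiguous training set spans only about half of it.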

Note

The syntax and usage are closely modeled on similar sklearn classes such as sklearn.model_selection.KFold.

__init__(n_splits)

Parameters: n_splits (int)
Return type: None

split(X)

Generate indices to split data into training and test sets.

Parameters:

X (ndarray) – A 2D numpy array of shape (n_samples, n_features) that contains the data to be split.

Yields:

A 2-tuple consisting of

train_idx: A 1D numpy array containing the training set indices for that split.
test_idx: A 1D numpy array containing the testing set indices for that split.

Return type:

Iterator[Tuple[ndarray, ndarray]]