
greedy_weighted_ensemble

GreedyWeightedEnsemble

Bases: AbstractValidationUtils

get_oof_per_estimator

get_oof_per_estimator(
    X: ndarray,
    y: ndarray,
    *,
    return_loss_per_estimator: bool = False,
    impute_dropped_instances: bool = True,
    _extra_processing: bool = False
) -> list[ndarray] | tuple[list[ndarray], list[float]]

Get out-of-fold (OOF) predictions for each base model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | ndarray | Training data (features). | required |
| y | ndarray | Training labels. | required |
| return_loss_per_estimator | bool | If True, also return the loss per estimator. | False |
| impute_dropped_instances | bool | If True, impute instances that were dropped during the splits (e.g., because a class has too few instances). | True |
| _extra_processing | bool | | False |

Returns:

Type: list[ndarray] | tuple[list[ndarray], list[float]]

Either only the OOF predictions, or the OOF predictions and the loss per estimator. If self.is_holdout is True, the OOF predictions can contain NaN values for instances not covered during repeated holdout.
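Because holdout-based OOF predictions may contain NaN rows, downstream code usually has to mask those rows before scoring. The following is a minimal sketch of one way to do that. It uses made-up arrays rather than actual output of get_oof_per_estimator, and `covered_mask` is a hypothetical helper that is not part of this module.

```python
import numpy as np


def covered_mask(oof_per_estimator: list[np.ndarray]) -> np.ndarray:
    """Return a boolean mask of rows that are NaN-free in every OOF array."""
    mask = np.ones(len(oof_per_estimator[0]), dtype=bool)
    for oof in oof_per_estimator:
        # Flatten trailing dimensions so 1-D (regression) and 2-D
        # (class-probability) predictions are handled the same way.
        flat = oof.reshape(len(oof), -1)
        mask &= ~np.isnan(flat).any(axis=1)
    return mask


# Made-up OOF predictions for two base models on four instances;
# the third instance was never part of a holdout split.
oof_a = np.array([[0.9, 0.1], [0.2, 0.8], [np.nan, np.nan], [0.6, 0.4]])
oof_b = np.array([[0.7, 0.3], [0.4, 0.6], [np.nan, np.nan], [0.5, 0.5]])
y = np.array([0, 1, 1, 0])

mask = covered_mask([oof_a, oof_b])
y_covered = y[mask]
oof_covered = [oof[mask] for oof in (oof_a, oof_b)]
```

Any loss computed on `y_covered` and `oof_covered` then ignores the instances that repeated holdout never covered.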

not_enough_time

not_enough_time(current_repeat: int) -> bool

Simple heuristic to stop cross-validation early if not enough time is left for another repeat.
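The exact rule is not described here. As an illustration only, one plausible heuristic of this kind extrapolates the average time per completed repeat and compares it with the remaining budget. The function name, its parameters, and the rule itself are assumptions and do not reflect the actual implementation of not_enough_time.

```python
def not_enough_time_sketch(
    elapsed_seconds: float,
    time_limit_seconds: float,
    completed_repeats: int,
) -> bool:
    """Return True if the remaining budget is unlikely to fit another repeat."""
    if completed_repeats <= 0:
        # Nothing to extrapolate from yet, so do not stop.
        return False
    # Assumption: the next repeat costs about as much as the average so far.
    average_repeat_seconds = elapsed_seconds / completed_repeats
    remaining_seconds = time_limit_seconds - elapsed_seconds
    return remaining_seconds < average_repeat_seconds
```

For example, with 90 of 100 seconds used after 3 repeats, the average repeat takes 30 seconds but only 10 remain, so this sketch returns True.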

GreedyWeightedEnsembleClassifier

Bases: GreedyWeightedEnsemble, AbstractValidationUtilsClassification

get_oof_per_estimator

get_oof_per_estimator(
    X: ndarray,
    y: ndarray,
    *,
    return_loss_per_estimator: bool = False,
    impute_dropped_instances: bool = True,
    _extra_processing: bool = False
) -> list[ndarray] | tuple[list[ndarray], list[float]]

Get out-of-fold (OOF) predictions for each base model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | ndarray | Training data (features). | required |
| y | ndarray | Training labels. | required |
| return_loss_per_estimator | bool | If True, also return the loss per estimator. | False |
| impute_dropped_instances | bool | If True, impute instances that were dropped during the splits (e.g., because a class has too few instances). | True |
| _extra_processing | bool | | False |

Returns:

Type: list[ndarray] | tuple[list[ndarray], list[float]]

Either only the OOF predictions, or the OOF predictions and the loss per estimator. If self.is_holdout is True, the OOF predictions can contain NaN values for instances not covered during repeated holdout.

not_enough_time

not_enough_time(current_repeat: int) -> bool

Simple heuristic to stop cross-validation early if not enough time is left for another repeat.

GreedyWeightedEnsembleRegressor

Bases: GreedyWeightedEnsemble, AbstractValidationUtilsRegression

get_oof_per_estimator

get_oof_per_estimator(
    X: ndarray,
    y: ndarray,
    *,
    return_loss_per_estimator: bool = False,
    impute_dropped_instances: bool = True,
    _extra_processing: bool = False
) -> list[ndarray] | tuple[list[ndarray], list[float]]

Get out-of-fold (OOF) predictions for each base model.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| X | ndarray | Training data (features). | required |
| y | ndarray | Training labels. | required |
| return_loss_per_estimator | bool | If True, also return the loss per estimator. | False |
| impute_dropped_instances | bool | If True, impute instances that were dropped during the splits (e.g., because a class has too few instances). | True |
| _extra_processing | bool | | False |

Returns:

Type: list[ndarray] | tuple[list[ndarray], list[float]]

Either only the OOF predictions, or the OOF predictions and the loss per estimator. If self.is_holdout is True, the OOF predictions can contain NaN values for instances not covered during repeated holdout.

not_enough_time

not_enough_time(current_repeat: int) -> bool

Simple heuristic to stop cross-validation early if not enough time is left for another repeat.

caruana_weighted

caruana_weighted(
    predictions: list[ndarray],
    labels: ndarray,
    seed,
    n_iterations,
    loss_function,
)

Caruana's ensemble selection with replacement.
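For orientation, greedy ensemble selection with replacement repeatedly adds the model that most reduces the loss of the averaged ensemble. A model may be picked several times, and its final weight is its share of the picks. The sketch below illustrates that general technique under stated assumptions and is not the code behind caruana_weighted. The function name is hypothetical, `loss_function` is assumed to take `(labels, predictions)` and return a float where lower is better, and the `seed` argument is not modeled, so ties simply go to the lowest index.

```python
import numpy as np


def greedy_selection_with_replacement(predictions, labels, n_iterations, loss_function):
    """Greedily build ensemble weights, allowing models to be picked repeatedly."""
    counts = np.zeros(len(predictions), dtype=int)
    running_sum = np.zeros_like(predictions[0], dtype=float)
    for step in range(1, n_iterations + 1):
        # Loss of the averaged ensemble if each candidate were added next.
        losses = [
            loss_function(labels, (running_sum + p) / step) for p in predictions
        ]
        best = int(np.argmin(losses))  # ties resolve to the lowest index
        counts[best] += 1
        running_sum += predictions[best]
    # A model's weight is the fraction of iterations in which it was picked.
    return counts / n_iterations


def mse(y_true, y_pred):
    """Mean squared error, used here as an example loss."""
    return float(np.mean((y_true - y_pred) ** 2))


# Made-up regression targets and three base-model predictions.
labels = np.array([1.0, 2.0, 3.0])
predictions = [labels + 1.0, labels - 2.0, labels + 5.0]

weights = greedy_selection_with_replacement(
    predictions, labels, n_iterations=3, loss_function=mse
)
```

In this toy case the first model is picked twice and the second once, so their opposing biases cancel in the weighted average, while the third model is never selected.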