greedy_weighted_ensemble ¶
GreedyWeightedEnsemble ¶
Bases: AbstractValidationUtils
get_oof_per_estimator ¶
get_oof_per_estimator(
X: ndarray,
y: ndarray,
*,
return_loss_per_estimator: bool = False,
impute_dropped_instances: bool = True,
_extra_processing: bool = False
) -> list[ndarray] | tuple[list[ndarray], list[float]]
Get out-of-fold (OOF) predictions for each base model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Training data (features). | required |
y | ndarray | Training labels. | required |
return_loss_per_estimator | bool | If True, also return the loss per estimator. | False |
impute_dropped_instances | bool | If True, impute instances that were dropped during the splits (e.g., due to not enough instances per class). | True |
_extra_processing | bool | | False |
Returns:
Type | Description |
---|---|
list[ndarray] \| tuple[list[ndarray], list[float]] | Either only the OOF predictions, or the OOF predictions and the loss per estimator. If self.is_holdout is True, the OOF predictions can contain NaN values for instances not covered during repeated holdout. |
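To make the idea of per-estimator OOF predictions concrete, here is a minimal, dependency-free sketch. It is not this library's implementation: `oof_per_estimator`, `fit_mean`, `fit_last`, and the interleaved fold scheme are hypothetical names and choices made for illustration, and plain lists stand in for ndarrays.

```python
from statistics import fmean


def oof_per_estimator(fit_fns, X, y, n_folds=3, return_loss=False):
    """Hypothetical helper: one list of out-of-fold predictions per estimator."""
    n = len(X)
    # Assumption: simple interleaved folds; the real splitter may differ.
    folds = [list(range(i, n, n_folds)) for i in range(n_folds)]
    oof = [[float("nan")] * n for _ in fit_fns]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        for e, fit in enumerate(fit_fns):
            # Fit on the training part only, then predict the held-out part.
            predict = fit([X[i] for i in train], [y[i] for i in train])
            for i in held_out:
                oof[e][i] = predict(X[i])
    if not return_loss:
        return oof
    # Mean squared error per estimator, computed on the OOF predictions.
    losses = [fmean((p - t) ** 2 for p, t in zip(preds, y)) for preds in oof]
    return oof, losses


def fit_mean(X_train, y_train):
    """Toy base model: always predicts the training mean."""
    m = fmean(y_train)
    return lambda x: m


def fit_last(X_train, y_train):
    """Toy base model: always predicts the last training label."""
    last = y_train[-1]
    return lambda x: last


X = [[0], [1], [2], [3], [4], [5]]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
oof, losses = oof_per_estimator([fit_mean, fit_last], X, y, return_loss=True)
```

Each prediction in `oof` comes from a model that never saw the corresponding instance during fitting, which is what makes the per-estimator losses usable for ensemble weighting.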
not_enough_time ¶
Simple heuristic to stop cross-validation early if not enough time is left for another repeat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_repeat | int | The current repeat index. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if there likely isn't enough time for another repeat, False otherwise. |
Note
This is a heuristic based on average time per repeat so far and may not be exact.
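The heuristic can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's code: the standalone function takes `elapsed` and `time_limit` (in seconds) as explicit, hypothetical arguments, and it assumes `current_repeat` is a 0-based index of the repeat that has just finished.

```python
def not_enough_time(current_repeat, elapsed, time_limit):
    """Return True if another repeat likely does not fit in the remaining time.

    Assumptions (hypothetical): `current_repeat` is 0-based and refers to the
    repeat that just completed; `elapsed` and `time_limit` are in seconds.
    """
    if time_limit is None:
        return False  # no limit set, so there is always "enough" time
    avg_per_repeat = elapsed / (current_repeat + 1)
    remaining = time_limit - elapsed
    return remaining < avg_per_repeat
```

Because the estimate uses the average duration of past repeats, a single unusually slow or fast repeat can make the prediction miss in either direction.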
set_time_limit ¶
Initialize the timer for time-limited execution.
Sets the start time for time limit tracking and logs the time limit info. This method should be called at the beginning of validation.
time_limit_reached ¶
Check if the time limit for execution has been reached.
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if the time limit has been reached, False otherwise or if no time limit was set. |
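The interplay of set_time_limit and time_limit_reached can be sketched with a small stand-in class. `TimeLimitTracker` and its attributes are hypothetical names for illustration; the actual class may store and log this information differently.

```python
import logging
import time

logger = logging.getLogger(__name__)


class TimeLimitTracker:
    """Hypothetical stand-in for the time-limit helpers described above."""

    def __init__(self, time_limit=None):
        self.time_limit = time_limit  # seconds, or None for "no limit"
        self._start_time = None

    def set_time_limit(self):
        # Record the start time; call this at the beginning of validation.
        self._start_time = time.monotonic()
        logger.info("Time limit: %s", self.time_limit)

    def time_limit_reached(self):
        # False if no limit was set (or the timer was never started).
        if self.time_limit is None or self._start_time is None:
            return False
        return time.monotonic() - self._start_time >= self.time_limit


tracker = TimeLimitTracker(time_limit=3600)
tracker.set_time_limit()
if not tracker.time_limit_reached():
    pass  # run the next validation repeat here
```

The sketch uses `time.monotonic()` so that system clock adjustments cannot make the elapsed time jump.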
GreedyWeightedEnsembleClassifier ¶
Bases: GreedyWeightedEnsemble, AbstractValidationUtilsClassification
get_oof_per_estimator ¶
get_oof_per_estimator(
X: ndarray,
y: ndarray,
*,
return_loss_per_estimator: bool = False,
impute_dropped_instances: bool = True,
_extra_processing: bool = False
) -> list[ndarray] | tuple[list[ndarray], list[float]]
Get OOF predictions for each base model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Training data (features). | required |
y | ndarray | Training labels. | required |
return_loss_per_estimator | bool | If True, also return the loss per estimator. | False |
impute_dropped_instances | bool | If True, impute instances that were dropped during the splits (e.g., due to not enough instances per class). | True |
_extra_processing | bool | | False |
Returns:
Type | Description |
---|---|
list[ndarray] \| tuple[list[ndarray], list[float]] | Either only the OOF predictions, or the OOF predictions and the loss per estimator. If self.is_holdout is True, the OOF predictions can contain NaN values for instances not covered during repeated holdout. |
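Because OOF predictions may contain NaN under repeated holdout, downstream scoring usually has to skip uncovered instances. The sketch below shows one way to do that for classification. It assumes, hypothetically, that each OOF entry is a row of per-class probabilities and that labels are integer class indices; `masked_log_loss` is an illustrative helper, not part of this module.

```python
import math


def masked_log_loss(oof_proba, y, eps=1e-15):
    """Log loss over covered instances only; rows containing NaN are skipped."""
    total, count = 0.0, 0
    for row, label in zip(oof_proba, y):
        if any(math.isnan(p) for p in row):
            continue  # instance not covered by any holdout split
        # Clip to avoid log(0) for overconfident wrong predictions.
        total += -math.log(max(row[label], eps))
        count += 1
    return total / count if count else math.nan


oof_proba = [[0.5, 0.5], [math.nan, math.nan], [0.25, 0.75]]
y = [0, 1, 1]
loss = masked_log_loss(oof_proba, y)  # averages over the two covered rows
```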
not_enough_time ¶
Simple heuristic to stop cross-validation early if not enough time is left for another repeat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_repeat | int | The current repeat index. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if there likely isn't enough time for another repeat, False otherwise. |
Note
This is a heuristic based on average time per repeat so far and may not be exact.
set_time_limit ¶
Initialize the timer for time-limited execution.
Sets the start time for time limit tracking and logs the time limit info. This method should be called at the beginning of validation.
time_limit_reached ¶
Check if the time limit for execution has been reached.
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if the time limit has been reached, False otherwise or if no time limit was set. |
GreedyWeightedEnsembleRegressor ¶
Bases: GreedyWeightedEnsemble, AbstractValidationUtilsRegression
get_oof_per_estimator ¶
get_oof_per_estimator(
X: ndarray,
y: ndarray,
*,
return_loss_per_estimator: bool = False,
impute_dropped_instances: bool = True,
_extra_processing: bool = False
) -> list[ndarray] | tuple[list[ndarray], list[float]]
Get OOF predictions for each base model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
X | ndarray | Training data (features). | required |
y | ndarray | Training labels. | required |
return_loss_per_estimator | bool | If True, also return the loss per estimator. | False |
impute_dropped_instances | bool | If True, impute instances that were dropped during the splits (e.g., due to not enough instances per class). | True |
_extra_processing | bool | | False |
Returns:
Type | Description |
---|---|
list[ndarray] \| tuple[list[ndarray], list[float]] | Either only the OOF predictions, or the OOF predictions and the loss per estimator. If self.is_holdout is True, the OOF predictions can contain NaN values for instances not covered during repeated holdout. |
not_enough_time ¶
Simple heuristic to stop cross-validation early if not enough time is left for another repeat.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
current_repeat | int | The current repeat index. | required |
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if there likely isn't enough time for another repeat, False otherwise. |
Note
This is a heuristic based on average time per repeat so far and may not be exact.
set_time_limit ¶
Initialize the timer for time-limited execution.
Sets the start time for time limit tracking and logs the time limit info. This method should be called at the beginning of validation.
time_limit_reached ¶
Check if the time limit for execution has been reached.
Returns:
Name | Type | Description |
---|---|---|
bool | bool | True if the time limit has been reached, False otherwise or if no time limit was set. |
caruana_weighted ¶
Caruana's ensemble selection with replacement.
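As a rough illustration of the general technique, the sketch below greedily adds, at each step, the base model whose inclusion gives the lowest loss for the averaged ensemble prediction; a model may be picked more than once, and its final weight is its share of the picks. The function name `caruana_weights`, its parameters, and the use of mean squared error are assumptions for this example and may not match this function's actual signature or loss.

```python
from collections import Counter


def mse(pred, y):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, y)) / len(y)


def caruana_weights(oof_preds, y, n_iterations=10, loss=mse):
    """Greedy ensemble selection with replacement (illustrative sketch).

    oof_preds: one list of OOF predictions per base model.
    Returns one weight per base model; the weights sum to 1.
    """
    running_sum = [0.0] * len(y)  # sum of predictions of all picks so far
    picks = Counter()
    for step in range(1, n_iterations + 1):
        best_idx, best_loss = None, float("inf")
        for idx, preds in enumerate(oof_preds):
            # Ensemble prediction if this model were added as the next pick.
            candidate = [(s + p) / step for s, p in zip(running_sum, preds)]
            cand_loss = loss(candidate, y)
            if cand_loss < best_loss:
                best_idx, best_loss = idx, cand_loss
        picks[best_idx] += 1
        running_sum = [s + p for s, p in zip(running_sum, oof_preds[best_idx])]
    return [picks[i] / n_iterations for i in range(len(oof_preds))]


y = [1.0, 2.0, 3.0]
oof_preds = [
    [0.0, 1.0, 2.0],  # consistently one too low
    [2.0, 3.0, 4.0],  # consistently one too high
    [5.0, 5.0, 5.0],  # poor constant model
]
weights = caruana_weights(oof_preds, y, n_iterations=4)
```

In this toy case the two biased models cancel each other out, so they share the weight equally and the constant model is never selected.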