TabPFNRegressor

Bases: BaseEstimator, RegressorMixin, TabPFNModelSelection

__init__

__init__(
    model: str = "default",
    n_estimators: int = 8,
    preprocess_transforms: Tuple[
        PreprocessorConfig, ...
    ] = (
        PreprocessorConfig(
            "quantile_uni",
            append_original=True,
            categorical_name="ordinal_very_common_categories_shuffled",
            global_transformer_name="svd",
        ),
        PreprocessorConfig(
            "safepower", categorical_name="onehot"
        ),
    ),
    feature_shift_decoder: str = "shuffle",
    normalize_with_test: bool = False,
    average_logits: bool = False,
    optimize_metric: Literal[
        "mse",
        "rmse",
        "mae",
        "r2",
        "mean",
        "median",
        "mode",
        "exact_match",
        None,
    ] = "rmse",
    transformer_predict_kwargs: Optional[Dict] = None,
    softmax_temperature: Optional[float] = -0.1,
    use_poly_features=False,
    max_poly_features=50,
    remove_outliers=-1,
    regression_y_preprocess_transforms: Optional[
        Tuple[
            None
            | Literal[
                "safepower", "power", "quantile_norm"
            ],
            ...,
        ]
    ] = (None, "safepower"),
    add_fingerprint_features: bool = True,
    cancel_nan_borders: bool = True,
    super_bar_dist_averaging: bool = False,
    subsample_samples: float = -1,
)
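
A minimal construction sketch (not part of the generated reference). The import path is an assumption and may differ between installations; all constructor arguments shown above are optional keyword arguments.

from tabpfn import TabPFNRegressor  # import path is an assumption; adjust to your installation

reg = TabPFNRegressor()                      # all arguments at their defaults, e.g. n_estimators=8
reg_fast = TabPFNRegressor(n_estimators=4)   # fewer ensemble configurations: faster, usually less accurate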

Parameters:

Name Type Description Default
model str

The model string is the path to the model.

'default'
n_estimators int

The number of ensemble configurations to use; this is the most important setting.

8
preprocess_transforms Tuple[PreprocessorConfig, ...]

A tuple specifying the preprocessing steps to use. Elements follow the pattern '(none|power|quantile_norm|quantile_uni|quantile_uni_coarse|robust...)[_all][_and_none]': the first part names the preprocessing step (see .preprocessing.ReshapeFeatureDistributionsStep.get_all_preprocessors()), the optional '_all' suffix specifies which features it is applied to, and '_and_none' adds the original, untransformed features back alongside the transformed ones. Any string without '_all' can additionally be combined with '_onehot' to one-hot encode the categorical features specified with self.fit(..., categorical_features=...). See the configuration sketch after this parameter table.

(PreprocessorConfig('quantile_uni', append_original=True, categorical_name='ordinal_very_common_categories_shuffled', global_transformer_name='svd'), PreprocessorConfig('safepower', categorical_name='onehot'))
feature_shift_decoder str

["shuffle", "none", "local_shuffle", "rotate", "auto_rotate"] Whether to shift features for each ensemble configuration.

'shuffle'
normalize_with_test bool

If True, the test set is used to normalize the data; otherwise, only the training set is used.

False
average_logits bool

Whether to average the logits (True) or the probabilities (False) of the ensemble members.

False
optimize_metric Literal['mse', 'rmse', 'mae', 'r2', 'mean', 'median', 'mode', 'exact_match', None]

The optimization metric to use.

'rmse'
transformer_predict_kwargs Optional[Dict]

Additional keyword arguments to pass to the transformer predict method.

None
softmax_temperature Optional[float]

A log-spaced temperature; it is applied as logits <- logits / exp(softmax_temperature).

-0.1
use_poly_features

Whether to use polynomial features as the last preprocessing step.

False
max_poly_features

Maximum number of polynomial features to use, None means unlimited.

50
remove_outliers

If not 0.0, outliers are removed from the input features: values that deviate by more than remove_outliers standard deviations are removed.

-1
regression_y_preprocess_transforms Optional[Tuple[None | Literal['safepower', 'power', 'quantile_norm'], ...]]

Preprocessing transforms for the target variable. This can be one of the transforms from .preprocessing.ReshapeFeatureDistributionsStep.get_all_preprocessors(), e.g. "power". It can also be None to leave the targets untransformed, apart from a simple mean/variance normalization.

(None, 'safepower')
add_fingerprint_features bool

If True, one feature of random values is added to the input features. This helps the transformer model discern duplicated samples.

True
cancel_nan_borders bool

Whether to ignore buckets that are transformed to NaN values when inverting a regression_y_preprocess_transform. This should be left at True; only set it to False if you know what you are doing.

True
super_bar_dist_averaging bool

If regression_y_preprocess_transforms is used, the predictions of the different configurations need to be averaged, but each configuration comes with its own bar_distribution (Riemann distribution). By default, all bar distributions are aggregated using borders that are simply scaled by the mean and std of the target variable. If this is set to True, a new bar distribution is instead built from all the borders generated by the different configurations.

False
subsample_samples float

If not None, a random subset of the samples is used for training in each ensemble configuration. A value of 1 or above subsamples to that number of samples; a value between 0 and 1 is treated as a fraction of the training set size.

-1
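
A hedged configuration sketch tying several of the parameters above together. The import path of PreprocessorConfig is an assumption; its constructor arguments are taken from the default value shown for preprocess_transforms.

from tabpfn import TabPFNRegressor, PreprocessorConfig  # import paths are assumptions

reg = TabPFNRegressor(
    n_estimators=16,               # more ensemble configurations than the default 8
    optimize_metric="mae",         # point predictions optimized for mean absolute error
    softmax_temperature=-0.1,      # log-spaced: logits <- logits / exp(-0.1)
    subsample_samples=0.9,         # each configuration trains on a random 90% of the samples
    preprocess_transforms=(
        PreprocessorConfig(
            "quantile_uni",
            append_original=True,
            categorical_name="ordinal_very_common_categories_shuffled",
            global_transformer_name="svd",
        ),
        PreprocessorConfig("safepower", categorical_name="onehot"),
    ),
)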

fit

fit(X, y)

predict

predict(X)
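
A short end-to-end sketch of fit and predict, assuming the scikit-learn estimator interface suggested by the base classes above; the dataset and import path are only illustrations.

from sklearn.datasets import load_diabetes
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

from tabpfn import TabPFNRegressor  # import path is an assumption

X, y = load_diabetes(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = TabPFNRegressor(n_estimators=8)
reg.fit(X_train, y_train)        # fit(X, y)
y_pred = reg.predict(X_test)     # predict(X) returns point predictions
print(mean_squared_error(y_test, y_pred))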

predict_full

predict_full(X)
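
A hedged sketch of predict_full, continuing the fit/predict example above. The exact structure of the returned object is an assumption here (a dict-like container of predictive-distribution summaries, plausibly matching optimize_metric options such as "mean" and "median"); inspect the returned keys before relying on them.

full = reg.predict_full(X_test)          # return structure is an assumption
if isinstance(full, dict):               # assumed dict-like return value
    print(sorted(full.keys()))           # e.g. distribution summaries (assumption)
    y_median = full.get("median")        # hypothetical key, named after an optimize_metric option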