TabPFNClassifier

Bases: BaseEstimator, ClassifierMixin, TabPFNModelSelection

__init__

__init__(
    model="default",
    n_estimators: int = 4,
    preprocess_transforms: Tuple[
        PreprocessorConfig, ...
    ] = (
        PreprocessorConfig(
            "quantile_uni_coarse",
            append_original=True,
            categorical_name="ordinal_very_common_categories_shuffled",
            global_transformer_name="svd",
            subsample_features=-1,
        ),
        PreprocessorConfig(
            "none",
            categorical_name="numeric",
            subsample_features=-1,
        ),
    ),
    feature_shift_decoder: str = "shuffle",
    normalize_with_test: bool = False,
    average_logits: bool = False,
    optimize_metric: Literal[
        "auroc",
        "roc",
        "auroc_ovo",
        "balanced_acc",
        "acc",
        "log_loss",
        None,
    ] = "roc",
    transformer_predict_kwargs: Optional[dict] = None,
    multiclass_decoder="shuffle",
    softmax_temperature: Optional[float] = -0.1,
    use_poly_features=False,
    max_poly_features=50,
    remove_outliers=12.0,
    add_fingerprint_features=True,
    subsample_samples=-1,
)
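Several of the numeric arguments in the signature above are overloaded by value range. As one illustration, `subsample_samples` can disable subsampling (`-1`), give an absolute sample count (1 or above), or give a fraction of the training set (between 0 and 1). A minimal sketch of that interpretation (the helper name is my own, not part of the TabPFN API):

```python
import math

def resolve_subsample(subsample_samples: float, n_train: int) -> int:
    """Hypothetical helper sketching the documented subsample_samples
    semantics: -1 disables subsampling, values >= 1 are an absolute
    sample count, values in (0, 1) are a fraction of the training set."""
    if subsample_samples == -1:
        return n_train  # subsampling disabled: use all training samples
    if subsample_samples >= 1:
        return min(int(subsample_samples), n_train)  # absolute count
    if 0 < subsample_samples < 1:
        return max(1, math.floor(subsample_samples * n_train))  # fraction
    raise ValueError(f"invalid subsample_samples: {subsample_samples!r}")
```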

Parameters:

- `model`: The model string is the path to the model. Default: `'default'`.
- `n_estimators` (`int`): The number of ensemble configurations to use; this is the most important setting. Default: `4`.
- `preprocess_transforms` (`Tuple[PreprocessorConfig, ...]`): A tuple specifying the preprocessing steps to use. Elements may be given as strings of the form `'(none|power|quantile|robust)[_all][_and_none]'`, where the first part names the preprocessing step, `_all` selects the features to apply it to, and `_and_none` adds the original, untransformed features back alongside. Any string without `_all` can additionally be combined with `_onehot` to one-hot encode the categorical features specified with `self.fit(..., categorical_features=...)`. Default: `(PreprocessorConfig('quantile_uni_coarse', append_original=True, categorical_name='ordinal_very_common_categories_shuffled', global_transformer_name='svd', subsample_features=-1), PreprocessorConfig('none', categorical_name='numeric', subsample_features=-1))`.
- `feature_shift_decoder` (`str`): How to shift features for each ensemble configuration. One of `"shuffle"`, `"none"`, `"local_shuffle"`, `"rotate"`, `"auto_rotate"`. Default: `'shuffle'`.
- `normalize_with_test` (`bool`): If True, the test set is also used to normalize the data; otherwise only the training set is used. Default: `False`.
- `average_logits` (`bool`): Whether to average logits (True) or probabilities (False) across ensemble members. Default: `False`.
- `optimize_metric` (`Literal['auroc', 'roc', 'auroc_ovo', 'balanced_acc', 'acc', 'log_loss', None]`): The optimization metric to use. Default: `'roc'`.
- `transformer_predict_kwargs` (`Optional[dict]`): Additional keyword arguments to pass to the transformer's predict method. Default: `None`.
- `multiclass_decoder`: The multiclass decoder to use. Default: `'shuffle'`.
- `softmax_temperature` (`Optional[float]`): A log-spaced temperature, applied as `logits <- logits / exp(softmax_temperature)`. Default: `-0.1`.
- `use_poly_features`: Whether to use polynomial features as the last preprocessing step. Default: `False`.
- `max_poly_features`: Maximum number of polynomial features to use. Default: `50`.
- `remove_outliers`: If not 0.0, removes outliers from the input features: values lying more than `remove_outliers` standard deviations from the mean are removed. Default: `12.0`.
- `add_fingerprint_features`: If True, adds one feature of random values to the input features. This helps the transformer model discern duplicated samples. Default: `True`.
- `subsample_samples`: If not `-1`, uses a random subset of the samples for training in each ensemble configuration. Values of 1 or above are treated as an absolute number of samples; values between 0 and 1 are treated as a fraction of the training set size. Default: `-1`.
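The `softmax_temperature` parameter above is applied to the logits before the softmax, as `logits <- logits / exp(softmax_temperature)`. A self-contained numpy sketch of that transformation (my own illustration, not the library's internals):

```python
import numpy as np

def temperature_scaled_softmax(logits: np.ndarray,
                               softmax_temperature: float) -> np.ndarray:
    """Apply the documented log-spaced temperature, then softmax.

    A negative log-temperature such as the default -0.1 divides by
    exp(-0.1) < 1, so it *sharpens* the predicted distribution;
    positive values flatten it.
    """
    scaled = logits / np.exp(softmax_temperature)
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)
```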

fit

fit(X, y)

Fit the classifier on training features X and labels y.

predict

predict(X)

Predict class labels for the samples in X.

predict_proba

predict_proba(X)

Predict class probabilities for the samples in X.
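The probabilities returned by predict_proba are aggregated over the ensemble members, and the `average_logits` constructor argument selects how: either the members' logits are averaged and softmaxed once, or each member is softmaxed and the probabilities are averaged. A numpy sketch of the two modes (my own illustration under that documented description, not the library's internals):

```python
import numpy as np

def aggregate_ensemble(member_logits: np.ndarray,
                       average_logits: bool) -> np.ndarray:
    """Combine per-member logits of shape (n_members, n_classes).

    average_logits=True:  mean of logits, then one softmax.
    average_logits=False: softmax per member, then mean of probabilities.
    """
    def softmax(x):
        x = x - x.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(x)
        return e / e.sum(axis=-1, keepdims=True)

    if average_logits:
        return softmax(member_logits.mean(axis=0))
    return softmax(member_logits).mean(axis=0)
```

The two modes generally give different results: averaging logits behaves like a geometric mean of the member distributions, while averaging probabilities is an arithmetic mean.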