TabPFNClassifier

Bases: BaseEstimator, ClassifierMixin, TabPFNModelSelection

__init__

__init__(
    model="default",
    n_estimators: int = 4,
    preprocess_transforms: Tuple[
        PreprocessorConfig, ...
    ] = (
        PreprocessorConfig(
            "quantile_uni_coarse",
            append_original=True,
            categorical_name="ordinal_very_common_categories_shuffled",
            global_transformer_name="svd",
            subsample_features=-1,
        ),
        PreprocessorConfig(
            "none",
            categorical_name="numeric",
            subsample_features=-1,
        ),
    ),
    feature_shift_decoder: str = "shuffle",
    normalize_with_test: bool = False,
    average_logits: bool = False,
    optimize_metric: Literal[
        "auroc",
        "roc",
        "auroc_ovo",
        "balanced_acc",
        "acc",
        "log_loss",
        None,
    ] = "roc",
    transformer_predict_kwargs: Optional[dict] = None,
    multiclass_decoder="shuffle",
    softmax_temperature: Optional[float] = -0.1,
    use_poly_features=False,
    max_poly_features=50,
    remove_outliers=12.0,
    add_fingerprint_features=True,
    subsample_samples=-1,
)
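Several of the numeric arguments in the signature above are overloaded by value range. As one illustration, `subsample_samples` can disable subsampling (`-1`), give an absolute sample count (1 or above), or give a fraction of the training set (between 0 and 1). A minimal sketch of that interpretation (the helper name is my own, not part of the TabPFN API):

```python
import math

def resolve_subsample(subsample_samples: float, n_train: int) -> int:
    """Hypothetical helper sketching the documented subsample_samples
    semantics: -1 disables subsampling, values >= 1 are an absolute
    sample count, values in (0, 1) are a fraction of the training set."""
    if subsample_samples == -1:
        return n_train  # subsampling disabled: use all training samples
    if subsample_samples >= 1:
        return min(int(subsample_samples), n_train)  # absolute count
    if 0 < subsample_samples < 1:
        return max(1, math.floor(subsample_samples * n_train))  # fraction
    raise ValueError(f"invalid subsample_samples: {subsample_samples!r}")
```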

Parameters:

- `model`: The model string is the path to the model. Default: `'default'`.
- `n_estimators` (`int`): The number of ensemble configurations to use; this is the most important setting. Default: `4`.
- `preprocess_transforms` (`Tuple[PreprocessorConfig, ...]`): A tuple specifying the preprocessing steps to use. Elements may be given as strings of the form `'(none|power|quantile|robust)[_all][_and_none]'`, where the first part names the preprocessing step, `_all` selects the features to apply it to, and `_and_none` adds the original, untransformed features back alongside. Any string without `_all` can additionally be combined with `_onehot` to one-hot encode the categorical features specified with `self.fit(..., categorical_features=...)`. Default: `(PreprocessorConfig('quantile_uni_coarse', append_original=True, categorical_name='ordinal_very_common_categories_shuffled', global_transformer_name='svd', subsample_features=-1), PreprocessorConfig('none', categorical_name='numeric', subsample_features=-1))`.
- `feature_shift_decoder` (`str`): How to shift features for each ensemble configuration. One of `"shuffle"`, `"none"`, `"local_shuffle"`, `"rotate"`, `"auto_rotate"`. Default: `'shuffle'`.
- `normalize_with_test` (`bool`): If True, the test set is also used to normalize the data; otherwise only the training set is used. Default: `False`.
- `average_logits` (`bool`): Whether to average logits (True) or probabilities (False) across ensemble members. Default: `False`.
- `optimize_metric` (`Literal['auroc', 'roc', 'auroc_ovo', 'balanced_acc', 'acc', 'log_loss', None]`): The optimization metric to use. Default: `'roc'`.
- `transformer_predict_kwargs` (`Optional[dict]`): Additional keyword arguments to pass to the transformer's predict method. Default: `None`.
- `multiclass_decoder`: The multiclass decoder to use. Default: `'shuffle'`.
- `softmax_temperature` (`Optional[float]`): A log-spaced temperature, applied as `logits <- logits / exp(softmax_temperature)`. Default: `-0.1`.
- `use_poly_features`: Whether to use polynomial features as the last preprocessing step. Default: `False`.
- `max_poly_features`: Maximum number of polynomial features to use. Default: `50`.
- `remove_outliers`: If not 0.0, removes outliers from the input features: values lying more than `remove_outliers` standard deviations from the mean are removed. Default: `12.0`.
- `add_fingerprint_features`: If True, adds one feature of random values to the input features. This helps the transformer model discern duplicated samples. Default: `True`.
- `subsample_samples`: If not `-1`, uses a random subset of the samples for training in each ensemble configuration. Values of 1 or above are treated as an absolute number of samples; values between 0 and 1 are treated as a fraction of the training set size. Default: `-1`.
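The `softmax_temperature` parameter above is applied to the logits before the softmax, as `logits <- logits / exp(softmax_temperature)`. A self-contained numpy sketch of that transformation (my own illustration, not the library's internals):

```python
import numpy as np

def temperature_scaled_softmax(logits: np.ndarray,
                               softmax_temperature: float) -> np.ndarray:
    """Apply the documented log-spaced temperature, then softmax.

    A negative log-temperature such as the default -0.1 divides by
    exp(-0.1) < 1, so it *sharpens* the predicted distribution;
    positive values flatten it.
    """
    scaled = logits / np.exp(softmax_temperature)
    scaled = scaled - scaled.max(axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(scaled)
    return exp / exp.sum(axis=-1, keepdims=True)
```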

fit

fit(X, y)

Fit the classifier on training features X and labels y.

predict

predict(X)

Predict class labels for the samples in X.

predict_proba

predict_proba(X)

Predict class probabilities for the samples in X.
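The probabilities returned by predict_proba are aggregated over the ensemble members, and the `average_logits` constructor argument selects how: either the members' logits are averaged and softmaxed once, or each member is softmaxed and the probabilities are averaged. A numpy sketch of the two modes (my own illustration under that documented description, not the library's internals):

```python
import numpy as np

def aggregate_ensemble(member_logits: np.ndarray,
                       average_logits: bool) -> np.ndarray:
    """Combine per-member logits of shape (n_members, n_classes).

    average_logits=True:  mean of logits, then one softmax.
    average_logits=False: softmax per member, then mean of probabilities.
    """
    def softmax(x):
        x = x - x.max(axis=-1, keepdims=True)  # numerical stability
        e = np.exp(x)
        return e / e.sum(axis=-1, keepdims=True)

    if average_logits:
        return softmax(member_logits.mean(axis=0))
    return softmax(member_logits).mean(axis=0)
```

The two modes generally give different results: averaging logits behaves like a geometric mean of the member distributions, while averaging probabilities is an arithmetic mean.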