
bar_distribution

BarDistribution

Bases: Module

average_bar_distributions_into_this

average_bar_distributions_into_this(
    list_of_bar_distributions: Sequence[BarDistribution],
    list_of_logits: Sequence[Tensor],
    *,
    average_logits: bool = False
) -> Tensor

:param list_of_bar_distributions: :param list_of_logits: :param average_logits: :return:

cdf

cdf(logits: Tensor, ys: Tensor) -> Tensor

Calculates the cdf of the distribution described by the logits. The cdf is scaled by the width of the bars.

Parameters:

Name | Type | Description | Default
logits | Tensor | Tensor of shape (batch_size, ..., num_bars) with the logits describing the distribution. | required
ys | Tensor | Tensor of shape (batch_size, ..., n_ys to eval) or (n_ys to eval) with the targets. | required
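
To make the CDF description concrete, here is a minimal pure-Python sketch. It assumes the density is uniform inside each bar, so a partially covered bar contributes its probability scaled by the covered share of its width. The helper names `softmax` and `bar_cdf` and the explicit `borders` argument are hypothetical and are not the library's API.

```python
import math


def softmax(logits):
    """Turn raw logits into bar probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bar_cdf(borders, logits, y):
    """CDF at y for a piecewise-uniform density (illustrative helper only)."""
    probs = softmax(logits)
    if y <= borders[0]:
        return 0.0
    if y >= borders[-1]:
        return 1.0
    total = 0.0
    for left, right, p in zip(borders[:-1], borders[1:], probs):
        if y >= right:
            total += p  # bar lies entirely to the left of y
        else:
            # partially covered bar: scale its mass by the covered width
            total += p * (y - left) / (right - left)
            break
    return total


borders = [0.0, 1.0, 3.0]  # two bars: [0, 1] and [1, 3]
logits = [0.0, 0.0]        # equal mass of 0.5 in each bar
print(bar_cdf(borders, logits, 0.5))  # 0.25
print(bar_cdf(borders, logits, 2.0))  # 0.75
```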

cdf_temporary

cdf_temporary(logits: Tensor) -> Tensor

Cumulative distribution function.

TODO: An equivalent function already exists here; the two should be merged. This one is still in use for now.

get_probs_for_different_borders

get_probs_for_different_borders(
    logits: Tensor, new_borders: Tensor
) -> Tensor

The logits describe the density of the distribution over the current self.borders.

This function returns the logits that would result if self.borders were replaced by new_borders. This is useful for averaging the logits of different models.
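
One way to picture the re-binning is to redistribute each old bar's mass over the new bars in proportion to their overlap. The sketch below works on probabilities rather than logits and assumes a uniform density inside each bar and new borders covering the same range. `rebin_probs` is a hypothetical helper, not the library's implementation.

```python
def rebin_probs(borders, probs, new_borders):
    """Move bar probabilities onto new borders by proportional overlap."""
    new_probs = []
    for new_left, new_right in zip(new_borders[:-1], new_borders[1:]):
        mass = 0.0
        for left, right, p in zip(borders[:-1], borders[1:], probs):
            overlap = min(right, new_right) - max(left, new_left)
            if overlap > 0:
                # share of the old bar that falls inside the new bar
                mass += p * overlap / (right - left)
        new_probs.append(mass)
    return new_probs


old_borders = [0.0, 1.0, 3.0]
old_probs = [0.5, 0.5]
print(rebin_probs(old_borders, old_probs, [0.0, 2.0, 3.0]))  # [0.75, 0.25]
```

Once several models are expressed over the same borders, their bar probabilities can be averaged element-wise.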

icdf

icdf(logits: Tensor, left_prob: float) -> Tensor

Implementation of the quantile function.

:param logits: Tensor of any shape, with the last dimension being logits.
:param left_prob: The probability mass to the left of the result.
:return: Position with left_prob probability weight to the left.
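
As an illustration of the quantile function, the sketch below walks the bars until the accumulated mass reaches left_prob and then interpolates linearly inside that bar. It assumes a uniform density per bar. `softmax` and `bar_icdf` are hypothetical names used only for this example.

```python
import math


def softmax(logits):
    """Turn raw logits into bar probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bar_icdf(borders, logits, left_prob):
    """Position with left_prob probability mass to its left (sketch)."""
    probs = softmax(logits)
    cumulative = 0.0
    for left, right, p in zip(borders[:-1], borders[1:], probs):
        if p > 0 and cumulative + p >= left_prob:
            # fraction of this bar needed to reach left_prob
            frac = (left_prob - cumulative) / p
            return left + frac * (right - left)
        cumulative += p
    return borders[-1]


borders = [0.0, 1.0, 3.0]
logits = [0.0, 0.0]  # 0.5 mass per bar
print(bar_icdf(borders, logits, 0.25))  # 0.5
print(bar_icdf(borders, logits, 0.75))  # 2.0
```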

mean_of_square

mean_of_square(logits: Tensor) -> Tensor

Computes E[x^2].

Parameters:

Name | Type | Description | Default
logits | Tensor | Output of the model. | required

Returns:

Type | Description
Tensor | Mean of square.
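
For intuition, E[x^2] of a piecewise-uniform distribution has a closed form: a bar [a, b] with probability p contributes p * (a^2 + a*b + b^2) / 3. The sketch below assumes bounded bars with uniform density; unbounded outer bars, as in a full-support variant, would need separate handling. The function names are hypothetical.

```python
def bar_mean(borders, probs):
    """E[x]: each bar contributes its probability times its midpoint."""
    return sum(p * (a + b) / 2 for a, b, p in zip(borders[:-1], borders[1:], probs))


def bar_mean_of_square(borders, probs):
    """E[x^2]: second moment of a uniform on [a, b] is (a^2 + ab + b^2) / 3."""
    return sum(
        p * (a * a + a * b + b * b) / 3
        for a, b, p in zip(borders[:-1], borders[1:], probs)
    )


borders = [0.0, 1.0, 3.0]
probs = [0.5, 0.5]
second_moment = bar_mean_of_square(borders, probs)
variance = second_moment - bar_mean(borders, probs) ** 2
print(second_moment, variance)
```

Combining E[x^2] with the mean in this way gives the variance of the predicted distribution.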

pi

pi(
    logits: Tensor,
    best_f: float | Tensor,
    *,
    maximize: bool = True
) -> Tensor

Acquisition Function: Probability of Improvement.

Parameters:

Name | Type | Description | Default
logits | Tensor | As returned by the Transformer. | required
best_f | float | Tensor | Best evaluation so far (the incumbent). | required
maximize | bool | Whether to maximize. | True

Returns:

Type | Description
Tensor | Probability of improvement.
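
Probability of improvement can be read directly off the CDF: when maximizing it is the mass above the incumbent, 1 - CDF(best_f), and when minimizing it is CDF(best_f). The sketch below assumes a piecewise-uniform density and takes bar probabilities directly. `bar_cdf` and `probability_of_improvement` are hypothetical helpers, not the library's code.

```python
def bar_cdf(borders, probs, y):
    """CDF of a piecewise-uniform density at y."""
    if y <= borders[0]:
        return 0.0
    if y >= borders[-1]:
        return 1.0
    total = 0.0
    for left, right, p in zip(borders[:-1], borders[1:], probs):
        if y >= right:
            total += p
        else:
            total += p * (y - left) / (right - left)
            break
    return total


def probability_of_improvement(borders, probs, best_f, maximize=True):
    """Mass above best_f when maximizing, below best_f when minimizing."""
    below = bar_cdf(borders, probs, best_f)
    return 1.0 - below if maximize else below


borders = [0.0, 1.0, 3.0]
probs = [0.5, 0.5]
print(probability_of_improvement(borders, probs, best_f=2.0))                  # 0.25
print(probability_of_improvement(borders, probs, best_f=2.0, maximize=False))  # 0.75
```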

plot

plot(
    logits: Tensor,
    ax: Axes | None = None,
    zoom_to_quantile: float | None = None,
    **kwargs: Any
) -> Axes

Plots the distribution.

ucb

ucb(
    logits: Tensor,
    best_f: float,
    rest_prob: float = (1 - 0.682) / 2,
    *,
    maximize: bool = True
) -> Tensor

UCB utility. rest_prob is the amount of utility above (below) the confidence interval that is ignored.

Higher rest_prob is equivalent to lower beta in the standard GP-UCB formulation.

Parameters:

Name | Type | Description | Default
logits | Tensor | Logits, as returned by the Transformer. | required
rest_prob | float | The amount of utility above (below) the confidence interval that is ignored. The default is equivalent to using GP-UCB with beta=1. To get the corresponding beta, where beta is from the standard GP definition of UCB, ucb_utility = mean + beta * std, use: beta = math.sqrt(2)*torch.erfinv(torch.tensor(2*(1-rest_prob)-1)) | (1 - 0.682) / 2
best_f | float | Unused. | required
maximize | bool | Whether to maximize. | True
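
The beta conversion quoted above can be checked without torch, because sqrt(2) * erfinv(2p - 1) is the standard normal quantile at p. The sketch below uses the standard library's `statistics.NormalDist`. `beta_from_rest_prob` is a hypothetical helper name.

```python
from statistics import NormalDist


def beta_from_rest_prob(rest_prob):
    """GP-UCB beta matching a given rest_prob.

    Same quantity as sqrt(2) * erfinv(2 * (1 - rest_prob) - 1),
    i.e. the standard normal quantile at 1 - rest_prob.
    """
    return NormalDist().inv_cdf(1.0 - rest_prob)


# About 68.2% of a normal lies within one standard deviation of the mean,
# so leaving (1 - 0.682) / 2 in the upper tail corresponds to beta close to 1.
print(beta_from_rest_prob((1 - 0.682) / 2))

# A higher rest_prob gives a lower beta.
print(beta_from_rest_prob(0.3) < beta_from_rest_prob(0.1))
```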

FullSupportBarDistribution

Bases: BarDistribution

average_bar_distributions_into_this

average_bar_distributions_into_this(
    list_of_bar_distributions: Sequence[BarDistribution],
    list_of_logits: Sequence[Tensor],
    *,
    average_logits: bool = False
) -> Tensor

:param list_of_bar_distributions: :param list_of_logits: :param average_logits: :return:

cdf

cdf(logits: Tensor, ys: Tensor) -> Tensor

Calculates the cdf of the distribution described by the logits. The cdf is scaled by the width of the bars.

Parameters:

Name | Type | Description | Default
logits | Tensor | Tensor of shape (batch_size, ..., num_bars) with the logits describing the distribution. | required
ys | Tensor | Tensor of shape (batch_size, ..., n_ys to eval) or (n_ys to eval) with the targets. | required

cdf_temporary

cdf_temporary(logits: Tensor) -> Tensor

Cumulative distribution function.

TODO: An equivalent function already exists here; the two should be merged. This one is still in use for now.

ei_for_halfnormal

ei_for_halfnormal(
    scale: float,
    best_f: Tensor | float,
    *,
    maximize: bool = True
) -> Tensor

EI for a normal distribution with mean 0 and scale parameter scale, multiplied by 2.

This is the same as the half-normal EI. It was tested against an MC approximation:

import torch
# Monte Carlo estimate of the half-normal EI, compared against the closed form.
ei_for_halfnormal = lambda scale, best_f: (torch.distributions.HalfNormal(torch.tensor(scale)).sample((10_000_000,)) - best_f).clamp(min=0.0).mean()
print([(ei_for_halfnormal(scale, best_f), FullSupportBarDistribution().ei_for_halfnormal(scale, best_f)) for scale in [0.1, 1.0, 10.0] for best_f in [0.1, 10.0, 4.0]])
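
The same comparison can be sketched with the standard library only. The closed form below is twice the usual Gaussian EI for a zero-mean normal. It assumes maximization and best_f >= 0, which is where it coincides with the half-normal EI. All function names here are hypothetical and the code is not taken from the library.

```python
import math
import random


def normal_pdf(u):
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def normal_cdf(u):
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))


def halfnormal_ei(scale, best_f):
    """Closed form: 2 * EI of N(0, scale^2) at incumbent best_f >= 0."""
    u = (0.0 - best_f) / scale
    normal_ei = scale * (normal_pdf(u) + u * normal_cdf(u))
    return 2.0 * normal_ei


def halfnormal_ei_mc(scale, best_f, n=200_000, seed=0):
    """Monte Carlo estimate of E[max(|X| - best_f, 0)] with X ~ N(0, scale^2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += max(abs(rng.gauss(0.0, scale)) - best_f, 0.0)
    return total / n


print(halfnormal_ei(1.0, 0.5), halfnormal_ei_mc(1.0, 0.5))
```

With best_f = 0 the closed form reduces to the mean of the half-normal, scale * sqrt(2 / pi), which is a quick sanity check.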

forward

forward(
    logits: Tensor,
    y: Tensor,
    mean_prediction_logits: Tensor | None = None,
) -> Tensor

Returns the negative log density (the loss).

:param logits: Tensor of shape T x B x self.num_bars.
:param y: Tensor of shape T x B.
:param mean_prediction_logits:
:return: The negative log density (the loss).
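
To illustrate what a negative log density means for a bar distribution, the sketch below handles a single target: the density inside a bar is the bar's probability divided by its width, and the loss is the negative log of that value. It assumes bounded bars with uniform density and ignores any special treatment of the outermost bars. `softmax` and `bar_nll` are hypothetical names.

```python
import math


def softmax(logits):
    """Turn raw logits into bar probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bar_nll(borders, logits, y):
    """Negative log density of y under a piecewise-uniform distribution."""
    probs = softmax(logits)
    for left, right, p in zip(borders[:-1], borders[1:], probs):
        if left <= y <= right:
            density = p / (right - left)  # probability spread over the bar width
            return -math.log(density)
    raise ValueError("y lies outside the borders")


borders = [0.0, 1.0, 3.0]
logits = [0.0, 0.0]  # 0.5 mass per bar
print(bar_nll(borders, logits, 0.5))  # -log(0.5 / 1) = log 2
print(bar_nll(borders, logits, 2.0))  # -log(0.5 / 2) = log 4
```

The wider bar yields a larger loss for the same probability mass, since the mass is spread over a longer interval.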

get_probs_for_different_borders

get_probs_for_different_borders(
    logits: Tensor, new_borders: Tensor
) -> Tensor

The logits describe the density of the distribution over the current self.borders.

This function returns the logits that would result if self.borders were replaced by new_borders. This is useful for averaging the logits of different models.

icdf

icdf(logits: Tensor, left_prob: float) -> Tensor

Implementation of the quantile function.

:param logits: Tensor of any shape, with the last dimension being logits.
:param left_prob: The probability mass to the left of the result.
:return: Position with left_prob probability weight to the left.

mean_of_square

mean_of_square(logits: Tensor) -> Tensor

Computes E[x^2].

Parameters:

Name | Type | Description | Default
logits | Tensor | Output of the model. | required

pdf

pdf(logits: Tensor, y: Tensor) -> Tensor

Probability density function at y.

pi

pi(
    logits: Tensor,
    best_f: Tensor | float,
    *,
    maximize: bool = True
) -> Tensor

Acquisition Function: Probability of Improvement.

Parameters:

Name | Type | Description | Default
logits | Tensor | As returned by the Transformer (evaluation_points x batch x feature_dim). | required
best_f | Tensor | float | Best evaluation so far (the incumbent). | required
maximize | bool | Whether to maximize. | True

plot

plot(
    logits: Tensor,
    ax: Axes | None = None,
    zoom_to_quantile: float | None = None,
    **kwargs: Any
) -> Axes

Plots the distribution.

sample

sample(logits: Tensor, t: float = 1.0) -> Tensor

Samples values from the distribution.

t is the sampling temperature.
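
A common way to sample from such a distribution is inverse-transform sampling: draw a uniform number and map it through the quantile function. The sketch below assumes that the temperature divides the logits before the softmax, which is the usual convention but is an assumption here, and that the density is uniform inside each bar. `softmax` and `sample_bar` are hypothetical helpers.

```python
import math
import random


def softmax(logits):
    """Turn raw logits into bar probabilities that sum to 1."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_bar(borders, logits, t=1.0, rng=random):
    """Draw one sample; lower t concentrates mass on the most likely bars."""
    probs = softmax([v / t for v in logits])  # assumption: temperature scales logits
    u = rng.random()
    cumulative = 0.0
    for left, right, p in zip(borders[:-1], borders[1:], probs):
        if p > 0 and u <= cumulative + p:
            # linear interpolation inside the selected bar
            return left + (u - cumulative) / p * (right - left)
        cumulative += p
    return borders[-1]


rng = random.Random(0)
borders = [0.0, 1.0, 3.0]
logits = [0.0, 5.0]
cold = [sample_bar(borders, logits, t=0.01, rng=rng) for _ in range(1000)]
warm = [sample_bar(borders, logits, t=1.0, rng=rng) for _ in range(1000)]
print(min(cold), max(cold))
```

At a very low temperature essentially all samples fall into the bar with the largest logit, here [1, 3].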

ucb

ucb(
    logits: Tensor,
    best_f: float,
    rest_prob: float = (1 - 0.682) / 2,
    *,
    maximize: bool = True
) -> Tensor

UCB utility. rest_prob is the amount of utility above (below) the confidence interval that is ignored.

Higher rest_prob is equivalent to lower beta in the standard GP-UCB formulation.

Parameters:

Name | Type | Description | Default
logits | Tensor | Logits, as returned by the Transformer. | required
rest_prob | float | The amount of utility above (below) the confidence interval that is ignored. The default is equivalent to using GP-UCB with beta=1. To get the corresponding beta, where beta is from the standard GP definition of UCB, ucb_utility = mean + beta * std, use: beta = math.sqrt(2)*torch.erfinv(torch.tensor(2*(1-rest_prob)-1)) | (1 - 0.682) / 2
best_f | float | Unused. | required
maximize | bool | Whether to maximize. | True

get_bucket_limits

get_bucket_limits(
    num_outputs: int,
    full_range: tuple | None = None,
    ys: Tensor | None = None,
    *,
    verbose: bool = False,
    widen_bucket_limits_factor: float | None = None
) -> Tensor

Decide on a set of bucket limits based on a distribution of ys.

Parameters:

Name | Type | Description | Default
num_outputs | int | This is only tested for num_outputs=1, but should work for larger num_outputs as well. | required
full_range | tuple | None | If ys is not passed, this is the range of the ys that should be used to estimate the bucket limits. | None
ys | Tensor | None | If ys is passed, this is the ys that should be used to estimate the bucket limits. Do not pass full_range in this case. | None
verbose | bool | Unused. | False
widen_bucket_limits_factor | float | None | If set, the bucket limits are widened by this factor. This allows for a slightly larger range than the actual data. | None
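
Two simple strategies for choosing bucket limits are sketched below: equal-width buckets over a given range, and equal-mass buckets placed between sorted observations. Whether the library uses exactly these rules is an assumption. The function names and the `num_buckets` parameter are hypothetical and are not meant to mirror `num_outputs`. Widening is left out.

```python
def equal_width_limits(num_buckets, full_range):
    """Evenly spaced limits over (low, high); returns num_buckets + 1 values."""
    low, high = full_range
    width = (high - low) / num_buckets
    return [low + i * width for i in range(num_buckets)] + [high]


def equal_mass_limits(num_buckets, ys):
    """Limits chosen so each bucket holds the same number of observed ys."""
    ys_sorted = sorted(ys)
    usable = len(ys_sorted) - len(ys_sorted) % num_buckets
    ys_sorted = ys_sorted[:usable]  # drop the remainder so buckets are equal
    per_bucket = usable // num_buckets
    # inner limits sit halfway between neighbouring buckets' edge values
    inner = [
        (ys_sorted[i * per_bucket - 1] + ys_sorted[i * per_bucket]) / 2
        for i in range(1, num_buckets)
    ]
    return [ys_sorted[0]] + inner + [ys_sorted[-1]]


print(equal_width_limits(4, (0.0, 1.0)))              # [0.0, 0.25, 0.5, 0.75, 1.0]
print(equal_mass_limits(4, [8, 1, 7, 2, 6, 3, 5, 4]))  # [1, 2.5, 4.5, 6.5, 8]
```

Equal-mass limits give narrow buckets where data is dense and wide ones where it is sparse, which is why passing ys and passing full_range are treated as alternatives.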