bar_distribution ¶

BarDistribution ¶

Bases: Module

average_bar_distributions_into_this ¶

average_bar_distributions_into_this(
    list_of_bar_distributions: Sequence[BarDistribution],
    list_of_logits: Sequence[Tensor],
    *,
    average_logits: bool = False
) -> Tensor

Parameters:

Name	Type	Description	Default
`list_of_bar_distributions`	`Sequence[BarDistribution]`		required
`list_of_logits`	`Sequence[Tensor]`		required
`average_logits`	`bool`		`False`

cdf ¶

cdf(logits: Tensor, ys: Tensor) -> Tensor

Calculates the cdf of the distribution described by the logits. The cdf is scaled by the width of the bars.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	tensor of shape (batch_size, ..., num_bars) with the logits describing the distribution	required
`ys`	`Tensor`	tensor of shape (batch_size, ..., n_ys to eval) or (n_ys to eval) with the targets.	required
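As an illustration of what a CDF over bars can look like, here is a minimal pure-Python sketch for a single distribution. It assumes the probability mass inside each bar is spread uniformly, so the CDF grows linearly across a bar in proportion to its width. The helper names `softmax` and `bar_cdf` and the explicit `borders` argument are hypothetical and not part of the library.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a plain list of floats.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def bar_cdf(logits, borders, y):
    """CDF at y for a piecewise-uniform distribution with len(borders) - 1 bars."""
    probs = softmax(logits)
    if y <= borders[0]:
        return 0.0
    if y >= borders[-1]:
        return 1.0
    total = 0.0
    for p, left, right in zip(probs, borders[:-1], borders[1:]):
        if y >= right:
            total += p  # the whole bar lies to the left of y
        else:
            total += p * (y - left) / (right - left)  # partial bar
            break
    return total

borders = [0.0, 1.0, 2.0, 4.0]
logits = [0.0, 0.0, 0.0]  # equal mass in every bar
print(bar_cdf(logits, borders, 3.0))  # two full bars plus half of the last one
```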

cdf_temporary ¶

cdf_temporary(logits: Tensor) -> Tensor

Cumulative distribution function.

TODO: this already exists here, make sure to merge, at the moment still used.

get_probs_for_different_borders ¶

get_probs_for_different_borders(
    logits: Tensor, new_borders: Tensor
) -> Tensor

The logits describe the density of the distribution over the current self.borders.

This function returns the logits that would result if `self.borders` were changed to `new_borders`. This is useful for averaging the logits of different models.
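One way to picture a change of borders is to redistribute each bar's probability mass according to how much it overlaps the new bars. The sketch below works on probabilities rather than logits and assumes uniform mass within each bar. `rebin_probs` is a hypothetical helper, not the library's implementation.

```python
def rebin_probs(probs, borders, new_borders):
    """Move bar probabilities from `borders` onto `new_borders` by overlap."""
    new_probs = []
    for new_left, new_right in zip(new_borders[:-1], new_borders[1:]):
        mass = 0.0
        for p, left, right in zip(probs, borders[:-1], borders[1:]):
            # Length of the intersection of the old bar and the new bar.
            overlap = min(right, new_right) - max(left, new_left)
            if overlap > 0:
                mass += p * overlap / (right - left)
        new_probs.append(mass)
    return new_probs

old_borders = [0.0, 1.0, 2.0]
new_borders = [0.0, 0.5, 2.0]
print(rebin_probs([0.5, 0.5], old_borders, new_borders))
```

Once several models are expressed over the same borders, their probabilities can be averaged bar by bar.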

icdf ¶

icdf(logits: Tensor, left_prob: float) -> Tensor

Implementation of the quantile function.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Tensor of any shape, with the last dimension being logits.	required
`left_prob`	`float`	The probability mass to the left of the result.	required

Returns:

Type	Description
`Tensor`	Position with `left_prob` probability weight to the left.
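A quantile function over bars can be sketched by accumulating bar probabilities until `left_prob` is reached and then interpolating inside that bar. This is a pure-Python illustration for one distribution, assuming uniform mass within each bar. `bar_icdf` and the explicit `borders` argument are hypothetical names.

```python
import math

def bar_icdf(logits, borders, left_prob):
    """Position with `left_prob` probability mass to its left."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    cumulative = 0.0
    for p, left, right in zip(probs, borders[:-1], borders[1:]):
        if cumulative + p >= left_prob:
            # Fraction of this bar needed to reach left_prob.
            frac = (left_prob - cumulative) / p if p > 0 else 0.0
            return left + frac * (right - left)
        cumulative += p
    return borders[-1]  # reached only through floating-point rounding

print(bar_icdf([0.0, 0.0, 0.0], [0.0, 1.0, 2.0, 4.0], 0.5))  # median of the distribution
```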

mean_of_square ¶

mean_of_square(logits: Tensor) -> Tensor

Computes E[x^2].

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Output of the model.	required

Returns:

Type	Description
`Tensor`	mean of square
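For a distribution that is uniform inside each bar `[a, b]`, the second moment of a bar is `(a^2 + a*b + b^2) / 3`, and E[x^2] is the probability-weighted sum over bars. A minimal sketch under that assumption; `bar_mean_of_square` is a hypothetical helper operating on plain lists.

```python
import math

def bar_mean_of_square(logits, borders):
    """E[x^2] for a piecewise-uniform distribution over the given borders."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    result = 0.0
    for p, a, b in zip(probs, borders[:-1], borders[1:]):
        # Second moment of Uniform(a, b).
        result += p * (a * a + a * b + b * b) / 3.0
    return result

# Two equally likely bars on [0, 1] and [1, 2] form Uniform(0, 2), whose E[x^2] is 4/3.
print(bar_mean_of_square([0.0, 0.0], [0.0, 1.0, 2.0]))
```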

pi ¶

pi(
    logits: Tensor,
    best_f: float | Tensor,
    *,
    maximize: bool = True
) -> Tensor

Acquisition Function: Probability of Improvement.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	as returned by Transformer	required
`best_f`	`float \| Tensor`	best evaluation so far (the incumbent)	required
`maximize`	`bool`	whether to maximize	`True`

Returns:

Type	Description
`Tensor`	probability of improvement
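Probability of improvement is the probability mass beyond the incumbent: above `best_f` when maximizing, below it when minimizing. The sketch below computes this for one piecewise-uniform distribution. `prob_of_improvement` and the explicit `borders` argument are hypothetical.

```python
import math

def prob_of_improvement(logits, borders, best_f, maximize=True):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Mass to the left of best_f, interpolating linearly inside the bar that contains it.
    mass_left = 0.0
    for p, left, right in zip(probs, borders[:-1], borders[1:]):
        covered = min(max(best_f, left), right) - left
        mass_left += p * covered / (right - left)
    return 1.0 - mass_left if maximize else mass_left

borders = [0.0, 1.0, 2.0, 4.0]
print(prob_of_improvement([0.0, 0.0, 0.0], borders, best_f=3.0))
print(prob_of_improvement([0.0, 0.0, 0.0], borders, best_f=3.0, maximize=False))
```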

plot ¶

plot(
    logits: Tensor,
    ax: Axes | None = None,
    zoom_to_quantile: float | None = None,
    **kwargs: Any
) -> Axes

Plots the distribution.

ucb ¶

ucb(
    logits: Tensor,
    best_f: float,
    rest_prob: float = 1 - 0.682 / 2,
    *,
    maximize: bool = True
) -> Tensor

UCB utility. `rest_prob` is the amount of utility above (below) the confidence interval that is ignored.

Higher rest_prob is equivalent to lower beta in the standard GP-UCB formulation.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Logits, as returned by the Transformer.	required
`rest_prob`	`float`	The amount of utility above (below) the confidence interval that is ignored. The default is equivalent to using GP-UCB with `beta=1`. To get the corresponding `beta`, where `beta` is from the standard GP definition of UCB `ucb_utility = mean + beta * std`, you can use this computation: `beta = math.sqrt(2) * torch.erfinv(torch.tensor(2 * (1 - rest_prob) - 1))`	`1 - 0.682 / 2`
`best_f`	`float`	Unused	required
`maximize`	`bool`	Whether to maximize.	`True`
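The `rest_prob` to `beta` conversion can be checked without `torch`, because `sqrt(2) * erfinv(2 * p - 1)` is the standard normal quantile of `p`. A minimal sketch using only the standard library; `beta_from_rest_prob` is a hypothetical helper, not part of the library.

```python
from statistics import NormalDist

def beta_from_rest_prob(rest_prob):
    # The standard normal quantile of (1 - rest_prob) equals
    # sqrt(2) * erfinv(2 * (1 - rest_prob) - 1).
    return NormalDist().inv_cdf(1.0 - rest_prob)

# One-sided tail mass outside a 68.2% central interval.
print(beta_from_rest_prob((1 - 0.682) / 2))
```

With `rest_prob = (1 - 0.682) / 2` the result is close to 1. The signature default `1 - 0.682 / 2` evaluates to 0.659 under Python operator precedence, for which this formula gives a negative `beta`, so it may be worth confirming which value is intended.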

FullSupportBarDistribution ¶

Bases: BarDistribution

average_bar_distributions_into_this ¶

average_bar_distributions_into_this(
    list_of_bar_distributions: Sequence[BarDistribution],
    list_of_logits: Sequence[Tensor],
    *,
    average_logits: bool = False
) -> Tensor

Parameters:

Name	Type	Description	Default
`list_of_bar_distributions`	`Sequence[BarDistribution]`		required
`list_of_logits`	`Sequence[Tensor]`		required
`average_logits`	`bool`		`False`

cdf ¶

cdf(logits: Tensor, ys: Tensor) -> Tensor

Calculates the cdf of the distribution described by the logits. The cdf is scaled by the width of the bars.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	tensor of shape (batch_size, ..., num_bars) with the logits describing the distribution	required
`ys`	`Tensor`	tensor of shape (batch_size, ..., n_ys to eval) or (n_ys to eval) with the targets.	required

cdf_temporary ¶

cdf_temporary(logits: Tensor) -> Tensor

Cumulative distribution function.

TODO: this already exists here, make sure to merge, at the moment still used.

ei_for_halfnormal ¶

ei_for_halfnormal(
    scale: float,
    best_f: Tensor | float,
    *,
    maximize: bool = True
) -> Tensor

Computes the EI for a normal distribution with mean 0 and scale parameter `scale`, multiplied by 2, which is the same as the half-normal EI.

This was tested against a Monte Carlo approximation:

import torch
mc_ei_for_halfnormal = lambda scale, best_f: (torch.distributions.HalfNormal(torch.tensor(scale)).sample((10_000_000,)) - best_f).clamp(min=0.0).mean()
print([(mc_ei_for_halfnormal(scale, best_f), FullSupportBarDistribution().ei_for_halfnormal(scale, best_f)) for scale in [0.1, 1.0, 10.0] for best_f in [0.1, 10.0, 4.0]])
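The relation between the normal EI and the half-normal EI can also be checked deterministically. The sketch below compares the closed form, twice the EI of a zero-mean normal with standard deviation `scale`, against a midpoint-rule integral of `(x - best_f)` under the half-normal density. It assumes maximization and `best_f >= 0`. Both function names are hypothetical.

```python
from statistics import NormalDist

def ei_halfnormal_closed_form(scale, best_f):
    std_normal = NormalDist()
    u = (0.0 - best_f) / scale
    # EI of N(0, scale^2) for maximization: scale * (pdf(u) + u * cdf(u)).
    normal_ei = scale * (std_normal.pdf(u) + u * std_normal.cdf(u))
    return 2.0 * normal_ei

def ei_halfnormal_numeric(scale, best_f, steps=20_000):
    # Midpoint rule for the integral of (x - best_f) * halfnormal_pdf(x) over [best_f, upper].
    normal = NormalDist(0.0, scale)
    upper = best_f + 12.0 * scale  # density beyond this point is negligible
    h = (upper - best_f) / steps
    total = 0.0
    for i in range(steps):
        x = best_f + (i + 0.5) * h
        total += (x - best_f) * 2.0 * normal.pdf(x)  # half-normal pdf for x >= 0
    return total * h

print(ei_halfnormal_closed_form(1.0, 0.1), ei_halfnormal_numeric(1.0, 0.1))
```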

forward ¶

forward(
    logits: Tensor,
    y: Tensor,
    mean_prediction_logits: Tensor | None = None,
) -> Tensor

Returns the negative log density (the loss).

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Tensor of shape T x B x self.num_bars.	required
`y`	`Tensor`	Tensor of shape T x B.	required
`mean_prediction_logits`	`Tensor \| None`		`None`
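To illustrate the negative log density for a single target, the sketch below looks up the bar containing `y` and divides that bar's probability by its width. It assumes uniform density inside every bar and does not model any special treatment of the outermost bars that a full-support variant may apply. `bar_nll` and the explicit `borders` argument are hypothetical.

```python
import bisect
import math

def bar_nll(logits, borders, y):
    """Negative log density of y under a piecewise-uniform bar distribution."""
    m = max(logits)
    log_norm = m + math.log(sum(math.exp(v - m) for v in logits))  # log-sum-exp

    # Index of the bar that contains y, clamped to the valid range.
    idx = bisect.bisect_right(borders, y) - 1
    idx = min(max(idx, 0), len(logits) - 1)

    width = borders[idx + 1] - borders[idx]
    log_density = (logits[idx] - log_norm) - math.log(width)
    return -log_density

borders = [0.0, 1.0, 2.0, 4.0]
print(bar_nll([0.0, 0.0, 0.0], borders, 3.0))  # density is (1/3) / 2, so the loss is log(6)
```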

get_probs_for_different_borders ¶

get_probs_for_different_borders(
    logits: Tensor, new_borders: Tensor
) -> Tensor

The logits describe the density of the distribution over the current self.borders.

This function returns the logits that would result if `self.borders` were changed to `new_borders`. This is useful for averaging the logits of different models.

icdf ¶

icdf(logits: Tensor, left_prob: float) -> Tensor

Implementation of the quantile function.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Tensor of any shape, with the last dimension being logits.	required
`left_prob`	`float`	The probability mass to the left of the result.	required

Returns:

Type	Description
`Tensor`	Position with `left_prob` probability weight to the left.

mean_of_square ¶

mean_of_square(logits: Tensor) -> Tensor

Computes E[x^2].

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Output of the model.	required

pdf ¶

pdf(logits: Tensor, y: Tensor) -> Tensor

Probability density function at y.

pi ¶

pi(
    logits: Tensor,
    best_f: Tensor | float,
    *,
    maximize: bool = True
) -> Tensor

Acquisition Function: Probability of Improvement.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	as returned by Transformer (evaluation_points x batch x feature_dim)	required
`best_f`	`Tensor \| float`	best evaluation so far (the incumbent)	required
`maximize`	`bool`	whether to maximize	`True`

plot ¶

plot(
    logits: Tensor,
    ax: Axes | None = None,
    zoom_to_quantile: float | None = None,
    **kwargs: Any
) -> Axes

Plots the distribution.

sample ¶

sample(logits: Tensor, t: float = 1.0) -> Tensor

Samples values from the distribution, using temperature `t`.
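A common way to apply a temperature is to divide the logits by `t` before the softmax, so that `t < 1` sharpens the distribution and `t > 1` flattens it. The sketch below assumes that convention, which this page does not confirm for the library, and draws uniformly inside the chosen bar. `sample_bar` is a hypothetical helper.

```python
import math
import random

def sample_bar(logits, borders, t=1.0, rng=random):
    """Draw one value: pick a bar from softmax(logits / t), then a point inside it."""
    scaled = [v / t for v in logits]
    m = max(scaled)
    weights = [math.exp(v - m) for v in scaled]  # unnormalized bar probabilities
    idx = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    return rng.uniform(borders[idx], borders[idx + 1])

rng = random.Random(0)
borders = [0.0, 1.0, 2.0, 4.0]
# With a very low temperature, nearly all samples fall in the highest-logit bar [1, 2].
print([round(sample_bar([0.0, 5.0, 0.0], borders, t=0.01, rng=rng), 3) for _ in range(3)])
```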

ucb ¶

ucb(
    logits: Tensor,
    best_f: float,
    rest_prob: float = 1 - 0.682 / 2,
    *,
    maximize: bool = True
) -> Tensor

UCB utility. `rest_prob` is the amount of utility above (below) the confidence interval that is ignored.

Higher rest_prob is equivalent to lower beta in the standard GP-UCB formulation.

Parameters:

Name	Type	Description	Default
`logits`	`Tensor`	Logits, as returned by the Transformer.	required
`rest_prob`	`float`	The amount of utility above (below) the confidence interval that is ignored. The default is equivalent to using GP-UCB with `beta=1`. To get the corresponding `beta`, where `beta` is from the standard GP definition of UCB `ucb_utility = mean + beta * std`, you can use this computation: `beta = math.sqrt(2) * torch.erfinv(torch.tensor(2 * (1 - rest_prob) - 1))`	`1 - 0.682 / 2`
`best_f`	`float`	Unused	required
`maximize`	`bool`	Whether to maximize.	`True`

get_bucket_limits ¶

get_bucket_limits(
    num_outputs: int,
    full_range: tuple | None = None,
    ys: Tensor | None = None,
    *,
    verbose: bool = False,
    widen_bucket_limits_factor: float | None = None
) -> Tensor

Decides on a set of bucket limits based on a distribution of ys.

Parameters:

Name	Type	Description	Default
`num_outputs`	`int`	This is only tested for num_outputs=1, but should work for larger num_outputs as well.	required
`full_range`	`tuple \| None`	If ys is not passed, this is the range of the ys that should be used to estimate the bucket limits.	`None`
`ys`	`Tensor \| None`	If ys is passed, this is the ys that should be used to estimate the bucket limits. Do not pass full_range in this case.	`None`
`verbose`	`bool`	Unused	`False`
`widen_bucket_limits_factor`	`float \| None`	If set, the bucket limits are widened by this factor. This allows for a slightly larger range than the actual data.	`None`
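One plausible way to derive bucket limits is to place them at evenly spaced quantiles of `ys`, so that each bucket holds roughly the same number of targets, and to fall back to equally wide buckets when only a range is given. The sketch below illustrates that idea. It is an assumption about the approach rather than the library's code, `bucket_limits` and `num_bars` are hypothetical names, and widening is not covered.

```python
import math

def bucket_limits(num_bars, full_range=None, ys=None):
    """Return num_bars + 1 borders from either a range or observed targets."""
    if (ys is None) == (full_range is None):
        raise ValueError("pass exactly one of ys or full_range")

    if ys is None:
        low, high = full_range
        # Equally wide buckets over the given range.
        return [low + (high - low) * i / num_bars for i in range(num_bars + 1)]

    sorted_ys = sorted(ys)
    n = len(sorted_ys)
    borders = []
    for i in range(num_bars + 1):
        # Linearly interpolated quantile at level i / num_bars.
        pos = i * (n - 1) / num_bars
        lo = int(math.floor(pos))
        hi = min(lo + 1, n - 1)
        borders.append(sorted_ys[lo] + (pos - lo) * (sorted_ys[hi] - sorted_ys[lo]))
    return borders

print(bucket_limits(4, full_range=(0.0, 1.0)))
print(bucket_limits(4, ys=[float(v) for v in range(101)]))
```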