bar_distribution ¶
BarDistribution ¶
Bases: Module
average_bar_distributions_into_this ¶
average_bar_distributions_into_this(
list_of_bar_distributions: Sequence[BarDistribution],
list_of_logits: Sequence[Tensor],
*,
average_logits: bool = False
) -> Tensor
Averages the predictions of several BarDistributions into this distribution's binning and returns the resulting logits.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
list_of_bar_distributions | Sequence[BarDistribution] | The distributions whose predictions are combined. | required |
list_of_logits | Sequence[Tensor] | One logits tensor per distribution in list_of_bar_distributions. | required |
average_logits | bool | If True, average the logits directly instead of the probabilities. | False |

Returns:

Type | Description |
---|---|
Tensor | The averaged logits, expressed over this distribution's borders. |
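As a usage sketch, two models trained with different borders could be ensembled as follows. The import path pfns.bar_distribution, the BarDistribution(borders) constructor, and all shapes are assumptions for illustration, not taken from this page.

```python
import torch
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

# Two models trained with their own bucket borders (constructor signature assumed).
dist_a = BarDistribution(get_bucket_limits(num_outputs=100, full_range=(-3.0, 3.0)))
dist_b = BarDistribution(get_bucket_limits(num_outputs=100, full_range=(-4.0, 4.0)))
logits_a = torch.randn(8, 100)  # stand-ins for the two models' outputs
logits_b = torch.randn(8, 100)

# Distribution whose binning the ensemble should use.
ensemble = BarDistribution(get_bucket_limits(num_outputs=100, full_range=(-4.0, 4.0)))
ensemble_logits = ensemble.average_bar_distributions_into_this(
    [dist_a, dist_b], [logits_a, logits_b], average_logits=False
)
```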
cdf ¶
Calculates the cdf of the distribution described by the logits. The cdf is scaled by the width of the bars.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Tensor of shape (batch_size, ..., num_bars) with the logits describing the distribution. | required |
ys | Tensor | Tensor of shape (batch_size, ..., n_ys_to_eval) or (n_ys_to_eval) with the targets. | required |
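A minimal sketch of evaluating the CDF at a few targets; the constructor call and the shapes below are assumptions, only the cdf(logits, ys) call itself is documented above.

```python
import torch
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

dist = BarDistribution(get_bucket_limits(num_outputs=50, full_range=(0.0, 1.0)))  # constructor assumed
logits = torch.randn(4, 50)            # a batch of 4 predictive distributions
ys = torch.tensor([0.1, 0.5, 0.9])     # targets at which to evaluate the CDF

probs = dist.cdf(logits, ys)           # P(y <= ys) under each predicted distribution
```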
cdf_temporary ¶
Cumulative distribution function.
TODO: this overlaps with cdf above and should be merged with it; at the moment it is still used.
get_probs_for_different_borders ¶
The logits describe the density of the distribution over the current self.borders.
This function returns the logits as they would be if self.borders were changed to new_borders. This is useful for averaging the logits of different models.
icdf ¶
Implementation of the quantile function (inverse CDF).

Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Tensor of any shape, with the last dimension being the logits. | required |
left_prob | float | The probability mass to the left of the result. | required |

Returns:

Type | Description |
---|---|
Tensor | The position with left_prob probability weight to its left. |
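For example, predictive medians and a central interval can be read off with icdf; the setup below reuses the same assumed constructor as the earlier sketches.

```python
import torch
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

dist = BarDistribution(get_bucket_limits(num_outputs=50, full_range=(0.0, 1.0)))  # constructor assumed
logits = torch.randn(4, 50)

median = dist.icdf(logits, 0.5)                                  # 50% of the mass to the left
lower, upper = dist.icdf(logits, 0.05), dist.icdf(logits, 0.95)  # a central 90% interval
```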
mean_of_square ¶
Computes E[x^2].
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Output of the model. | required |

Returns:

Type | Description |
---|---|
Tensor | The mean of the square, E[x^2]. |
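E[x^2] is typically combined with the first moment to obtain a predictive variance. The sketch below computes E[x] by hand from the softmax probabilities and the bucket centres; using dist.borders for this is an assumption (the attribute is only mentioned in passing on this page).

```python
import torch
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

dist = BarDistribution(get_bucket_limits(num_outputs=50, full_range=(0.0, 1.0)))  # constructor assumed
logits = torch.randn(4, 50)

probs = logits.softmax(-1)
bucket_centres = (dist.borders[:-1] + dist.borders[1:]) / 2   # uniform density within each bar
mean = (probs * bucket_centres).sum(-1)

variance = dist.mean_of_square(logits) - mean**2              # Var[x] = E[x^2] - E[x]^2
```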
pi ¶
Acquisition Function: Probability of Improvement.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Logits, as returned by the Transformer. | required |
best_f | float or Tensor | Best evaluation so far (the incumbent). | required |
maximize | bool | Whether to maximize. | True |

Returns:

Type | Description |
---|---|
Tensor | The probability of improvement. |
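In a Bayesian-optimization loop, pi can score candidate points against the incumbent. The sketch below assumes the same constructor as before; only the pi(logits, best_f, maximize) call is documented here.

```python
import torch
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

dist = BarDistribution(get_bucket_limits(num_outputs=50, full_range=(0.0, 1.0)))  # constructor assumed
logits = torch.randn(16, 50)   # predictions for 16 candidate points
best_f = 0.8                   # best observation so far (the incumbent)

pi_scores = dist.pi(logits, best_f, maximize=True)
next_candidate = int(pi_scores.argmax())
```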
plot ¶
plot(
logits: Tensor,
ax: Axes | None = None,
zoom_to_quantile: float | None = None,
**kwargs: Any
) -> Axes
Plots the distribution.
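A plotting sketch; the matplotlib setup is standard, while the constructor and the choice of zoom_to_quantile are assumptions.

```python
import torch
import matplotlib.pyplot as plt
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

dist = BarDistribution(get_bucket_limits(num_outputs=50, full_range=(0.0, 1.0)))  # constructor assumed
logits = torch.randn(50)   # a single predictive distribution

fig, ax = plt.subplots()
dist.plot(logits, ax=ax, zoom_to_quantile=0.95)   # zoom value chosen arbitrarily
plt.show()
```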
ucb ¶
ucb(
logits: Tensor,
best_f: float,
rest_prob: float = 1 - 0.682 / 2,
*,
maximize: bool = True
) -> Tensor
UCB utility. rest_prob is the amount of utility above (below) the confidence interval that is ignored.
Higher rest_prob is equivalent to lower beta in the standard GP-UCB formulation.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Logits, as returned by the Transformer. | required |
rest_prob | float | The amount of utility above (below) the confidence interval that is ignored; a higher rest_prob corresponds to a lower beta in the standard GP-UCB formulation. | 1 - 0.682 / 2 |
best_f | float | Unused. | required |
maximize | bool | Whether to maximize. | True |
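A sketch of candidate selection with the UCB utility, under the same assumed constructor; best_f must be passed but is documented as unused.

```python
import torch
from pfns.bar_distribution import BarDistribution, get_bucket_limits  # import path assumed

dist = BarDistribution(get_bucket_limits(num_outputs=50, full_range=(0.0, 1.0)))  # constructor assumed
logits = torch.randn(16, 50)   # predictions for 16 candidate points

ucb_scores = dist.ucb(logits, best_f=0.0, maximize=True)   # best_f is unused per the table above
next_candidate = int(ucb_scores.argmax())
```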
FullSupportBarDistribution ¶
Bases: BarDistribution
average_bar_distributions_into_this ¶
average_bar_distributions_into_this(
list_of_bar_distributions: Sequence[BarDistribution],
list_of_logits: Sequence[Tensor],
*,
average_logits: bool = False
) -> Tensor
Averages the predictions of several BarDistributions into this distribution's binning and returns the resulting logits.

Parameters:

Name | Type | Description | Default |
---|---|---|---|
list_of_bar_distributions | Sequence[BarDistribution] | The distributions whose predictions are combined. | required |
list_of_logits | Sequence[Tensor] | One logits tensor per distribution in list_of_bar_distributions. | required |
average_logits | bool | If True, average the logits directly instead of the probabilities. | False |

Returns:

Type | Description |
---|---|
Tensor | The averaged logits, expressed over this distribution's borders. |
cdf ¶
Calculates the cdf of the distribution described by the logits. The cdf is scaled by the width of the bars.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Tensor of shape (batch_size, ..., num_bars) with the logits describing the distribution. | required |
ys | Tensor | Tensor of shape (batch_size, ..., n_ys_to_eval) or (n_ys_to_eval) with the targets. | required |
cdf_temporary ¶
Cumulative distribution function.
TODO: this overlaps with cdf above and should be merged with it; at the moment it is still used.
ei_for_halfnormal ¶
EI for a normal distribution with mean 0 and the given scale, multiplied by two, which equals the EI of the corresponding half-normal distribution. This was verified against a Monte-Carlo approximation:

ei_for_halfnormal = lambda scale, best_f: (torch.distributions.HalfNormal(torch.tensor(scale)).sample((10_000_000,)) - best_f).clamp(min=0.).mean()
print([(ei_for_halfnormal(scale, best_f), FullSupportBarDistribution().ei_for_halfnormal(scale, best_f))
    for scale in [0.1, 1., 10.] for best_f in [.1, 10., 4.]])
forward ¶
Returns the negative log density of y under the predicted distribution (the loss).

Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Tensor of shape T x B x self.num_bars. | required |
y | Tensor | Tensor of shape T x B with the targets. | required |
mean_prediction_logits | Tensor or None | Optional logits for an additional mean prediction. | None |
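Since forward is the training loss, a minimal training-step sketch could look as follows. The FullSupportBarDistribution(borders) constructor, the shapes, and using the module directly as a criterion are assumptions for illustration.

```python
import torch
from pfns.bar_distribution import FullSupportBarDistribution, get_bucket_limits  # import path assumed

T, B, num_bars = 10, 32, 100                      # sequence length, batch size, number of bars
criterion = FullSupportBarDistribution(           # constructor signature assumed
    get_bucket_limits(num_outputs=num_bars, full_range=(-5.0, 5.0))
)

logits = torch.randn(T, B, num_bars, requires_grad=True)   # stand-in for a model's output
y = torch.randn(T, B)                                      # regression targets

loss = criterion(logits, y).mean()   # negative log density per target, averaged
loss.backward()
```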
get_probs_for_different_borders ¶
The logits describe the density of the distribution over the current self.borders.
This function returns the logits as they would be if self.borders were changed to new_borders. This is useful for averaging the logits of different models.
icdf ¶
Implementation of the quantile function (inverse CDF).

Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Tensor of any shape, with the last dimension being the logits. | required |
left_prob | float | The probability mass to the left of the result. | required |

Returns:

Type | Description |
---|---|
Tensor | The position with left_prob probability weight to its left. |
mean_of_square ¶
Computes E[x^2].
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Output of the model. | required |
pi ¶
Acquisition Function: Probability of Improvement.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | As returned by the Transformer (evaluation_points x batch x feature_dim). | required |
best_f | Tensor or float | Best evaluation so far (the incumbent). | required |
maximize | bool | Whether to maximize. | True |
plot ¶
plot(
logits: Tensor,
ax: Axes | None = None,
zoom_to_quantile: float | None = None,
**kwargs: Any
) -> Axes
Plots the distribution.
sample ¶
Samples values from the distribution, using sampling temperature t.
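A sampling sketch; the keyword name t for the temperature is an assumption based on the description above, as is the constructor.

```python
import torch
from pfns.bar_distribution import FullSupportBarDistribution, get_bucket_limits  # import path assumed

dist = FullSupportBarDistribution(                # constructor signature assumed
    get_bucket_limits(num_outputs=100, full_range=(-5.0, 5.0))
)
logits = torch.randn(4, 100)

samples = dist.sample(logits)         # one draw per predictive distribution (assumed)
sharper = dist.sample(logits, t=0.5)  # 't' assumed to be the temperature argument
```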
ucb ¶
ucb(
logits: Tensor,
best_f: float,
rest_prob: float = 1 - 0.682 / 2,
*,
maximize: bool = True
) -> Tensor
UCB utility. rest_prob is the amount of utility above (below) the confidence interval that is ignored.
Higher rest_prob is equivalent to lower beta in the standard GP-UCB formulation.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
logits | Tensor | Logits, as returned by the Transformer. | required |
rest_prob | float | The amount of utility above (below) the confidence interval that is ignored; a higher rest_prob corresponds to a lower beta in the standard GP-UCB formulation. | 1 - 0.682 / 2 |
best_f | float | Unused. | required |
maximize | bool | Whether to maximize. | True |
get_bucket_limits ¶
get_bucket_limits(
num_outputs: int,
full_range: tuple | None = None,
ys: Tensor | None = None,
*,
verbose: bool = False,
widen_bucket_limits_factor: float | None = None
) -> Tensor
Decides on a set of bucket limits based on a distribution of ys.
Parameters:

Name | Type | Description | Default |
---|---|---|---|
num_outputs | int | This is only tested for num_outputs=1, but should work for larger num_outputs as well. | required |
full_range | tuple or None | If ys is not passed, this is the range of the ys that should be used to estimate the bucket limits. | None |
ys | Tensor or None | If passed, these are the ys used to estimate the bucket limits. Do not pass full_range in this case. | None |
verbose | bool | Unused. | False |
widen_bucket_limits_factor | float or None | If set, the bucket limits are widened by this factor, allowing a slightly larger range than the actual data. | None |
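Putting the pieces together, a typical setup derives the borders from observed targets and builds the distribution head from them. The package prefix, the FullSupportBarDistribution(borders) constructor, and the widening factor chosen below are assumptions.

```python
import torch
from pfns.bar_distribution import FullSupportBarDistribution, get_bucket_limits  # import path assumed

train_ys = torch.randn(10_000)          # stand-in for the observed regression targets

borders = get_bucket_limits(
    num_outputs=100,
    ys=train_ys,                        # estimate the limits from data; do not also pass full_range
    widen_bucket_limits_factor=1.1,     # widen the limits slightly beyond the observed data
)
criterion = FullSupportBarDistribution(borders)   # constructor signature assumed

logits = torch.randn(8, 100)            # stand-in model output
medians = criterion.icdf(logits, 0.5)   # e.g. predictive medians
```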