asv_runner.statistics#
Module Contents#
Classes#
LaplacePosterior | Class to represent univariate Laplace posterior distribution.
Functions#
get_err | Computes an ‘error measure’ based on the interquartile range of the measurement results.
binom_pmf | Computes the Probability Mass Function (PMF) for a binomial distribution.
quantile | Computes a quantile/percentile from a dataset.
quantile_ci | Compute a quantile and a confidence interval for a given dataset.
compute_stats | Performs statistical analysis on the provided samples.
API#
- asv_runner.statistics.get_err(result, stats)#
Computes an ‘error measure’ based on the interquartile range of the measurement results.
Parameters
- result (any) The measurement results. Currently unused.
- stats (dict) A dictionary of statistics computed from the measurement results. It should contain the keys “q_25” and “q_75” representing the 25th and 75th percentiles respectively.
Returns
- error (float) The error measure, defined as half the interquartile range (i.e., (Q3 - Q1) / 2).
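The error measure described above can be sketched in a few lines. This is a minimal illustration of the half-interquartile-range formula, not the library's code; the name get_err_sketch and the sample values are hypothetical.

```python
def get_err_sketch(result, stats):
    """Half the interquartile range, (Q3 - Q1) / 2.

    `result` is accepted only to mirror the documented signature
    and is ignored, as the documentation says it is unused.
    """
    q1, q3 = stats["q_25"], stats["q_75"]
    return (q3 - q1) / 2


# Hypothetical statistics dictionary with the two required keys.
example_stats = {"q_25": 1.0, "q_75": 1.5}
example_err = get_err_sketch(None, example_stats)
print(example_err)  # 0.25
```

Because only the two quartiles are read, any dictionary carrying “q_25” and “q_75” works here, including the one returned by compute_stats.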
- asv_runner.statistics.binom_pmf(n, k, p)#
Computes the Probability Mass Function (PMF) for a binomial distribution.
Parameters
- n (int) The number of trials in the binomial distribution.
- k (int) The number of successful trials.
- p (float) The probability of success on each trial.
Returns
- pmf (float) The binomial PMF computed as (n choose k) * p**k * (1 - p)**(n - k).
Notes
Handles edge cases where p equals 0 or 1.
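As a rough illustration of the formula and the edge cases mentioned in the notes, the following sketch evaluates the PMF directly with math.comb. The name binom_pmf_sketch is hypothetical, the guard on k is an added assumption, and the library may compute the value differently (for example in log space).

```python
import math


def binom_pmf_sketch(n, k, p):
    """Binomial PMF: (n choose k) * p**k * (1 - p)**(n - k)."""
    if not 0 <= k <= n:
        # Assumption of this sketch: outcomes outside 0..n have zero mass.
        return 0.0
    if p == 0:
        # All trials fail with certainty.
        return 1.0 if k == 0 else 0.0
    if p == 1:
        # All trials succeed with certainty.
        return 1.0 if k == n else 0.0
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


print(binom_pmf_sketch(4, 2, 0.5))  # 0.375
```

The explicit branches for p = 0 and p = 1 mirror the documented note; they matter most for implementations that take logarithms of p or 1 - p.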
- asv_runner.statistics.quantile(x, q)#
Computes a quantile/percentile from a dataset.
Parameters
- x (list of float) The dataset for which the quantile is to be computed.
- q (float) The quantile to compute. Must be in the range [0, 1].
Returns
- m (float) The computed quantile from the dataset.
Raises
- ValueError If the provided quantile q is not in the range [0, 1].
Notes
This function sorts the input data and calculates the quantile using linear interpolation if the desired quantile lies between two data points.
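The sort-then-interpolate approach described in the notes might look like the sketch below. It assumes the common convention of placing the quantile at fractional index q * (n - 1) in the sorted data; whether the library uses exactly this convention is an assumption, and quantile_sketch is a hypothetical name.

```python
import math


def quantile_sketch(x, q):
    """Quantile of a non-empty dataset by linear interpolation."""
    if not 0 <= q <= 1:
        raise ValueError("q must be in the range [0, 1]")
    y = sorted(x)
    pos = q * (len(y) - 1)          # fractional index into the sorted data
    lo = math.floor(pos)
    hi = min(lo + 1, len(y) - 1)    # clamp so q = 1 returns the maximum
    frac = pos - lo
    return (1 - frac) * y[lo] + frac * y[hi]


print(quantile_sketch([4.0, 1.0, 3.0, 2.0], 0.5))  # 2.5
```

With four points the median falls halfway between the second and third sorted values, which is why the example prints 2.5.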
- asv_runner.statistics.quantile_ci(x, q, alpha_min=0.01)#
Compute a quantile and a confidence interval for a given dataset.
Parameters
- x (list of float) The dataset from which the quantile and confidence interval are computed.
- q (float) The quantile to compute. Must be in the range [0, 1].
- alpha_min (float, optional) Limit for coverage. The result has coverage >= 1 - alpha_min. Defaults to 0.01.
Returns
- m (float) The computed quantile from the dataset.
- ci (tuple of float) Confidence interval (a, b), of coverage >= 1 - alpha_min.
Notes
This function assumes independence but is otherwise nonparametric. It sorts the input data and calculates the quantile using linear interpolation if the desired quantile lies between two data points. The confidence interval is computed using a known property of the cumulative distribution function (CDF) of a binomial distribution. This method calculates the smallest range (y[r-1], y[s-1]) for which the coverage is at least 1 - alpha_min.
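The binomial property referred to in the notes is that, for n independent samples from a continuous distribution, the number of samples below the true q-quantile follows a Binomial(n, q) distribution. The sketch below uses that fact to compute the coverage of an interval bounded by two order statistics. It illustrates the principle only and does not reproduce the library's search for r and s; order_statistic_coverage is a hypothetical name.

```python
import math


def order_statistic_coverage(n, q, r, s):
    """Probability that the true q-quantile lies between the r-th and
    s-th smallest of n i.i.d. samples (1-based ranks, r < s).

    This equals the sum of Binomial(n, q) probabilities for k = r .. s - 1.
    """
    return sum(
        math.comb(n, k) * q**k * (1 - q) ** (n - k) for k in range(r, s)
    )


# Coverage of the widest finite interval (minimum, maximum) for the median.
for n in (5, 7, 8):
    print(n, order_statistic_coverage(n, 0.5, 1, n))
# 5 0.9375
# 7 0.984375
# 8 0.9921875
```

Under these assumptions, no pair of order statistics reaches 99% coverage for the median with fewer than 8 samples, so a purely nonparametric interval is not always available for small datasets.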
- class asv_runner.statistics.LaplacePosterior(y, nu=None)#
Class to represent univariate Laplace posterior distribution.
Description
This class represents the univariate posterior distribution defined as
p(beta|y) = N [sum(|y_j - beta|)]**(-nu-1)
where N is the normalization factor.
Parameters
- y (list of float) A list of sample values from the distribution.
- nu (float, optional) Degrees of freedom. Default is len(y) - 1.
Attributes
- mle (float) The maximum likelihood estimate for beta, which is the median of y.
Notes
This is the posterior distribution in the Bayesian model assuming Laplace distributed noise, where p(y|beta,sigma) = N exp(- sum_j (1/sigma) |y_j - beta|), p(sigma) ~ 1/sigma, and nu = len(y) - 1. The MLE for beta is median(y). Applying the same approach to a Gaussian model results in p(beta|y) = N T(t, m-1), t = (beta - mean(y)) / (sstd(y) / sqrt(m)), where m = len(y) and T(t, nu) is the Student t-distribution pdf, which gives the standard textbook formulas for the mean.
Initialization
Initializes an instance of the LaplacePosterior class.
Parameters
- y (list of float): The samples from the distribution.
- nu (float, optional): The degrees of freedom. Default is len(y) - 1.
Raises
ValueError: If y is an empty list.
Notes
This constructor sorts the input data y and calculates the MLE (Maximum Likelihood Estimate). It computes a scale factor, _y_scale, to prevent overflows when computing unnormalized CDF integrals. The input data y is then shifted and scaled according to this computed scale. The method also initializes a memoization dictionary _cdf_memo for the unnormalized CDF, and a placeholder _cdf_norm for the normalization constant of the CDF.
- _cdf_unnorm(beta)#
Computes the unnormalized cumulative distribution function (CDF).
Parameters
- beta (float): The upper limit of the integration for the CDF.
Returns
The unnormalized CDF evaluated at beta.
Notes
The method computes the unnormalized CDF as:
cdf_unnorm(b) = int_{-oo}^{b} 1/(sum_j |y_j - b'|)**(nu+1) db'
The method integrates piecewise, resolving the absolute values separately for each section. The results of these calculations are memoized to speed up subsequent computations.
It also handles special cases, such as when beta is not a number (returns beta as is), or when beta is positive infinity (memoizes the integral value at the end of the list y).
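To make the piecewise integration concrete: between two adjacent sorted data points the sum of absolute differences is linear in the integration variable, so each piece of the integral has a closed form. The sketch below follows that idea with the integrand taken as the unnormalized posterior density from the class description. It is an illustration only and omits the rescaling, memoization, and NaN handling described above. The name cdf_unnorm_sketch is hypothetical, and the sketch assumes nu > 0 and at least two distinct values in y.

```python
import math


def cdf_unnorm_sketch(y, nu, beta):
    """Integrate (sum_j |y_j - x|)**(-nu - 1) over x from -inf to beta."""
    y = sorted(y)
    m = len(y)
    total = 0.0
    # Piece k spans (y[k-1], y[k]); exactly k data points lie to its left.
    for k in range(m + 1):
        a = -math.inf if k == 0 else y[k - 1]
        b = math.inf if k == m else y[k]
        if a >= beta:
            break
        b = min(b, beta)
        # On this piece, sum_j |y_j - x| == c * x + s.
        c = 2 * k - m
        s = sum(y[k:]) - sum(y[:k])
        if c == 0:
            # The sum is constant here, so the integrand is constant too.
            total += (b - a) / s ** (nu + 1)
        else:
            # Antiderivative of (c*x + s)**(-nu - 1) is -(c*x + s)**(-nu) / (nu*c).
            at_a = 0.0 if a == -math.inf else (c * a + s) ** (-nu)
            at_b = 0.0 if b == math.inf else (c * b + s) ** (-nu)
            total += (at_a - at_b) / (nu * c)
    return total


data = [1.0, 2.0, 4.0]
norm = cdf_unnorm_sketch(data, 2, math.inf)
print(cdf_unnorm_sketch(data, 2, 2.0) / norm)  # normalized CDF at the median
```

Dividing by the value at positive infinity, as in the last line, turns the unnormalized integral into a proper CDF; this mirrors the role the documentation gives to _cdf_norm.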
- _ppf_unnorm(cdfx)#
Computes the inverse function of _cdf_unnorm.
Parameters
- cdfx (float): The value for which to compute the inverse cumulative distribution function (CDF).
Returns
The unnormalized quantile function evaluated at cdfx.
Notes
This method computes the inverse of _cdf_unnorm. It first finds the interval within which cdfx lies, then performs the inversion on this interval.
Special cases are handled when the interval index k is 0 (the computation of beta involves a check for negative infinity), or when the calculated c is 0. The result beta is clipped at the upper bound of the interval, ensuring it does not exceed self.y[k].
- pdf(beta)#
Computes the probability density function (PDF).
Parameters
- beta (float) The point at which to evaluate the PDF.
Returns
A float which is the probability density function evaluated at beta.
Notes
This function computes the PDF by exponentiating the result of self.logpdf(beta). The logpdf method should therefore be implemented in the class that uses this method.
- logpdf(beta)#
Computes the logarithm of the probability density function (log-PDF).
Parameters
- beta (float) The point at which to evaluate the log-PDF.
Returns
A float which is the logarithm of the probability density function evaluated at beta.
Notes
This function computes the log-PDF by first checking if the scale of the distribution _y_scale is zero. If so, it returns math.inf if beta equals the maximum likelihood estimate mle, otherwise it returns -math.inf.
The beta value is then transformed by subtracting the maximum likelihood estimate mle and dividing by _y_scale.
If the normalization constant _cdf_norm has not been computed yet, it is computed by calling _cdf_unnorm(math.inf).
The function then computes the sum of absolute differences between beta and all points in y, applies the log-PDF formula and returns the result.
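The core of the log-PDF is the logarithm of the posterior density given in the class description, p(beta|y) = N [sum(|y_j - beta|)]**(-nu-1). The sketch below evaluates only the unnormalized part and leaves out the normalization constant, the rescaling by _y_scale, and the zero-scale special case. The name log_posterior_unnorm is hypothetical, and beta must not make the sum of absolute differences zero.

```python
import math
import statistics


def log_posterior_unnorm(y, beta, nu=None):
    """log of [sum_j |y_j - beta|]**(-nu - 1), without the constant log N."""
    if nu is None:
        nu = len(y) - 1  # documented default for the degrees of freedom
    abs_sum = sum(abs(v - beta) for v in y)
    return -(nu + 1) * math.log(abs_sum)


samples = [1.0, 2.0, 4.0, 8.0, 9.0]
mle = statistics.median(samples)
# The density peaks at the median, matching the documented MLE.
print(log_posterior_unnorm(samples, mle) > log_posterior_unnorm(samples, mle - 0.1))
print(log_posterior_unnorm(samples, mle) > log_posterior_unnorm(samples, mle + 0.1))
```

Both comparisons print True because the sum of absolute differences is minimized at the median, which is why the documentation identifies mle with median(y).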
- cdf(beta)#
Computes the cumulative distribution function (CDF).
Parameters
- beta (float) The point at which to evaluate the CDF.
Returns
A float which is the value of the cumulative distribution function evaluated at beta.
Notes
This function computes the CDF by first checking if the scale of the distribution _y_scale is zero. If so, it returns 1 if beta is greater than the maximum likelihood estimate mle, and 0 otherwise.
The beta value is then transformed by subtracting the maximum likelihood estimate mle and dividing by _y_scale.
If the normalization constant _cdf_norm has not been computed yet, it is computed by calling _cdf_unnorm(math.inf).
The function then computes the unnormalized CDF at beta and normalizes it by dividing by _cdf_norm.
- ppf(cdf)#
Computes the percent point function (PPF), also known as the inverse cumulative distribution function.
Parameters
- cdf (float) The cumulative probability for which to compute the inverse CDF. It must be between 0 and 1 (inclusive).
Returns
A float which is the value of the percent point function evaluated at cdf.
Notes
This function first checks whether cdf lies between 0 and 1. If it does not, it returns math.nan.
If the scale of the distribution _y_scale is zero, it returns the maximum likelihood estimate mle.
If the normalization constant _cdf_norm has not been computed yet, it is computed by calling _cdf_unnorm(math.inf).
The function then scales cdf by _cdf_norm (making sure it does not exceed _cdf_norm), computes the unnormalized PPF at this scaled value, and transforms it back to the original scale.
- asv_runner.statistics.compute_stats(samples, number)#
Performs statistical analysis on the provided samples.
Parameters
- samples (list of float) A list of total times (in seconds) of benchmarks.
- number (int) The number of times each benchmark was repeated.
Returns
- beta_hat (float) The estimated time per iteration.
- stats (dict) A dictionary containing various statistical measures of the estimator. It includes:
“ci_99_a”: The lower bound of the 99% confidence interval.
“ci_99_b”: The upper bound of the 99% confidence interval.
“q_25”: The 25th percentile of the sample times.
“q_75”: The 75th percentile of the sample times.
“repeat”: The total number of samples.
“number”: The repeat number for each sample.
Notes
This function first checks if there are any samples. If there are none, it returns None, None.
It then calculates the median and the 25th and 75th percentiles of the samples. If the nonparametric confidence interval estimation did not provide an estimate, it computes the posterior distribution for the location, assuming Laplace (double exponential) noise. The Maximum Likelihood Estimate (MLE) is equal to the median. The function uses the confidence interval from that distribution to extend beyond the sample bounds if necessary.
Finally, it returns the median as the result, together with a dictionary of the computed statistics.