Probability Distributions
See also
Distribution - Abstract Base Class for Probability Distributions
-
class
qinfer.
Distribution
[source]¶ Bases:
object
Abstract base class for probability distributions on one or more random variables.
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
Specific Distributions
-
class
qinfer.
UniformDistribution
(ranges=array([[0, 1]]))[source]¶ Bases:
qinfer.distributions.Distribution
Uniform distribution on a given rectangular region.
Parameters: ranges (numpy.ndarray) – Array of shape (n_rvs, 2), where n_rvs is the number of random variables, specifying the lower and upper limits for each variable.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
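The (n, d) shape convention can be illustrated without QInfer itself. The following NumPy-only sketch assumes that column 0 of ranges holds the lower limits and column 1 the upper limits (as the default [[0, 1]] suggests); the helper name sample_uniform is hypothetical and is not part of the QInfer API.

```python
import numpy as np

def sample_uniform(ranges, n=1, rng=None):
    """Draw n points uniformly from the box described by ranges (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    ranges = np.asarray(ranges, dtype=float)
    lower = ranges[:, 0]
    width = ranges[:, 1] - ranges[:, 0]
    # rng.random yields values in [0, 1); rescale each column to its own interval.
    return lower + width * rng.random((n, ranges.shape[0]))

samples = sample_uniform([[0, 1], [-2, 2]], n=5, rng=np.random.default_rng(0))
print(samples.shape)  # one row per sample, one column per random variable
```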
-
-
class
qinfer.
DiscreteUniformDistribution
(num_bits)[source]¶ Bases:
qinfer.distributions.Distribution
Discrete uniform distribution over the integers between 0 and 2**num_bits-1 inclusive.
Parameters: num_bits (int) – Non-negative integer specifying how big to make the interval.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
-
class
qinfer.
MVUniformDistribution
(dim=6)[source]¶ Bases:
qinfer.distributions.Distribution
Uniform distribution over the rectangle \([0,1]^{\text{dim}}\) with the restriction that the components of each sampled vector must sum to 1. Equivalently, a uniform distribution over the (dim-1)-simplex whose vertices are the canonical unit vectors of \(\mathbb{R}^{\text{dim}}\).
Parameters: dim (int) – Number of dimensions; n_rvs.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
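One standard way to draw uniformly from the simplex described above is to sample a Dirichlet distribution with all concentration parameters equal to 1. The sketch below uses NumPy only and is an illustration of that equivalence, not necessarily how QInfer implements this class.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 6

# Dirichlet(1, ..., 1) is the uniform distribution on the (dim - 1)-simplex.
simplex_samples = rng.dirichlet(np.ones(dim), size=4)

# Every row is non-negative and sums to 1 (up to floating-point error).
row_sums = simplex_samples.sum(axis=1)
```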
-
-
class
qinfer.
NormalDistribution
(mean, var, trunc=None)[source]¶ Bases:
qinfer.distributions.Distribution
Normal or truncated normal distribution over a single random variable.
Parameters: -
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
-
class
qinfer.
MultivariateNormalDistribution
(mean, cov)[source]¶ Bases:
qinfer.distributions.Distribution
Multivariate (vector-valued) normal distribution.
Parameters:
- mean (np.ndarray) – Array of shape (n_rvs,) representing the mean of the distribution.
- cov (np.ndarray) – Array of shape (n_rvs, n_rvs) representing the covariance matrix of the distribution.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
class
qinfer.
SlantedNormalDistribution
(ranges=array([[0, 1]]), weight=0.01)[source]¶ Bases:
qinfer.distributions.Distribution
Uniform distribution on a given rectangular region with additive noise. Random variates from this distribution follow \(X+Y\) where \(X\) is drawn uniformly with respect to the rectangular region defined by ranges, and \(Y\) is normally distributed about 0 with variance weight**2.
Parameters:
- ranges (numpy.ndarray) – Array of shape (n_rvs, 2), where n_rvs is the number of random variables, specifying the lower and upper limits for each variable.
- weight (float) – Standard deviation of the additive noise term, so that the noise has variance weight**2.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
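The \(X+Y\) construction for the slanted normal distribution can be sketched in plain NumPy. This is an illustrative sketch under the assumption that weight acts as the standard deviation of the noise (variance weight**2, as stated in the description); the helper name sample_slanted is hypothetical.

```python
import numpy as np

def sample_slanted(ranges, weight=0.01, n=1, rng=None):
    """Uniform variates on a box plus zero-mean Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng() if rng is None else rng
    ranges = np.asarray(ranges, dtype=float)
    n_rvs = ranges.shape[0]
    # X: uniform on the rectangular region defined by ranges.
    x = ranges[:, 0] + (ranges[:, 1] - ranges[:, 0]) * rng.random((n, n_rvs))
    # Y: normal about 0 with standard deviation weight, i.e. variance weight**2.
    y = weight * rng.standard_normal((n, n_rvs))
    return x + y

noisy = sample_slanted([[0, 1], [2, 3]], weight=0.01, n=8, rng=np.random.default_rng(1))
noiseless = sample_slanted([[0, 1], [2, 3]], weight=0.0, n=8, rng=np.random.default_rng(1))
```

With weight set to 0 the noise term vanishes and the samples stay inside the box, which gives a quick sanity check of the construction.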
-
class
qinfer.
LogNormalDistribution
(mu=0, sigma=1)[source]¶ Bases:
qinfer.distributions.Distribution
Log-normal distribution.
Parameters: - mu – Location parameter (numeric), set to 0 by default.
- sigma – Scale parameter (numeric), set to 1 by default. Must be strictly greater than zero.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
class
qinfer.
ConstantDistribution
(values)[source]¶ Bases:
qinfer.distributions.Distribution
Represents a deterministic variable; useful for combining with other distributions, marginalizing, etc.
Parameters: values – Shape (n,) array or list of values \(X_0\) such that \(\Pr(X) = \delta(X - X_0)\).
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
-
class
qinfer.
BetaDistribution
(alpha=None, beta=None, mean=None, var=None)[source]¶ Bases:
qinfer.distributions.Distribution
The beta distribution, whose pdf at \(x\) is proportional to \(x^{\alpha-1}(1-x)^{\beta-1}\). Note that either alpha and beta, or mean and var, must be specified as inputs; either case uniquely determines the distribution.
Parameters:
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
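The claim that mean and var uniquely determine a beta distribution follows from the standard moment relations: with \(\nu = \text{mean}(1-\text{mean})/\text{var} - 1\), one has \(\alpha = \text{mean}\cdot\nu\) and \(\beta = (1-\text{mean})\cdot\nu\). The sketch below works through that conversion; the function name is hypothetical and this is not claimed to be QInfer's own code.

```python
def beta_params_from_moments(mean, var):
    """Convert (mean, var) to the (alpha, beta) shape parameters of a beta distribution."""
    if not 0 < mean < 1:
        raise ValueError("mean must lie strictly between 0 and 1")
    if not 0 < var < mean * (1 - mean):
        raise ValueError("var must lie strictly between 0 and mean * (1 - mean)")
    nu = mean * (1 - mean) / var - 1  # nu equals alpha + beta
    return mean * nu, (1 - mean) * nu

# Beta(1, 3) has mean 1/4 and variance 3/80 = 0.0375, so this recovers (1, 3).
alpha, beta = beta_params_from_moments(0.25, 0.0375)
```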
-
-
class
qinfer.
BetaBinomialDistribution
(n, alpha=None, beta=None, mean=None, var=None)[source]¶ Bases:
qinfer.distributions.Distribution
The beta-binomial distribution, whose pmf at the non-negative integer \(k\) is equal to \(\binom{n}{k}\frac{B(k+\alpha,n-k+\beta)}{B(\alpha,\beta)}\) with \(B(\cdot,\cdot)\) the beta function. This is the compound distribution whose variates are binomially distributed with a bias chosen from a beta distribution. Note that either alpha and beta, or mean and var, must be specified as inputs; either case uniquely determines the distribution.
Parameters:
- n (int) – The \(n\) parameter of the beta-binomial distribution.
- alpha (float) – The alpha shape parameter of the beta-binomial distribution.
- beta (float) – The beta shape parameter of the beta-binomial distribution.
- mean (float) – The desired mean value of the beta-binomial distribution.
- var (float) – The desired variance of the beta-binomial distribution.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
class
qinfer.
DirichletDistribution
(alpha)[source]¶ Bases:
qinfer.distributions.Distribution
The Dirichlet distribution, whose pdf at \(x\) is proportional to \(\prod_i x_i^{\alpha_i-1}\).
Parameters: alpha – The list of concentration parameters.
-
alpha
¶
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
-
class
qinfer.
GammaDistribution
(alpha=None, beta=None, mean=None, var=None)[source]¶ Bases:
qinfer.distributions.Distribution
The gamma distribution, whose pdf at \(x\) is proportional to \(x^{\alpha-1}e^{-x\beta}\). Note that either alpha and beta, or mean and var, must be specified as inputs; either case uniquely determines the distribution.
Parameters: -
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
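For a gamma distribution with shape alpha and rate beta, the mean is alpha/beta and the variance is alpha/beta**2, so the two parameterizations can be converted into one another. The sketch below assumes this shape/rate convention (consistent with the \(e^{-x\beta}\) factor in the pdf); the function name is hypothetical.

```python
def gamma_params_from_moments(mean, var):
    """Convert (mean, var) to (alpha, beta), assuming shape alpha and rate beta."""
    if mean <= 0 or var <= 0:
        raise ValueError("mean and var must both be positive")
    alpha = mean ** 2 / var  # from mean = alpha / beta and var = alpha / beta**2
    beta = mean / var
    return alpha, beta

# mean 2 and variance 1 correspond to alpha = 4, beta = 2.
g_alpha, g_beta = gamma_params_from_moments(2.0, 1.0)
```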
-
-
class
qinfer.
InterpolatedUnivariateDistribution
(pdf, compactification_scale=1, n_interp_points=1500)[source]¶ Bases:
qinfer.distributions.Distribution
Samples from a single-variable distribution specified by its PDF. The samples are drawn by first drawing uniform samples over the interval [0, 1], and then using an interpolation of the inverse-CDF corresponding to the given PDF to transform these samples into the desired distribution.
Parameters:
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
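The inverse-CDF interpolation idea can be sketched as follows. This is a simplified illustration that assumes the PDF is supported on a finite interval [lo, hi] and is positive there; it does not reproduce the compactification_scale handling of the actual class, and the helper name sample_from_pdf is hypothetical.

```python
import numpy as np

def sample_from_pdf(pdf, lo, hi, n=1, n_interp_points=1500, rng=None):
    """Inverse-CDF sampling from an (unnormalized) pdf on [lo, hi] via interpolation."""
    rng = np.random.default_rng() if rng is None else rng
    xs = np.linspace(lo, hi, n_interp_points)
    p = pdf(xs)
    # Trapezoid-rule CDF on the grid, normalized so that it ends at 1.
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (p[1:] + p[:-1]) * np.diff(xs))))
    cdf /= cdf[-1]
    u = rng.random(n)  # uniform samples on [0, 1)
    # Interpolating from cdf values back to xs evaluates the approximate inverse CDF.
    return np.interp(u, cdf, xs)[:, np.newaxis]

# pdf(x) = 2x on [0, 1] has mean 2/3.
draws = sample_from_pdf(lambda x: 2 * x, 0.0, 1.0, n=1000, rng=np.random.default_rng(0))
```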
-
-
class
qinfer.
HilbertSchmidtUniform
(dim=2)[source]¶ Bases:
qinfer.distributions.SingleSampleMixin, qinfer.distributions.Distribution
Creates a new Hilbert-Schmidt uniform prior on state space of dimension dim. See e.g. [Mez06] and [Mis12].
Parameters: dim (int) – Dimension of the state space.
-
n_rvs
¶
-
sample
()[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
-
-
class
qinfer.
HaarUniform
(dim=2)[source]¶ Bases:
qinfer.distributions.SingleSampleMixin, qinfer.distributions.Distribution
Haar uniform distribution of pure states of dimension dim, parameterized as coefficients of the Pauli basis.
Parameters: dim (int) – Dimension of the state space.
Note
This distribution presently only works for dim==2 and the Pauli basis.
-
n_rvs
¶
-
-
class
qinfer.
GinibreUniform
(dim=2, k=2)[source]¶ Bases:
qinfer.distributions.SingleSampleMixin, qinfer.distributions.Distribution
Creates a prior on state space of dimension dim according to the Ginibre ensemble with parameter k. See e.g. [Mis12].
Parameters: dim (int) – Dimension of the state space.
-
n_rvs
¶
-
-
class
qinfer.
ParticleDistribution
(n_mps=None, particle_locations=None, particle_weights=None)[source]¶ Bases:
qinfer.distributions.Distribution
A distribution consisting of a list of weighted vectors. Note that either n_mps or both (particle_locations, particle_weights) must be specified, or an error will be raised.
Parameters:
- particle_weights (numpy.ndarray) – Length n_particles list of particle weights.
- particle_locations – Shape (n_particles, n_mps) array of particle locations.
- n_mps (int) – Dimension of parameter space. This parameter should only be set when particle_weights and particle_locations are not set (and vice versa).
-
n_ess
¶ Returns the effective sample size (ESS) of the current particle distribution.
Type: float
Returns: The effective sample size, given by \(1/\sum_i w_i^2\).
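As a quick illustration of the formula \(1/\sum_i w_i^2\), the NumPy sketch below evaluates it for two extreme cases. It assumes normalized weights; the function name is hypothetical.

```python
import numpy as np

def effective_sample_size(weights):
    """Effective sample size 1 / sum_i w_i**2 for normalized weights."""
    w = np.asarray(weights, dtype=float)
    return 1.0 / np.sum(w ** 2)

# Uniform weights over 100 particles give an ESS of about 100 ...
ess_uniform = effective_sample_size(np.full(100, 0.01))
# ... while a single particle carrying all of the weight gives an ESS of 1.
ess_degenerate = effective_sample_size([1.0, 0.0, 0.0])
```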
-
sample
(n=1)[source]¶ Returns random samples from the current particle distribution according to particle weights.
Parameters: n (int) – The number of samples to draw.
Returns: The sampled model parameter vectors.
Return type: ndarray of shape (n, updater.n_rvs).
-
static
particle_mean
(weights, locations)[source]¶ Returns the arithmetic mean of the locations weighted by weights.
Parameters:
- weights (numpy.ndarray) – Weights of each particle in array of shape (n_particles,).
- locations (numpy.ndarray) – Locations of each particle in array of shape (n_particles, n_modelparams).
Return type: numpy.ndarray, shape (n_modelparams,).
Returns: An array containing the mean.
-
classmethod
particle_covariance_mtx
(weights, locations)[source]¶ Returns an estimate of the covariance of a distribution represented by a given set of SMC particles.
Parameters:
- weights – An array of shape (n_particles,) containing the weights of each particle.
- locations – An array of shape (n_particles, n_modelparams) containing the locations of each particle.
Return type: numpy.ndarray, shape (n_modelparams, n_modelparams).
Returns: An array containing the estimated covariance matrix.
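The weighted mean and covariance of a particle set can be written in a few lines of NumPy. The sketch below assumes the weights are normalized to sum to 1 and uses the shapes given above; the function names are hypothetical stand-ins, not the QInfer methods themselves.

```python
import numpy as np

def weighted_mean(weights, locations):
    """Weighted mean of the rows of locations; result has shape (n_modelparams,)."""
    return weights @ locations

def weighted_covariance(weights, locations):
    """Weighted covariance; result has shape (n_modelparams, n_modelparams)."""
    mu = weighted_mean(weights, locations)
    centered = locations - mu
    # sum_i w_i (x_i - mu)(x_i - mu)^T, written as a matrix product.
    return (weights[:, np.newaxis] * centered).T @ centered

weights = np.array([0.5, 0.5])
locations = np.array([[0.0, 0.0], [2.0, 4.0]])
mu = weighted_mean(weights, locations)
cov = weighted_covariance(weights, locations)
```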
-
est_mean
()[source]¶ Returns the mean value of the current particle distribution.
Return type: numpy.ndarray, shape (n_mps,).
Returns: An array containing an estimate of the mean model vector.
-
est_meanfn
(fn)[source]¶ Returns the expectation value of a given function \(f\) over the current particle distribution.
Here, \(f\) is represented by a function fn that is vectorized over particles, such that f(modelparams) has shape (n_particles, k), where n_particles = modelparams.shape[0], and where k is a positive integer.
Parameters: fn (callable) – Function implementing \(f\) in a vectorized manner. (See above.)
Return type: numpy.ndarray, shape (k,).
Returns: An array containing an estimate of the mean of \(f\).
-
est_covariance_mtx
(corr=False)[source]¶ Returns the full-rank covariance matrix of the current particle distribution.
Parameters: corr (bool) – If True, the covariance matrix is normalized by the outer product of the square roots of its diagonal entries, i.e. the correlation matrix is returned instead.
Return type: numpy.ndarray, shape (n_modelparams, n_modelparams).
Returns: An array containing the estimated covariance matrix.
-
est_entropy
()[source]¶ Estimates the entropy of the current particle distribution as \(-\sum_i w_i \log w_i\) where \(\{w_i\}\) is the set of nonzero particle weights.
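The entropy estimate \(-\sum_i w_i \log w_i\) restricted to nonzero weights can be sketched as below. The function name is hypothetical and the weights are assumed to be normalized.

```python
import numpy as np

def particle_entropy(weights):
    """Entropy -sum_i w_i log w_i, skipping zero weights to avoid log(0)."""
    w = np.asarray(weights, dtype=float)
    nonzero = w[w > 0]
    return -np.sum(nonzero * np.log(nonzero))

# Four equally weighted particles (plus one with zero weight) give log(4).
entropy = particle_entropy([0.25, 0.25, 0.25, 0.25, 0.0])
```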
-
est_kl_divergence
(other, kernel=None, delta=0.01)[source]¶ Finds the KL divergence between this and another particle distribution by using a kernel density estimator to smooth over the other distribution’s particles.
Parameters: other (SMCUpdater) –
-
est_cluster_metric
(cluster_opts=None)[source]¶ Returns an estimate of how much of the variance in the current posterior can be explained by a separation between clusters.
-
est_credible_region
(level=0.95, return_outside=False, modelparam_slice=None)[source]¶ Returns an array containing particles inside a credible region of a given level, such that the described region has probability mass no less than the desired level.
Particles in the returned region are selected by including the highest-weight particles first until the desired credibility level is reached.
Parameters:
Return type: numpy.ndarray, shape (n_credible, n_mps), where n_credible is the number of particles in the credible region and n_mps corresponds to the size of modelparam_slice. If return_outside is True, this method instead returns the tuple (inside, outside), where inside is as described above and outside has shape (n_particles-n_credible, n_mps).
Returns: An array of particles inside the estimated credible region. Or, if return_outside is True, both the particles inside and the particles outside, as a tuple.
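The selection rule described above (take the highest-weight particles first until the accumulated mass reaches the level) can be sketched in NumPy. This is an illustrative stand-in with a hypothetical function name; it ignores modelparam_slice and return_outside.

```python
import numpy as np

def credible_particles(weights, locations, level=0.95):
    """Return the highest-weight particles whose total mass is at least level."""
    weights = np.asarray(weights, dtype=float)
    locations = np.asarray(locations, dtype=float)
    order = np.argsort(weights)[::-1]  # heaviest particles first
    cumulative = np.cumsum(weights[order])
    # Position of the first particle at which the cumulative mass reaches level.
    n_credible = min(int(np.searchsorted(cumulative, level)) + 1, len(weights))
    return locations[order[:n_credible]]

inside = credible_particles(
    [0.5, 0.3, 0.15, 0.05], [[0.0], [1.0], [2.0], [3.0]], level=0.9
)
```

In this example the first two particles carry only 0.8 of the mass, so a third is needed to reach the 0.9 level.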
-
region_est_hull
(level=0.95, modelparam_slice=None)[source]¶ Estimates a credible region over models by taking the convex hull of a credible subset of particles.
Parameters:
Returns: The tuple (faces, vertices), where faces describes all the vertices of all of the faces on the exterior of the convex hull, and vertices is a list of all vertices on the exterior of the convex hull.
Return type: faces is a numpy.ndarray with shape (n_face, n_mps, n_mps) and indices (idx_face, idx_vertex, idx_mps), where n_mps corresponds to the size of modelparam_slice. vertices is a numpy.ndarray of shape (n_vertices, n_mps).
-
region_est_ellipsoid
(level=0.95, tol=0.0001, modelparam_slice=None)[source]¶ Estimates a credible region over models by finding the minimum volume enclosing ellipsoid (MVEE) of a credible subset of particles.
Parameters:
Returns: A tuple (A, c), where A is the covariance matrix of the ellipsoid and c is the center. A point \(\vec{x}\) is in the ellipsoid whenever \((\vec{x}-\vec{c})^{T}A^{-1}(\vec{x}-\vec{c})\leq 1\).
Return type: A is an np.ndarray of shape (n_mps, n_mps) and c is an np.ndarray of shape (n_mps,). n_mps corresponds to the size of modelparam_slice.
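The membership condition \((\vec{x}-\vec{c})^{T}A^{-1}(\vec{x}-\vec{c})\leq 1\) can be checked for many points at once. The sketch below assumes A and c have the shapes described above; the function name in_ellipsoid is hypothetical.

```python
import numpy as np

def in_ellipsoid(points, A, c):
    """Test whether each row x of points satisfies (x - c)^T A^{-1} (x - c) <= 1."""
    diff = np.atleast_2d(np.asarray(points, dtype=float)) - np.asarray(c, dtype=float)
    # Solve A y = (x - c) for every point instead of forming A^{-1} explicitly.
    solved = np.linalg.solve(A, diff.T).T
    quad = np.einsum('ij,ij->i', diff, solved)
    return quad <= 1.0

A = np.array([[4.0, 0.0], [0.0, 1.0]])  # semi-axes of length 2 and 1
c = np.zeros(2)
mask = in_ellipsoid([[1.0, 0.0], [0.0, 2.0]], A, c)
```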
-
in_credible_region
(points, level=0.95, modelparam_slice=None, method='hpd-hull', tol=0.0001)[source]¶ Decides whether each of the points lies within a credible region of the current distribution.
If tol is None, the points are tested directly against the convex hull object. If tol is a positive float, points are tested to be in the interior of the smallest enclosing ellipsoid of this convex hull; see SMCUpdater.region_est_ellipsoid().
Parameters:
- points (np.ndarray) – An np.ndarray of shape (n_mps,) for a single point, or of shape (n_points, n_mps) for multiple points, where n_mps corresponds to the same dimensionality as modelparam_slice.
- level (float) – The desired credibility level (see SMCUpdater.est_credible_region()).
- method (str) – A string specifying which credible region estimator to use. One of 'pce', 'hpd-hull' or 'hpd-mvee' (see below).
- tol (float) – The allowed error tolerance for those methods which require a tolerance (see mvee()).
- modelparam_slice (slice) – A slice describing which model parameters to consider in the credible region, effectively marginalizing out the remaining parameters. By default, all model parameters are included.
Returns: A boolean array of shape (n_points,) specifying whether each of the points lies inside the credible region.
The following values are valid for the method argument.
- 'pce': Posterior Covariance Ellipsoid. Computes the covariance matrix of the particle distribution marginalized over the excluded slices and uses the \(\chi^2\) distribution to determine how to rescale it such that the corresponding ellipsoid has the correct size. The ellipsoid is translated by the mean of the particle distribution. It is then determined which of the points are in the interior.
- 'hpd-hull': High Posterior Density Convex Hull. See SMCUpdater.region_est_hull(). Computes the HPD region resulting from the particle approximation, computes the convex hull of this, and determines which of the points are in the interior.
- 'hpd-mvee': High Posterior Density Minimum Volume Enclosing Ellipsoid. See SMCUpdater.region_est_ellipsoid() and mvee(). Computes the HPD region resulting from the particle approximation, computes the convex hull of this, and determines the minimum enclosing ellipsoid. Determines which of the points are in the interior.
Combining Distributions
QInfer also offers classes for combining distributions together to produce new ones.
-
class
qinfer.
ProductDistribution
(*factors)[source]¶ Bases:
qinfer.distributions.Distribution
Takes a non-zero number of QInfer distributions \(D_k\) as input and returns their Cartesian product.
In other words, the returned distribution is \(\Pr(D_1, \dots, D_N) = \prod_k \Pr(D_k)\).
Parameters: factors (Distribution) – Distribution objects representing \(D_k\). Alternatively, one iterable argument can be given, in which case the factors are the values drawn from that iterator.
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
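Because the factors of a product distribution are independent, a sample from the product can be formed by sampling each factor separately and placing the results side by side. The NumPy sketch below uses plain arrays as stand-ins for the factor samples; it illustrates the shape bookkeeping only and does not call QInfer.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Stand-ins for samples from two independent factors with 2 and 1 random variables.
factor_a = rng.random((n, 2))
factor_b = rng.standard_normal((n, 1))

# Concatenating along the column axis gives n samples of the 3-variable product.
product_samples = np.hstack([factor_a, factor_b])
```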
-
-
class
qinfer.
PostselectedDistribution
(distribution, model, maxiters=100)[source]¶ Bases:
qinfer.distributions.Distribution
Postselects a distribution based on validity within a given model.
-
n_rvs
¶
-
-
class
qinfer.
MixtureDistribution
(weights, dist, dist_args=None, dist_kw_args=None, shuffle=True)[source]¶ Bases:
qinfer.distributions.Distribution
Samples from a weighted list of distributions.
Parameters:
- weights – Length n_dist list or np.ndarray of probabilities summing to 1.
- dist – Either a length n_dist list of Distribution instances, or a Distribution class, for example, NormalDistribution. It is assumed that all Distribution instances in a list have the same n_rvs.
- dist_args – If dist is a class, an array of shape (n_dist, n_rvs) where dist_args[k,:] defines the arguments of the k’th distribution. Use None if the distribution has no arguments.
- dist_kw_args – If dist is a class, a dictionary where each key’s value is an array of shape (n_dist, n_rvs) where dist_kw_args[key][k,:] defines the keyword argument corresponding to key of the k’th distribution. Use None if the distribution needs no keyword arguments.
- shuffle (bool) – Whether or not to shuffle the result after sampling. Not shuffling will result in variates being in the same order as the distributions. Default is True.
-
n_rvs
¶
-
n_dist
¶ The number of distributions in the mixture distribution.
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
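One common way to sample a mixture is to decide how many variates each component contributes, sample each component, and then optionally shuffle. The sketch below follows that recipe with two normal components in plain NumPy; it is an assumption about a reasonable approach, not a description of QInfer's internals.

```python
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([0.2, 0.8])  # mixture probabilities, summing to 1
means = np.array([-5.0, 5.0])   # one normal component per weight
n = 1000

# Number of variates contributed by each component.
counts = rng.multinomial(n, weights)

# Before shuffling, variates appear in the same order as the components.
mixture_samples = np.concatenate(
    [rng.normal(m, 1.0, size=(k, 1)) for m, k in zip(means, counts)]
)

# Shuffle rows in place so that the ordering carries no component information.
rng.shuffle(mixture_samples)
```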
-
class
qinfer.
ConstrainedSumDistribution
(underlying_distribution, desired_total=1)[source]¶ Bases:
qinfer.distributions.Distribution
Samples from an underlying distribution and then enforces that all samples must sum to some given value by normalizing each sample.
Parameters: - underlying_distribution (Distribution) – Underlying probability distribution.
- desired_total (float) – Desired sum of each sample.
-
underlying_distribution
¶
-
n_rvs
¶
-
sample
(n=1)[source]¶ Returns one or more samples from this probability distribution.
Parameters: n (int) – Number of samples to return.
Return type: numpy.ndarray
Returns: An array containing samples from the distribution of shape (n, d), where d is the number of random variables.
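The normalization step can be sketched as follows: each sample (row) is rescaled so that its entries sum to the desired total. The sketch uses uniform variates as a stand-in for the underlying distribution and assumes every row has a positive sum.

```python
import numpy as np

rng = np.random.default_rng(0)
desired_total = 1.0

# Stand-in for samples from an underlying distribution with 3 random variables.
raw = rng.random((4, 3))

# Divide each row by its own sum, then scale to the desired total.
constrained = desired_total * raw / raw.sum(axis=1, keepdims=True)
```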