Summary of parametric distributions

This page summarizes the main equations for each of the distribution functions presented.

Choosing a distribution

If you need help choosing a distribution type for your data, the table below may help:

| Distribution | Left bound | Right bound | Left-tailed | Symmetric | Right-tailed | scipy name |
|---|---|---|---|---|---|---|
| Uniform | yes | yes | no | yes | no | uniform |
| Gaussian | no | no | no | yes | no | norm |
| Lognormal | yes | no | no | no | yes | lognorm |
| Gumbel (right-tailed) | no | no | no | no | yes | gumbel_r |
| Gumbel (left-tailed) | no | no | yes | no | no | gumbel_l |
| Exponential | yes | no | no | no | yes | expon |
| Beta | yes | yes | possible | possible | possible | beta |
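The bound columns can be checked against scipy itself. The following is a minimal sketch, assuming arbitrary illustrative shape parameters for the lognormal and beta distributions; it uses the `support()` method of `scipy.stats` distributions, which returns the lower and upper end of the support.

```python
import numpy as np
from scipy import stats

# Frozen distributions; the shape parameters are arbitrary, illustrative values.
examples = {
    "uniform": stats.uniform(),
    "norm": stats.norm(),
    "lognorm": stats.lognorm(s=1.0),
    "gumbel_r": stats.gumbel_r(),
    "gumbel_l": stats.gumbel_l(),
    "expon": stats.expon(),
    "beta": stats.beta(a=2.0, b=3.0),
}

bounds = {}
for name, dist in examples.items():
    lower, upper = dist.support()
    # A finite end of the support corresponds to a "yes" in the bound columns.
    bounds[name] = (bool(np.isfinite(lower)), bool(np.isfinite(upper)))
    print(f"{name:9s} left bound: {bounds[name][0]!s:5s} right bound: {bounds[name][1]!s:5s}")
```

A finite value corresponds to "yes" in the left bound or right bound column, and an infinite value to "no".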

Notation

One challenge when dealing with distributions is notation, for two main reasons: 1) the symbols used to represent random variables and parameters vary across different fields (and even within a given field); and 2) the formulation of key equations (i.e., the PDF and CDF) can vary depending on the parameterization used.

Why such variation? Let’s just say tradition, history, and stubbornness play a big role here. But more importantly, one should recognize that the equations or parameters often have physical meaning, which makes a certain formulation more logical. Symbols often must be chosen carefully to not conflict with others used in a given field.

To illustrate the point, consider the following three formulations of the PDF of the exponential distribution:

  • Wikipedia, Excel: \(f(x) = \lambda \operatorname{exp}(-\lambda x)\)

  • Matlab: \(f(x) = \frac{1}{\mu} \operatorname{exp}(\frac{-x}{\mu})\)

  • Scipy Stats Module: \(f(x) = \operatorname{exp}(- x)\)

Note that the formulation differs in each case: the parameter is \(\lambda\) in the first, \(\mu = 1/\lambda\) in the second, and absent altogether in the third. If you are not careful when using distributions from different textbooks or software packages, it is very easy to make mistakes! The scipy stats formulation is especially striking because it is the standardized form of the distribution; the rate enters through the location and scale arguments, which you will learn more about later (location, shape and scale).
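A short numerical check makes the equivalence concrete. This is a minimal sketch with an arbitrary rate value; it relies on the fact that scipy's `expon` accepts a `scale` argument, which must be set to \(1/\lambda\) to match the rate formulation.

```python
import numpy as np
from scipy import stats

lam = 0.5        # rate parameter (Wikipedia, Excel formulation); illustrative value
mu = 1.0 / lam   # mean parameter (Matlab formulation)
x = np.linspace(0.0, 10.0, 50)

pdf_rate = lam * np.exp(-lam * x)
pdf_mean = (1.0 / mu) * np.exp(-x / mu)

# scipy's expon is the standardized form exp(-x); passing scale = 1/lambda
# (with the default loc = 0) recovers the other two formulations.
pdf_scipy = stats.expon(scale=1.0 / lam).pdf(x)

print(np.allclose(pdf_rate, pdf_mean), np.allclose(pdf_rate, pdf_scipy))
```

Passing `lam` directly as `scale` is a common mistake: it would describe a distribution with mean \(\lambda\) instead of \(1/\lambda\).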

Our advice: always check the formulation of the PDF, CDF, and parameters of a distribution and be sure to use them consistently.

In this case, “consistency” can mean, for example, using the right equations to compute the distribution parameters from the moments (mean and standard deviation) of the distribution. In general, there is no “correct” formulation or set of parameters; in this book we present a set of parameters and formulations that are consistent with each other and commonly used in civil engineering and geosciences.
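As an illustration of computing parameters from moments, the sketch below inverts the Gumbel mean and variance equations given in the Gumbel section of this summary. The function name `gumbel_params_from_moments` and the input values are hypothetical, chosen only for this example.

```python
import numpy as np
from scipy import stats

def gumbel_params_from_moments(mean, std):
    """Invert E[X] = alpha + beta*gamma and Var[X] = (pi^2 / 6) * beta^2."""
    beta = std * np.sqrt(6.0) / np.pi
    alpha = mean - np.euler_gamma * beta
    return alpha, beta

# Illustrative target moments.
alpha, beta = gumbel_params_from_moments(mean=10.0, std=2.0)

# In scipy, gumbel_r takes loc = alpha and scale = beta.
dist = stats.gumbel_r(loc=alpha, scale=beta)
print(alpha, beta)
print(dist.mean(), dist.std())
```

If the parameters are consistent with the formulation, the mean and standard deviation reported by scipy should reproduce the values that were passed in.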

Statistical moments

Below we list the equations for the PDF, CDF, mean, and variance of the parametric distributions discussed.

Uniform

| Object | Equation |
|---|---|
| PDF | \(\displaystyle f(x) = \begin{cases}\cfrac{1}{b-a} & \text{for }x \in [a,b] \\ 0 & \text{otherwise} \end{cases}\) |
| CDF | \(F(x)=\begin{cases}0 & \text{for } x<a \\ \cfrac{x-a}{b-a} & \text{for } x\in[a,b] \\ 1 & \text{for } x>b\end{cases}\) |
| Mean and variance | \(\begin{array}{ll} E[X]=\frac{1}{2}(a+b) \\ Var[X]=\frac{1}{12}(b-a)^2 \end{array}\) |
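The uniform equations can be compared with scipy as follows. This is a minimal sketch with arbitrary bounds; the point to note is that scipy's `uniform` is defined on `[loc, loc + scale]`.

```python
from scipy import stats

a, b = 2.0, 5.0  # illustrative bounds

# scipy's uniform is defined on [loc, loc + scale], so scale is the width b - a,
# not the upper bound b.
dist = stats.uniform(loc=a, scale=b - a)

mean_formula = 0.5 * (a + b)
var_formula = (b - a) ** 2 / 12.0
cdf_formula = (3.5 - a) / (b - a)

print(dist.mean(), mean_formula)
print(dist.var(), var_formula)
print(dist.cdf(3.5), cdf_formula)
```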

Gaussian

| Object | Equation |
|---|---|
| PDF | \(f(x) = \cfrac{1}{\sigma \sqrt{2\pi}}e^{\left(\normalsize-\cfrac{(x-\mu)^2}{2\sigma^2}\right)}\) |
| CDF | \(F(x) = \cfrac{1}{2}\left(1+\text{erf}\left(\cfrac{x-\mu}{\sigma\sqrt{2}}\right)\right)\) |
| Mean and variance | \(\begin{array}{ll} E[X] = \mu \\ Var[X] = \sigma^2 \end{array}\) |
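The error-function form of the Gaussian CDF can be evaluated with the standard library and compared with scipy. This is a minimal sketch with arbitrary parameter values.

```python
import math
from scipy import stats

mu, sigma = 1.0, 2.0  # illustrative values
x = 2.5

# CDF written with the error function, as in the Gaussian equations above.
cdf_erf = 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# scipy's norm takes loc = mu and scale = sigma
# (the standard deviation, not the variance).
dist = stats.norm(loc=mu, scale=sigma)

print(cdf_erf, dist.cdf(x))
print(dist.var(), sigma**2)
```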

Lognormal

Here \(\mu\) and \(\sigma\) are the mean and standard deviation of \(\ln(X)\), not of \(X\) itself.

| Object | Equation |
|---|---|
| PDF | \(f(x) = \cfrac{1}{x \sigma \sqrt{2 \pi}}e^{\left( \normalsize-\cfrac{(\ln(x)-\mu)^2}{2\sigma^2}\right)}\) |
| CDF | \(F(x) = \Phi\left( \cfrac{\ln(x)-\mu}{\sigma} \right) = \frac{1}{2}\left[ 1+\text{erf}\left( \cfrac{\ln(x)-\mu}{\sigma \sqrt{2}}\right)\right]\) |
| Mean and variance | \(\begin{array}{ll} E[X]=e^{\normalsize\mu + \frac{\sigma^2}{2}} \\ Var[X] = \left( e^{\normalsize\sigma^2}-1 \right)e^{2\mu + \sigma^2} \end{array}\) |
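The lognormal distribution is where the scipy parameterization differs most from the equations above. The sketch below uses arbitrary parameter values and maps them to scipy's `lognorm`, whose shape parameter `s` equals \(\sigma\) and whose `scale` equals \(e^{\mu}\).

```python
import numpy as np
from scipy import stats

mu, sigma = 0.5, 0.4  # mean and standard deviation of ln(X); illustrative values

# scipy's lognorm uses the shape s = sigma and scale = exp(mu), with loc = 0.
dist = stats.lognorm(s=sigma, scale=np.exp(mu))

mean_formula = np.exp(mu + sigma**2 / 2.0)
var_formula = (np.exp(sigma**2) - 1.0) * np.exp(2.0 * mu + sigma**2)

print(dist.mean(), mean_formula)
print(dist.var(), var_formula)
```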

Gumbel

| Object | Equation |
|---|---|
| PDF | \(f(x) = \cfrac{1}{\beta} e^{\normalsize-\left(z + e^{\normalsize-z}\right)}\text{, where }z=\cfrac{x-\alpha}{\beta}\) |
| CDF | \(F(x)=e^{\normalsize-e^{\normalsize-z}}\) |
| Mean and variance | \(\begin{array}{ll} E[X] = \alpha + \beta\gamma,\; \gamma \approx 0.5772 \\ Var[X] = \cfrac{\pi^2}{6}\beta^2 \end{array}\) |

Exponential

| Object | Equation |
|---|---|
| PDF | \(f(x) = \lambda e^{\normalsize-\lambda x}\) |
| CDF | \(F(x) = 1 - e^{\normalsize-\lambda x}\) |
| Mean and variance | \(\begin{array}{ll} E[X] = \cfrac{1}{\lambda} \\ Var[X] = \cfrac{1}{\lambda^2} \end{array}\) |

Beta

| Object | Equation |
|---|---|
| PDF | \(f(x) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}\text{ for } x \in [0,1]\text{, where } B(\alpha, \beta) \text{ is the beta function}\) |
| CDF | \(F(x) = \frac{1}{B(\alpha, \beta)} \int_0^x t^{\alpha - 1} (1 - t)^{\beta - 1} dt\) |
| Mean and variance | \(\begin{array}{ll} E[X] = \frac{\alpha}{\alpha + \beta} \\ Var[X] = \frac{\alpha\beta}{(\alpha + \beta)^2(\alpha+\beta+1)} \end{array}\) |
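The integral in the beta CDF has no elementary closed form, but it is available numerically. The sketch below uses arbitrary shape parameters and `scipy.special.betainc`, the regularized incomplete beta function, to evaluate it.

```python
from scipy import special, stats

alpha, beta = 2.0, 5.0  # illustrative shape parameters
x = 0.3

# scipy names the two shape parameters a and b.
dist = stats.beta(a=alpha, b=beta)

# special.betainc is the regularized incomplete beta function, i.e. the
# integral in the CDF already divided by B(alpha, beta).
cdf_integral = special.betainc(alpha, beta, x)

mean_formula = alpha / (alpha + beta)
var_formula = alpha * beta / ((alpha + beta) ** 2 * (alpha + beta + 1.0))

print(dist.cdf(x), cdf_integral)
print(dist.mean(), mean_formula)
print(dist.var(), var_formula)
```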

Attribution

This chapter was written by Patricia Mares Nasarre, Robert Lanzafame, and Max Ramgraber.