Beta distribution

Beta distributions are a useful choice for univariate random variables that have a natural lower and upper bound. In its standard form, the beta distribution is defined for \(0 \leq x \leq 1\) by the pdf

\[ f(x) = \frac{x^{\alpha - 1} (1 - x)^{\beta - 1}}{B(\alpha, \beta)}, \]

where \(B(\alpha, \beta)\) is the Beta function, defined as

\[ B(\alpha, \beta) = \int_0^1 t^{\alpha - 1} (1 - t)^{\beta - 1} dt = \frac{\Gamma(\alpha) \Gamma(\beta)}{\Gamma(\alpha + \beta)}, \]

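where \(\Gamma\) is the Gamma function. The parameters \(\alpha\) and \(\beta\) must both be positive (\(\alpha, \beta > 0\)) and control the shape of the pdf: increasing \(\alpha\) shifts the probability mass towards the right, increasing \(\beta\) shifts it towards the left. For \(\alpha, \beta \geq 1\) the pdf stays finite on the whole interval, whereas \(\alpha < 1\) makes the pdf diverge at \(x = 0\) and \(\beta < 1\) makes it diverge at \(x = 1\).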
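As a minimal sketch, the pdf above can be evaluated directly from its definition using only Python's standard library. The function names `beta_function`, `beta_pdf`, and `beta_mean` are illustrative choices, not part of any library, and the parameter values are arbitrary. The mean \(\alpha / (\alpha + \beta)\) is used to illustrate how the two parameters move the probability mass.

```python
import math


def beta_function(alpha, beta):
    """Beta function B(alpha, beta) via its Gamma-function representation."""
    return math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)


def beta_pdf(x, alpha, beta):
    """Density of the standard beta distribution on [0, 1].

    Sketch only: for parameters below one the density diverges at an
    endpoint, so this function should then not be called at x = 0 or x = 1.
    """
    if x < 0.0 or x > 1.0:
        return 0.0
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / beta_function(alpha, beta)


def beta_mean(alpha, beta):
    """Mean of the standard beta distribution."""
    return alpha / (alpha + beta)


# alpha = beta = 1 reduces the beta pdf to the uniform density on [0, 1]
uniform_density = beta_pdf(0.3, 1.0, 1.0)

# A larger alpha moves the mass towards one, a larger beta towards zero
mean_right = beta_mean(5.0, 2.0)
mean_left = beta_mean(2.0, 5.0)
```

Swapping the two parameters mirrors the pdf around \(x = 0.5\), which is why `mean_right` and `mean_left` lie symmetrically on either side of 0.5.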

The cdf of the Beta distribution describes the probability that a Beta-distributed random variable \(X\) takes a value less than or equal to some \(x \in [0, 1]\). It is defined as

\[ F(x; \alpha, \beta) = \int_0^x f(t) \, dt = \frac{1}{B(\alpha, \beta)} \int_0^x t^{\alpha - 1} (1 - t)^{\beta - 1} dt, \]

where \(f(t)\) is the probability density function given above.
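The integral in the cdf generally has no simple closed form, so it is usually evaluated numerically. The following sketch approximates it with a plain midpoint rule; the function names and the number of subintervals `n` are illustrative assumptions, and dedicated library routines (for example `scipy.stats.beta.cdf`) would normally be preferred in practice. The sketch assumes \(\alpha, \beta \geq 1\) so that the integrand stays finite.

```python
import math


def beta_pdf(t, alpha, beta):
    """Density of the standard beta distribution for 0 <= t <= 1."""
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return t ** (alpha - 1) * (1 - t) ** (beta - 1) / norm


def beta_cdf(x, alpha, beta, n=10_000):
    """Approximate F(x; alpha, beta) by a midpoint rule on [0, x]."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / n  # width of each subinterval
    # Evaluate the pdf at the midpoint of every subinterval and sum the areas
    return h * sum(beta_pdf((i + 0.5) * h, alpha, beta) for i in range(n))


# For alpha = beta = 2 the cdf has the closed form 3x^2 - 2x^3,
# which gives a convenient reference value for the approximation
approx = beta_cdf(0.25, 2.0, 2.0)
exact = 3 * 0.25 ** 2 - 2 * 0.25 ** 3
```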

Interactive Element

Below, you will find an interactive element that shows the pdf and cdf of a beta distribution. The element also includes sliders for the location and scale parameters, which allow us to shift and stretch the distribution to intervals other than \([0,1]\).

Fig. 3.22 Interactively visualize the relationship between the PDF and the CDF of a beta distribution.
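The effect of location and scale can be sketched in a few lines. The sketch assumes the common convention (also used by `scipy.stats.beta`) that the rescaled variable is \(Y = \text{loc} + \text{scale} \cdot X\), so that its support is \([\text{loc}, \text{loc} + \text{scale}]\); whether the interactive element uses exactly this convention is an assumption. The function names and the example interval are hypothetical.

```python
import math


def beta_pdf(x, alpha, beta):
    """Standard beta density on [0, 1]; zero outside the interval."""
    if x < 0.0 or x > 1.0:
        return 0.0
    norm = math.gamma(alpha) * math.gamma(beta) / math.gamma(alpha + beta)
    return x ** (alpha - 1) * (1 - x) ** (beta - 1) / norm


def scaled_beta_pdf(y, alpha, beta, loc=0.0, scale=1.0):
    """Density of Y = loc + scale * X, where X is standard beta."""
    x = (y - loc) / scale  # map y back to the unit interval
    # Dividing by scale keeps the total area under the pdf equal to one
    return beta_pdf(x, alpha, beta) / scale


# Hypothetical example: a Beta(2, 2) shape stretched to the interval [10, 14]
loc, scale = 10.0, 4.0
density_mid = scaled_beta_pdf(12.0, 2.0, 2.0, loc, scale)
density_outside = scaled_beta_pdf(9.0, 2.0, 2.0, loc, scale)
```

Stretching the interval by a factor of four lowers the peak of the pdf by the same factor, because the area under the curve must remain one.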

Interesting Properties

The beta pdf can be useful in many settings you will encounter in your professional lives. Beta pdfs (and their multivariate counterpart, Dirichlet pdfs) are often used for simplex variables, that is, sets of non-negative shares that must add up to a fixed total. Examples of simplex variables include

  • the sea ice area fraction, which describes the percentage of a certain area of sea that is covered by sea ice (naturally, the open and ice-covered percentages must add to 100%).

  • the expected value of a coin flip. If we designate heads as zero and tails as one, the expected value of a flip equals the probability that the coin comes up tails. Since a flipped coin has only two outcomes (heads or tails; we neglect the chance that it may land on its side), this expected value must lie between zero (a loaded coin that only ever lands on “heads”) and one (a loaded coin that only ever lands on “tails”).

  • similarly, this distribution can be used for other binary choices, such as the path drivers choose when encountering a fork in the road. If we designate left as zero and right as one, then the average choice must lie between zero (all drivers go left) and one (all drivers go right), with 0.5 meaning that exactly half the drivers go each way.

More generally, beta pdfs can also be rescaled (by adjusting the location and scale parameters) and used for variables that have natural lower and upper bounds, such as dissolved concentrations of a chemical compound. Such concentrations can never be negative (i.e., \(x \geq 0\)) and cannot exceed some saturation concentration (\(x \leq C_{\text{sat.}}\)), which makes the beta distribution a natural choice for such quantities.
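The sea ice and concentration examples above can be illustrated by drawing random samples, here with `random.betavariate` from Python's standard library. All numbers in this sketch (the shape parameters, the saturation concentration `C_SAT`, and the sample size) are hypothetical values chosen for illustration only, not values taken from any dataset.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is reproducible

ALPHA, BETA = 2.0, 5.0  # hypothetical shape parameters
C_SAT = 8.0             # hypothetical saturation concentration (arbitrary units)
N = 20_000              # number of random samples

# Sea ice: the ice-covered and open-water fractions form a two-part simplex,
# so sampling one of them fixes the other
ice = [rng.betavariate(ALPHA, BETA) for _ in range(N)]
open_water = [1.0 - x for x in ice]

# Concentration: rescale standard beta samples to [0, C_SAT]
# (location zero, scale C_SAT)
conc = [C_SAT * rng.betavariate(ALPHA, BETA) for _ in range(N)]

# Sample means, to compare with alpha / (alpha + beta) and its rescaled version
mean_ice = sum(ice) / N
mean_conc = sum(conc) / N
```

With these parameters the sample means should be close to \(2/7\) for the ice fraction and \(C_{\text{sat.}} \cdot 2/7\) for the concentration, and no sample falls outside its bounds.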


% START-CREDIT
% source: distributions
```{attributiongrey} Attribution
:class: attribution
This chapter was written by Max Ramgraber. {ref}`Find out more here <distributions_credit>`.