5.1. Transforming random variables#
Let us take the simple example of converting temperature measurements taken in degrees Celsius to degrees Fahrenheit. This transformation is represented by a simple linear function
\[ T_f = g(T_c) = \frac{9}{5} T_c + 32 \]
In Fig. 5.2, an example of the distribution of the average July temperature in a city is illustrated, both in degrees Celsius and degrees Fahrenheit. Due to a simple change of units, the PDF is transformed: the mean is shifted and the variance is also changed.

Fig. 5.2 Distribution of temperature in degrees Celsius and degrees Fahrenheit.#
The previous toy problem shows how even the simplest transformation (i.e., a linear function) can alter the distribution of the output variable. However, in some cases we are not interested in evaluating the complete PDF of the output distribution, but may limit ourselves to some principal statistical moments of the distribution. Before discussing that, we provide a more general theory concerning the transformation of random variables.
Tip
In the previous problem, try to think about the difference between the expected values \(\mathbb{E}(T_c)\) and \(\mathbb{E}(T_f)\). This difference is not simply “32”, since the Celsius and Fahrenheit units have a different scale: a change of 1 °C is not equivalent to a change of 1 °F!
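The temperature example can be checked numerically. The sketch below is a minimal Monte Carlo illustration (the normal distribution and its parameters are illustrative assumptions, not taken from Fig. 5.2): it shows that the mean transforms with the full linear map, while the standard deviation only picks up the scale factor, since the constant offset cancels in the deviations.

```python
import numpy as np

# Illustrative assumption: average July temperature in degrees Celsius,
# modelled as a normal distribution with mean 25 and std 2.
rng = np.random.default_rng(42)
t_c = rng.normal(loc=25.0, scale=2.0, size=100_000)

# Linear transformation to degrees Fahrenheit: T_f = 1.8 * T_c + 32
t_f = 1.8 * t_c + 32

# The mean follows the full linear map; the +32 offset cancels in the
# deviations, so the standard deviation is only scaled by 1.8.
print("E(T_f) - 1.8*E(T_c) - 32 :", t_f.mean() - (1.8 * t_c.mean() + 32))
print("std(T_f) / std(T_c)      :", t_f.std() / t_c.std())
```

Note that \(\mathbb{E}(T_f) - \mathbb{E}(T_c) = 0.8\,\mathbb{E}(T_c) + 32\), so the difference between the two expected values depends on the mean temperature itself.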
Generic transformation of univariate functions#
We try to determine the distribution of \(Z = g(X)\) given the distribution of \(X\) and the function \(g\). The CDF of \(Z\) is defined as
\[ F_{Z}(z) = P( Z \leq z ) = P( g(X) \leq z ) = \int_{I_{z}} f_{X}(x) \, \mathrm{d}x \]
where \(I_{z} = \{x\in \mathbb{R} \; | \; g(x) \leq z\}\) is the set of all \(x\) that satisfy the inequality \(g(x) \leq z\) for a given \(z\).
For the linear function \(Z = g(X) = aX + b\), we discriminate between three cases:
Case \(a > 0\) (i.e., \(g\) is increasing)
\[ F_{Z}(z) = P( Z \leq z ) = P \left( X \leq \frac{z-b}{a} \right) = F_{X} \left( \frac{z-b}{a} \right) \]
Case \(a < 0\) (i.e., \(g\) is decreasing)
\[ F_{Z}(z) = P \left( X \geq \frac{z-b}{a} \right) = 1 - F_{X} \left( \frac{z-b}{a} \right) \]
Case \(a = 0\) (i.e., \(g\) is constant)
\[\begin{split} F_{Z}(z) = P( Z \leq z ) = \begin{cases} 1, & z \ge b,\\ 0, & z < b~. \end{cases} \end{split}\]
Note that in the last case all \(x\) values are mapped to the same value \(b\).
Ultimately, if \(F_{Z}(z)\) is differentiable, we can differentiate the aforementioned expressions, leading to
\[ f_{Z}(z) = \frac{1}{|a|} \, f_{X} \left( \frac{z-b}{a} \right), \qquad a \neq 0~, \]
which shows how the PDF of \(Z\) can be obtained from the PDF of \(X\) in this linear case. More generally, for functions that are monotonically increasing or decreasing on a given interval \(A \subset \mathbb{R}\), it is possible to define a transformation rule expressing the PDF of \(Z = g(X)\) in terms of the PDF of \(X\).
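The CDF rule for a decreasing linear map can be verified numerically. The sketch below is an illustrative check (the standard-normal choice for \(X\) and the values of \(a\), \(b\), and the evaluation point are assumptions): it compares the empirical CDF of \(Z = aX + b\) with \(1 - F_X\!\left(\frac{z-b}{a}\right)\), using the closed-form standard-normal CDF.

```python
import math
import numpy as np

# Check the case a < 0: F_Z(z) = 1 - F_X((z - b)/a).
# Illustrative assumptions: X ~ N(0, 1), a = -2, b = 1, evaluated at z0 = 0.5.
rng = np.random.default_rng(0)
a, b = -2.0, 1.0
x = rng.standard_normal(200_000)
z = a * x + b

z0 = 0.5
emp_cdf = np.mean(z <= z0)               # empirical F_Z(z0)
t = (z0 - b) / a                         # transformed threshold (z0 - b)/a
ana_cdf = 1.0 - 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))  # 1 - F_X(t)
print(emp_cdf, ana_cdf)  # the two values agree up to sampling noise
```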
Remark on the multivariate case#
In the multivariate case this is also possible, but the analytic procedure becomes more complicated: it involves continuous partial derivatives with a non-vanishing Jacobian on \(A\), and requires \(f_{\mathbf{X}}(\mathbf{x})\) to be continuous on \(A\). This goes beyond the scope of this course; in the following sections we will therefore focus on the propagation of the principal moments of the distribution, i.e., the propagation laws for the mean (first raw moment) and the variance (second central moment).
Theorem (Expectation law)#
For \(\mathbf{X} \in \mathbb{R}^n\) being an \(n\)-dimensional random vector with continuous PDF \(f_{\mathbf{X}}(\mathbf{x})\), we consider \(\mathbf{Z} = \mathbf{g}(\mathbf{X})\), where \(\mathbf{g}: \mathbb{R}^n \rightarrow \mathbb{R}^m\) has continuous first partial derivatives. Then the expectation of \(\mathbf{Z}\) is
\[ \mathbb{E}( \mathbf{Z} ) = \mathbb{E}( \mathbf{g}(\mathbf{X}) ) = \int_{\mathbb{R}^n} \mathbf{g}(\mathbf{x}) \, f_{\mathbf{X}}(\mathbf{x}) \, \mathrm{d}\mathbf{x} \]
Corollary (Variance law)#
Under the same assumptions, the variance of \(\mathbf{Z}\) is
\[ \mathrm{Var}( \mathbf{Z} ) = \mathbb{E} \left[ \left( \mathbf{g}(\mathbf{X}) - \boldsymbol\mu_z \right) \left( \mathbf{g}(\mathbf{X}) - \boldsymbol\mu_z \right)^{\mathsf{T}} \right] \]
where \(\boldsymbol\mu_z = \mathbb{E}( \mathbf{g}(\mathbf{X}) )\), as given in the previous Theorem.
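The expectation and variance laws can be illustrated with a Monte Carlo check for a genuinely non-linear map. The sketch below uses \(g(x) = x^2\) with \(X \sim N(0,1)\) as an illustrative choice (not from the text): then \(Z = X^2\) follows a chi-squared distribution with one degree of freedom, for which \(\mathbb{E}(Z) = 1\) and \(\mathrm{Var}(Z) = 2\) are known in closed form.

```python
import numpy as np

# Illustrative assumption: X ~ N(0, 1) and the non-linear map g(x) = x^2.
# Then Z = X^2 is chi-squared with 1 degree of freedom: E(Z) = 1, Var(Z) = 2.
rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)
z = x**2

mu_z = z.mean()                  # estimate of E(g(X)); close to 1
var_z = ((z - mu_z) ** 2).mean() # estimate of E[(g(X) - mu_z)^2]; close to 2
print("E(Z)  ~", mu_z)
print("Var(Z) ~", var_z)
```

Note that the non-linearity matters here: \(\mathbb{E}(X^2) = 1\) while \(g(\mathbb{E}(X)) = 0\), so in general \(\mathbb{E}(\mathbf{g}(\mathbf{X})) \neq \mathbf{g}(\mathbb{E}(\mathbf{X}))\), which motivates the linearization discussed next.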
At this point, we proceed in the following part by showing how such expressions can be simplified, e.g., via a linearization of the non-linear transformation.
Attribution
This chapter was written by Sandra Verhagen and Lotfi Massarweh. Find out more here.