4.4. Time Series Stationarity#

Stochastic process#

In Section Components of time series it was pointed out that a time series \(Y(t) = [Y(t_1), Y(t_2), \ldots, Y(t_m)]^T\) is a sequence of random variables. Each element \(Y(t_i)\) is a random variable, with a probability density function denoted by \(f_{Y(t_i)}(y)\).

A time series results from observing a stochastic process. A stochastic process is a phenomenon that evolves over time and is subject to uncontrolled variability and the associated uncertainty; additional variability is typically introduced when the process is observed.

If we were to observe it as a continuous function of time, then \(Y(t)\) would be a random variable which depends on time, and hence the probability density function (PDF) of \(Y(t)\) would also carry time \(t\) as an argument, \(f_{Y(t)}(y,t)\), modelling the variability in the outcomes of \(Y(t)\).

https://files.mude.citg.tudelft.nl/randomprocess_threerealizations.svg

Fig. 4.24 Random process \(Y\left(t\right)\) of which three realizations \(y\left(t\right)\) are shown, as a function of time \(t\). The PDF \(f_{Y(t)}(y,t)\) is shown for time instants \(t_1\) and \(t_2\).#

A time series results from observing the stochastic process at discrete instants in time, resulting in a sequence of random variables \(Y(t_i)\) with the aforementioned PDF evaluated at times \(t_i\): \(f_{Y(t_i)}(y,t_i)\), with \(i=1,\ldots,m\).

The PDF carries time as an argument, which means two things. First, the PDF depends on time: the variability it describes may change as a \(\textit{function of time}\) (though soon we will assume that the process is stationary, and hence that it does not). Second, there may exist \(\textit{dependence in time}\), such that neighbouring elements of the time series, e.g. \(Y(t_2)\) and \(Y(t_5)\), are correlated; in that case we could say that the process has a kind of memory. This ‘time correlation’, very common in practice, must be properly taken into account when we estimate (predict) future values of the time series, and doing so is the goal of the remaining sections of this chapter on Time Series Analysis.
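
To see what this ‘memory’ looks like in numbers, here is a minimal sketch (Python with NumPy; the series and the coefficient 0.8 are purely illustrative) comparing the lag-one correlation of an uncorrelated series with that of a series in which each value partly depends on the previous one:

```python
import numpy as np

rng = np.random.default_rng(42)
m = 500

# An uncorrelated series: no memory, neighbouring values are independent
white = rng.normal(0, 1, m)

# A correlated series (illustrative recursion): each value partly
# "remembers" the previous one
corr = np.zeros(m)
for t in range(1, m):
    corr[t] = 0.8 * corr[t - 1] + rng.normal(0, 1)

def lag1_correlation(s):
    """Empirical correlation between neighbouring elements S_t and S_{t-1}."""
    return np.corrcoef(s[:-1], s[1:])[0, 1]

print(f"lag-1 correlation, uncorrelated series: {lag1_correlation(white):+.2f}")  # close to 0
print(f"lag-1 correlation, correlated series  : {lag1_correlation(corr):+.2f}")   # close to 0.8
```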

In the sequel we focus on the noise component of the time series. We assume that the signal part has been appropriately taken care of through estimation of the components of the functional model (such as trend and seasonality). We work with the time series \(S = \hat{\epsilon} = Y - \mathrm{A} \hat{X}\), i.e. the residuals which remain after estimation of the components of the functional model, and which ideally have zero mean.

Stationarity#

Definition

A stationary time series \(S(t)\) is based on an underlying stochastic process of which the statistical properties do not depend on the time at which it is observed.

This means that parameters such as mean and (co)variance should remain constant over time and not follow any trend, seasonality or irregularity:

  • Mean of the process is time-independent (constant, and often zero):

\[\mathbb{E}(S(t))=\mathbb{E}(S_t)=\mu\]
  • Covariance of the process is independent of \(t\) for each time shift \(\tau\) (so only a function of the difference in time \(\tau\) and not of absolute time \(t\)):

\[ Cov(S_t,S_{t-\tau})= Cov(S_t,S_{t+\tau}) =\mathbb{E}((S_t-\mu)(S_{t-\tau}-\mu))=c_\tau \]
  • The variance (i.e., the covariance at \(\tau=0\)) is then also constant with respect to time:

\[ Var(S_t)=\mathbb{E}((S_t-\mu)^2)=c_0=\sigma^2 \]

Notice that we have introduced the new notation \(S_t\) to denote a stationary time series. The time series \(Y_t\) is then potentially a non-stationary time series (e.g. with a trend and/or seasonal effect).

The white noise stochastic model introduced in Section 4.2 is stationary.
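
As a quick numerical illustration, the sketch below (Python with NumPy; all numbers synthetic) checks the three properties above for simulated white noise: the sample mean and variance are roughly the same in every segment of the series, and the empirical autocovariance \(c_\tau\) is close to zero for every \(\tau\neq 0\).

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.normal(0.0, 2.0, 10_000)  # white noise with mu = 0 and sigma = 2

# For a stationary process the sample mean and variance should be
# (roughly) the same in every segment of the series
for seg in np.split(s, 4):
    print(f"mean = {seg.mean():+.3f}, var = {seg.var():.3f}")

# Empirical autocovariance c_tau = E[(S_t - mu)(S_{t-tau} - mu)]
def autocov(s, tau):
    sc = s - s.mean()
    return np.mean(sc[tau:] * sc[: len(s) - tau])

print([round(float(autocov(s, tau)), 3) for tau in range(4)])
# c_0 is close to sigma^2 = 4; c_1, c_2, c_3 are close to 0 (no memory)
```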

Why stationary time series?#

Stationarity is important if we want to use a time series for forecasting (predicting future behaviour), which is not possible if the statistical properties change over time.

In practice, we may in fact be interested in, for instance, the trend and seasonality of a time series, and many real-world time series are of course non-stationary. The approach is therefore to first “stationarize” the time series (e.g., remove the trend), use the stationary time series to predict future states based on its statistical properties (the stochastic process), and then apply a back-transformation to account for the non-stationarity (e.g., add back the trend).

How to “stationarize” a time series?#

There are several ways to make a time series stationary. In this course we will focus on detrending the data using a least-squares fit.

Least-squares fit#

If we can express the time series \(Y=[Y_1, ..., Y_m]^T\) with a linear model of observation equations as \(Y = \mathrm{Ax} + \epsilon\), we can apply best linear unbiased estimation (BLUE) to estimate the parameters \(\mathrm{x}\) that describe e.g. the trend and seasonality:

\[ \hat{X}=(\mathrm{A}^T\Sigma_{Y}^{-1}\mathrm{A})^{-1}\mathrm{A}^T\Sigma_{Y}^{-1}Y \]

A detrended time series is obtained in the form of the residuals

\[ \hat{\epsilon} = Y - \mathrm{A}\hat{X} \]

The detrended \(\hat{\epsilon}\) is assumed to be stationary for further stochastic analysis. This is also an admissible transformation because \(Y\) can uniquely be reconstructed as \(Y=\mathrm{A}\hat{X}+\hat{\epsilon}\).

Let us take a look at an example:

https://files.mude.citg.tudelft.nl/least_squares.png

Fig. 4.25 Example of a time series (right graph) with linear and seasonal trend. The residuals (= stationary time series) after applying BLUE are shown on the left.#

In the example above, each observation is modelled as \(Y_i = y_0+ rt_i+a\cos(2\pi f_1t_i)+b \sin(2\pi f_1t_i) +\epsilon_i\), where \(a\) and \(b\) describe the seasonality and \(y_0\) and \(r\) the trend. In matrix form the time series is then:

\[\begin{split} \begin{bmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_m \end{bmatrix} = \begin{bmatrix} 1&t_1&\cos(2\pi f_1 t_1) & \sin(2\pi f_1 t_1) \\ 1&t_2&\cos(2\pi f_1 t_2) & \sin(2\pi f_1 t_2) \\ \vdots & \vdots & \vdots & \vdots \\ 1&t_m&\cos(2\pi f_1 t_m) & \sin(2\pi f_1 t_m) \end{bmatrix} \begin{bmatrix} y_0 \\ r \\ a \\ b \end{bmatrix} + \begin{bmatrix} \epsilon(t_1) \\ \epsilon(t_2) \\ \vdots \\ \epsilon(t_m) \end{bmatrix} \end{split}\]

The time series of the residuals \(\hat{\epsilon} = Y-\mathrm{A}\hat{X}\) (left graph) is indeed a stationary time series.
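
The sketch below reproduces this example on synthetic data (the parameter values and noise level are hypothetical). With \(\Sigma_Y = \sigma^2 I\), BLUE reduces to ordinary least squares, so `numpy.linalg.lstsq` can be used directly. The final assertion illustrates why the transformation is admissible: \(Y\) is reconstructed exactly as \(\mathrm{A}\hat{X}+\hat{\epsilon}\).

```python
import numpy as np

rng = np.random.default_rng(0)
m, f1 = 200, 1.0           # number of observations, frequency [cycles/yr]
t = np.linspace(0, 5, m)   # observation times [yr]

# Simulate the example model: offset y0, rate r, and an annual cycle (a, b)
y0, r, a, b = 2.0, 0.5, 1.0, -0.5
y = (y0 + r * t + a * np.cos(2 * np.pi * f1 * t)
     + b * np.sin(2 * np.pi * f1 * t) + rng.normal(0, 0.3, m))

# Design matrix A, column by column, as in the equation above
A = np.column_stack([np.ones(m), t,
                     np.cos(2 * np.pi * f1 * t),
                     np.sin(2 * np.pi * f1 * t)])

# BLUE with Sigma_Y = sigma^2 * I is ordinary least squares
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
eps_hat = y - A @ x_hat    # residuals = detrended (stationary) time series

print("estimated [y0, r, a, b]:", np.round(x_hat, 2))
print("residual mean:", round(float(eps_hat.mean()), 4))   # close to zero
assert np.allclose(y, A @ x_hat + eps_hat)                 # Y = A X_hat + eps_hat
```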

Question Stationary Time Series

Which of the four options is a stationary time series?

https://files.mude.citg.tudelft.nl/stat_question.png

Fig. 4.26 Which of these four time series is stationary?#

Other ways to make a time series stationary#

When the functional model specification is not straightforward, other methods can be used to make a time series stationary. Two common methods are single differencing and the moving average. Single differencing of \(Y=[Y_1,...,Y_m]^T\) creates a time series \(\Delta Y_t=Y_t - Y_{t-1}\); long-term trends are removed in this way. Another way to create an (almost) stationary time series is by means of a moving average: we apply a moving average over \(k\) observations to the time series \(Y\) to create a new time series \(\bar{Y}_t = \frac{1}{k}\sum_{i=1}^{k}Y_{t-i}\) (short-term variations are removed in this way), and then take the difference between the original time series and the moving average to obtain a (nearly) stationary time series \(\Delta Y_t = Y_t - \bar{Y}_t\). The moving average is elaborated on in the Appendix.

Neither of these methods requires a model specification, so they can be used to make a time series stationary in cases where the functional model is not known.
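
A minimal sketch of both methods on a synthetic series with a linear trend (Python with NumPy; the slope and noise level are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)
m = 300
t = np.arange(m)
y = 0.05 * t + rng.normal(0, 1, m)   # linear trend + noise

# Single differencing: Delta Y_t = Y_t - Y_{t-1}.
# The linear trend is reduced to a constant mean (the slope per time step).
dy = np.diff(y)

# Moving average of the k previous observations, Ybar_t = (1/k) sum_{i=1}^{k} Y_{t-i},
# followed by the difference Delta Y_t = Y_t - Ybar_t
k = 20
means = np.convolve(y, np.ones(k) / k, mode="valid")  # means[j] = mean of y[j:j+k]
dy_ma = y[k:] - means[:-1]                            # Y_t minus mean of the k values before t

print("mean of differenced series      :", round(float(dy.mean()), 3))     # ~0.05 (constant)
print("mean of Y_t minus moving average:", round(float(dy_ma.mean()), 3))  # small
```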

… and then what?#

We have seen different ways of obtaining a stationary time series from the original time series. The reason is that in order to make predictions (forecasting future values, beyond the time of the last observation in the time series) we need to account for both the (functional) signal-of-interest and the (stochastic) noise. Estimating the signal-of-interest was covered in the previous section. In the next sections we will show how the noise can be modelled as a stochastic process. Given a time series \(Y=\mathrm{Ax}+\epsilon\), the workflow for prediction is as follows:

  1. Estimate the signal-of-interest \(\hat{X}=(\mathrm{A}^T\Sigma_{Y}^{-1}\mathrm{A})^{-1}\mathrm{A}^T\Sigma_{Y}^{-1}Y\) (Section Modelling and estimation).

  2. Model the noise with, for instance, an Autoregressive (AR) model, using the stationary time series \(S:=\hat{\epsilon}=Y-\mathrm{A}\hat{X}\) (Section AR).

  3. Predict the signal-of-interest: \(\hat{Y}_{signal}=\mathrm{A}_p\hat{X}\), where \(\mathrm{A}_p\) is the design matrix describing the functional relationship between the future values \(Y_p\) and \(\mathrm{x}\) (Section Forecasting).

  4. Predict the stochastic noise \(\hat{\epsilon}_p\) based on the AR model.

  5. Predict future values of the time series: \(\hat{Y}_p=\mathrm{A}_p\hat{X}+\hat{\epsilon}_p\) (Section Forecasting).

This results in estimates \(\hat{Y}_p\) of the time series at future times \(t_p\), beyond the time \(t_m\) of the last observation in the time series.
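
The following is a minimal end-to-end sketch of the five steps on synthetic data. It anticipates the next sections by using a simple AR(1) recursion, \(\epsilon_t = \phi\,\epsilon_{t-1} + \text{noise}\), for the stochastic part; the model, the coefficient \(\phi\) and all numbers are purely illustrative.

```python
import numpy as np

rng = np.random.default_rng(7)
m, mp = 100, 10                    # observed epochs and future epochs to predict
t = np.arange(m + mp, dtype=float)

# Synthetic data: linear trend plus AR(1)-correlated noise
phi = 0.7
eps = np.zeros(m + mp)
for i in range(1, m + mp):
    eps[i] = phi * eps[i - 1] + rng.normal(0, 0.5)
y = 1.0 + 0.2 * t[:m] + eps[:m]

# 1. Estimate the signal-of-interest (BLUE; with Sigma_Y = sigma^2 I this is OLS)
A = np.column_stack([np.ones(m), t[:m]])
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)

# 2. Model the noise on the stationary residuals: estimate the AR(1) coefficient
s = y - A @ x_hat
phi_hat = np.sum(s[1:] * s[:-1]) / np.sum(s[:-1] ** 2)

# 3. Predict the signal-of-interest at future epochs with design matrix A_p
A_p = np.column_stack([np.ones(mp), t[m:]])
y_signal = A_p @ x_hat

# 4. Predict the stochastic noise: for AR(1), E[eps_{m+k} | eps_m] = phi^k * eps_m
eps_p = s[-1] * phi_hat ** np.arange(1, mp + 1)

# 5. Predict future values of the time series
y_p = y_signal + eps_p
print("predicted future values:", np.round(y_p, 2))
```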

Attribution

This chapter was written by Alireza Amiri-Simkooei, Christiaan Tiberius and Sandra Verhagen. Find out more here.