
4.6. Autoregressive process#

The main goal is to introduce the AutoRegressive (AR) model to describe a stationary stochastic process. The AR model can therefore be applied to time series in which, for example, trend and seasonality are absent or have been removed and only noise remains, or after other methods have been applied to obtain a stationary time series. The detrended, noise-only process (time series) is denoted by \(S_t\).

Process definition#

In an AR model, we forecast the random variable of interest using a linear combination of its past values. A zero mean AR process of order \(p\) can be written as follows:

\[S_t = \overbrace{\phi_1S_{t-1}+...+\phi_pS_{t-p}}^{\text{AR process}} + \epsilon_t \]

or as

\[S_t = \sum_{i=1}^p \phi_iS_{t-i}+\epsilon_t\]

Each random variable \(S_t\) is made up of a random error \(\epsilon_t\) at that epoch and a linear combination of the past random variables \(S_{t-1}\) through \(S_{t-p}\). The errors \(\epsilon_t\) form an uncorrelated, purely random noise process, known as white noise. Note that the process should still be stationary, satisfying

\[\mathbb{E}(S_t)=0, \hspace{20px} \mathbb{D}(S_t)=\sigma^2,\quad \forall t\]

This indicates that part of the total variability of the process comes from past epochs, and only a (small) portion belongs to the noise \(\epsilon_t\) of that epoch. The process \(S_t\) thus has a kind of memory.
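As a minimal sketch of this recursion (using an AR(2) process with coefficients \(\phi_1=0.5\), \(\phi_2=0.3\) that are assumed purely for illustration), an AR(\(p\)) realization can be generated as follows:

import numpy as np

# Simulate a zero-mean AR(p) process: S_t = phi_1*S_{t-1} + ... + phi_p*S_{t-p} + eps_t
# The AR(2) coefficients below are assumed values for illustration only
rng = np.random.default_rng(1)
phi = np.array([0.5, 0.3])            # phi_1, phi_2 (assumed, stationary choice)
p, m = len(phi), 200
eps = rng.normal(0, 1, m)             # white noise
S = np.zeros(m)
for t in range(m):
    past = S[max(t - p, 0):t][::-1]   # S_{t-1}, S_{t-2}, ..., as available
    S[t] = phi[:len(past)] @ past + eps[t]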

First-order AR(1) process#

We will focus on the case \(p=1\), i.e. the AR(1) process. A zero-mean first-order autoregressive process can be written as follows

\[ S_t = \phi S_{t-1}+\epsilon_t, \hspace{20px} -1<\phi<1, \hspace{20px} t=2,...,m \]

where \(\epsilon_t\) is an i.i.d. noise process, e.g. distributed as \(\epsilon_t\sim N(0,\sigma_{\epsilon}^2)\). The definition of \(\sigma_{\epsilon}^2\) follows below.

Exercise

In a zero-mean first order autoregressive process, abbreviated as AR(1), we have \(m=3\) observations, \(\phi=0.8\), and the generated white noise error samples are \(\epsilon = [\epsilon_1,\, \epsilon_2,\, \epsilon_3]^T=[1,\, 2,\, -1]^T\). What is the generated/realized AR(1) process \(s = [s_1,\, s_2,\, s_3]^T\)?

a. \(s = \begin{bmatrix}1 & 2.8 & 1.24\end{bmatrix}^T\)
b. \(s = \begin{bmatrix} 0 & 2 & 0.6 \end{bmatrix}^T\)
c. \(s = \begin{bmatrix} 1 & 2 & -1 \end{bmatrix}^T\)
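You can verify your answer by evaluating the recursion \(s_t = \phi s_{t-1} + \epsilon_t\) directly, initializing with \(s_1=\epsilon_1\) (as done in the formulation below); a minimal sketch:

import numpy as np

# Realize the AR(1) process s_t = phi*s_{t-1} + eps_t, with s_1 = eps_1
phi = 0.8
eps = np.array([1.0, 2.0, -1.0])
s = np.zeros(len(eps))
s[0] = eps[0]
for t in range(1, len(eps)):
    s[t] = phi * s[t - 1] + eps[t]
print(s)   # the realized process s = [s_1, s_2, s_3]^T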

Formulation#

We initialize with \(S_1=\epsilon_1\), such that \(\mathbb{E}(S_1)=\mathbb{E}(\epsilon_1)=0\) and \(\mathbb{D}(S_1)=\mathbb{D}(\epsilon_1)=\sigma^2\). Repeated application of the “autoregressive” formula above (\(S_t = \phi S_{t-1} + \epsilon_t\)) then gives:

\[\begin{split} \begin{align*} S_1&=\epsilon_1\\ S_2&=\phi S_1+\epsilon_2\\ S_3 &= \phi S_2+\epsilon_3 = \phi^2S_1+\phi \epsilon_2+\epsilon_3\\ &\vdots\\ S_m &= \phi S_{m-1} + \epsilon_m = \phi^{m-1}S_1+\phi^{m-2}\epsilon_2+...+\phi \epsilon_{m-1}+\epsilon_m \end{align*} \end{split}\]

for which we still require (in order to impose stationarity):

\[\mathbb{E}(S_t)=0 \hspace{5px}\text{and}\hspace{5px} \mathbb{D}(S_t)=\sigma^2, \hspace{10px} t=1,...,m\]

All the error components \(\epsilon_t\) are uncorrelated, such that \(\text{Cov}(\epsilon_t,\epsilon_{t+\tau})=0\) for \(\tau \neq 0\), and for \(t\geq 2\) they have variance \(\sigma_{\epsilon}^2\), which still needs to be determined.
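A quick numeric check (with arbitrary, assumed noise values) confirms that the recursive and expanded expressions above agree:

import numpy as np

# Check that S_3 = phi*S_2 + eps_3 equals the expanded form phi^2*S_1 + phi*eps_2 + eps_3
phi = 0.8
eps = np.array([0.5, -1.2, 0.3])      # arbitrary noise values (assumed)
S1 = eps[0]
S2 = phi * S1 + eps[1]
S3_recursive = phi * S2 + eps[2]
S3_expanded = phi**2 * S1 + phi * eps[1] + eps[2]
print(np.isclose(S3_recursive, S3_expanded))   # True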

Autocovariance#

The mean of the process is zero and, therefore:

\[\mathbb{E}(S_t) = \mathbb{E}(\phi S_{t-1}+\epsilon_t) = \phi\mathbb{E}(S_{t-1})+\mathbb{E}(\epsilon_t) = 0\]

which holds since \(\epsilon_t\) has zero mean for all \(t\), as already modelled.

The variance of the process should remain constant as:

\[\mathbb{D}(S_t) = \mathbb{D}(\phi S_{t-1} +\epsilon_t) \Leftrightarrow \sigma^2=\phi^2\sigma^2+\sigma_{\epsilon}^2, \hspace{10px} t\geq 2\]

resulting in

\[\sigma_{\epsilon}^2 = \sigma^2 (1-\phi^2)\]

indicating that \(\sigma_{\epsilon}^2\) is smaller than \(\sigma^2\) (whenever \(\phi\neq 0\)).
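This relation can be checked empirically: if the noise is generated with variance \(\sigma_{\epsilon}^2=\sigma^2(1-\phi^2)\), the sample variance of the process should be close to \(\sigma^2\). A minimal sketch with illustrative values:

import numpy as np

# Generate noise with variance sigma^2*(1 - phi^2) and check that Var(S_t) is close to sigma^2
rng = np.random.default_rng(0)
phi, sigma, n = 0.8, 1.0, 50_000
sigma_eps = sigma * np.sqrt(1 - phi**2)
S = np.zeros(n)
for t in range(1, n):
    S[t] = phi * S[t - 1] + rng.normal(0, sigma_eps)
print(S.var())   # close to sigma^2 = 1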

The autocovariance (the covariance between \(S_t\) and \(S_{t+\tau}\), here with \(\tau \geq 0\)) is

\[\begin{split} \begin{align*} c_{\tau}&=\mathbb{E}(S_t S_{t+\tau})-\mu^2 =\mathbb{E}(S_t S_{t+\tau})\\ &= \mathbb{E}(S_t(\phi^\tau S_t + \phi^{\tau-1} \epsilon_{t+1}+...)) = \phi^\tau\mathbb{E}(S_t^2)=\sigma^2\phi^\tau \end{align*} \end{split}\]

In the derivation above we used that:

\[ \begin{align*} S_{t+\tau}=\phi^\tau S_t + \phi^{\tau-1} \epsilon_{t+1}+...+\epsilon_{t+\tau} \end{align*} \]

and the fact that \(S_t\) and \(\epsilon_{t+\tau}\) are uncorrelated for \(\tau > 0\).
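The resulting autocovariance \(c_{\tau}=\sigma^2\phi^\tau\) can also be compared against the sample autocovariance of a simulated series; a minimal sketch, reusing the same illustrative values as above:

import numpy as np

# Compare the sample autocovariance with the theoretical c_tau = sigma^2 * phi^tau
rng = np.random.default_rng(0)
phi, sigma, n = 0.8, 1.0, 50_000
S = np.zeros(n)
for t in range(1, n):
    S[t] = phi * S[t - 1] + rng.normal(0, sigma * np.sqrt(1 - phi**2))
for tau in range(4):
    c_hat = np.mean(S[:n - tau] * S[tau:])        # sample autocovariance at lag tau
    print(tau, round(c_hat, 3), sigma**2 * phi**tau)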

Model structure of AR(1)#

\[\begin{split}\mathbb{E}(S) = \mathbb{E}\begin{bmatrix}S_1\\ S_2\\ \vdots\\ S_m\end{bmatrix} = \begin{bmatrix}0\\ 0\\ \vdots\\ 0\end{bmatrix}, \hspace{15px} \mathbb{D}(S)=\Sigma_{S}=\sigma^2 \begin{bmatrix}1&\phi&...&\phi^{m-1}\\ \phi&1&...&\phi^{m-2}\\ \vdots&\vdots&\ddots&\vdots\\ \phi^{m-1}&\phi^{m-2}&...&1\end{bmatrix}\end{split}\]
  • Autocovariance function \(\implies\) \(c_{\tau}=\sigma^2\phi^\tau.\)

  • Normalized autocovariance function (ACF) \(\implies\) \(\rho_\tau=c_{\tau}/c_0=\phi^\tau.\)

  • A larger value of \(|\phi|\) indicates a longer-memory random process.

  • If \(\phi=0\), the process is called a purely random process (white noise).

  • The ACF is even, \(c_{\tau}=c_{-\tau}=c_{|\tau|}\), and so is \(\rho_{\tau}=\rho_{-\tau}=\rho_{|\tau|}.\)

  • The variance matrix \(\Sigma_S\) is fully populated (see the sketch below).
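The fully populated variance matrix above can be constructed directly with NumPy; a minimal sketch, with illustrative values for \(m\), \(\phi\) and \(\sigma\):

import numpy as np

# Build the AR(1) variance matrix with entries sigma^2 * phi^|i-j|
m, phi, sigma = 5, 0.8, 1.0                                   # example values
lags = np.abs(np.subtract.outer(np.arange(m), np.arange(m)))  # matrix of |i - j|
Sigma_S = sigma**2 * phi**lags
print(Sigma_S)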

Later in this section we will see how the coefficient \(\phi\) can be estimated.

Simulated example#

If you have run the Python code on this page, an interactive plot will be displayed below. You can change the value of \(\phi\) to see how the AR(1) process changes. At the start, the process is initialized with \(\phi = 0.8\). Try moving the slider and see the response of the ACF; pay special attention when \(\phi=0\) and when \(\phi\) becomes negative.

Lastly, focus on the cases \(\phi=1\) and \(\phi=-1\). What do you observe? You will notice that the process “explodes”: its variance grows over time. This makes intuitive sense, since the effect of previous epochs is no longer dampened. This also means that the process is no longer stationary. So, the AR(1) process is stationary only if \(|\phi|<1\).

import numpy as np
import matplotlib.pyplot as plt
from statsmodels.graphics.tsaplots import plot_acf
import ipywidgets as widgets
from ipywidgets import interact


def acfplot(phi=0.8):
    # Parameters for the AR(1) process

    sigma = 1          # Standard deviation of the noise
    n = 500            # Length of the time series

    # Initialize the process
    np.random.seed(42)  # For reproducibility
    X = np.zeros(n)
    X[0] = np.random.normal(0, sigma)  # Initial value

    # Generate the AR(1) process and noise series
    noise = np.random.normal(0, sigma, n)  # Pre-generate the full noise series
    for t in range(1, n):
        X[t] = phi * X[t-1] + noise[t]  # Use pre-generated noise

    # Create the 2x1 subplot
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(12, 6), sharex=False)

    # Plot the AR(1) process in the first subplot
    ax1.plot(X, label=f"AR(1) Process with φ={round(phi, 2)}")
    ax1.set_ylabel("Value")
    ax1.set_title("Simulated AR(1) Process")
    ax1.set_xlabel("Time")
    ax1.set_ylim(-4, 4)
    ax1.legend()
    ax1.grid(True)

    # Plot the ACF of the AR(1) process in the second subplot
    lags = 20
    plot_acf(X, ax=ax2, lags=lags, title="ACF of AR(1) Process")
    ax2.set_xlabel("Lag")
    ax2.set_ylabel("ACF Value")
    ax2.set_xlim(-0.5, lags)
    ax2.grid(True)

    
    # Display the plot
    plt.tight_layout()
    plt.show()


interact(acfplot, phi=widgets.FloatSlider(value=0.8, min=-1, max=1.0, step=0.1, description='Phi:'));

Estimation of coefficients of AR process#

If the order \(p\) and the coefficients of the AR(\(p\)) process are not known, the question is: how can we estimate the coefficients \(\phi_1,...,\phi_p\)?

Here, we only elaborate on AR(1) using least-squares to estimate \(\phi_1.\) The method can be generalized to estimate the parameters of an AR(\(p\)) process (See Exercise AR(p)).

Example: Parameter estimation of AR(1)

The AR(1) process is of the form

\[S_t=\phi_1 S_{t-1}+\epsilon_t\]

In order to estimate \(\phi_1\), we can set up the following linear model of observation equations (starting from \(t=2\)):

\[\begin{split}\begin{bmatrix}S_2 \\ S_3 \\ \vdots \\ S_m \end{bmatrix} = \begin{bmatrix}S_1 \\S_2 \\ \vdots\\ S_{m-1} \end{bmatrix}\begin{bmatrix}\phi_1 \end{bmatrix} + \begin{bmatrix}\epsilon_{2} \\ \epsilon_{3}\\ \vdots \\ \epsilon_{m} \end{bmatrix}\end{split}\]

and the \((m-1) \times 1\) \(A\)-matrix now consists of the elements \(S_1\) through \(S_{m-1}\) of the stationary time series \(S_t\). The least-squares estimator of \(\phi_1\) is given by:

\[\hat{\phi}_1=(\mathrm{A}^T\mathrm{A})^{-1}\mathrm{A}^TS\]

where \(\mathrm{A}=\begin{bmatrix}S_1 & S_2 & \cdots & S_{m-1}\end{bmatrix}^T\) and \(S=\begin{bmatrix}S_2 & S_3 & \cdots & S_m\end{bmatrix}^T\).

It can be proved that for an AR(1) process, the least-squares estimator of \(\phi_1\) is approximately equal to \(\hat{\rho}_1\), the sample normalized autocovariance function at lag one, \(\tau = 1\):

\[\hat{\phi}_1 \approx \hat{\rho}_1 = \frac{\hat{C}_1}{\hat{C}_0}\]

where \(\hat{C}_1\) and \(\hat{C}_0\) are the values of the sample (or empirical) autocovariance function at lag one and lag zero, respectively (see ACF).
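A minimal sketch of this estimator on a simulated series (all values illustrative), comparing the least-squares estimate with \(\hat{\rho}_1\):

import numpy as np

# Least-squares estimate of phi_1 for an AR(1) series, compared with rho_hat_1
rng = np.random.default_rng(0)
phi_true, m = 0.8, 5_000
S = np.zeros(m)
for t in range(1, m):
    S[t] = phi_true * S[t - 1] + rng.normal(0, 1)

A = S[:-1]                                         # S_1, ..., S_{m-1}
y = S[1:]                                          # S_2, ..., S_m
phi_hat = (A @ y) / (A @ A)                        # (A^T A)^{-1} A^T S for a single parameter
rho_1 = np.mean(S[:-1] * S[1:]) / np.mean(S * S)   # C_hat_1 / C_hat_0
print(phi_hat, rho_1)                              # both close to phi_true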

Attribution

This chapter was written by Alireza Amiri-Simkooei, Christiaan Tiberius and Sandra Verhagen. Find out more here.