4.5. Autocovariance function
Before we can look into the modelling of a stochastic process, for instance using an Autoregressive (AR) model, we first need to introduce the autocovariance function (ACF) for a stationary time series, and describe the relationship between the ACF and the power spectral density (PSD).
As in Observation theory, the variance component is often determined based on the precision of an observable (at a given epoch), and the covariance components quantitatively indicate the statistical dependence between observables. In this case, dependence is inherently introduced by the physical process that produces the time series, and in fact our time series methods seek to (mathematically) account for this.
Autocovariance and autocorrelation
Let us assume an arbitrary stationary (discrete) time series, \(S=[S_1,S_2,...,S_m]^T\), with mean \(\mathbb{E}(S_{i})=\mu\) and variance \(Var(S_{i})=\sigma^2\) for \(i=1,\ldots,m\). Remember that stationarity implies that the statistical properties of the time series do not depend on the time at which it is observed, i.e. expectation and variance are constant over time.
The formal (or: theoretical) autocovariance is defined as

\[
c_{\tau} = Cov(S_{t+\tau}, S_t) = \mathbb{E}\left[(S_{t+\tau}-\mu)(S_t-\mu)\right]
\]
We have that \(Cov(S_{t+\tau}, S_t) =Cov(S_t, S_{t-\tau})\).
The autocovariance depends on \(\tau\), which is referred to as lag; it is the difference in time between \(t+\tau\) and \(t\), and equally the difference between \(t\) and \(t-\tau\). In the sequel we use \(\tau\) both to indicate the time shift in unit of time (e.g. seconds) and the corresponding discrete time index, as at the start of this chapter it was for convenience assumed that \(\Delta t = 1\), and hence \(t_i = i\).
Show that the covariance can be written as:

\[
Cov(S_{t+\tau}, S_t) = \mathbb{E}(S_{t+\tau} S_t) - \mu^2
\]
Solution

\[
Cov(S_{t+\tau}, S_t) = \mathbb{E}\left[(S_{t+\tau}-\mu)(S_t-\mu)\right] = \mathbb{E}(S_{t+\tau}S_t) - \mu\,\mathbb{E}(S_{t+\tau}) - \mu\,\mathbb{E}(S_t) + \mu^2 = \mathbb{E}(S_{t+\tau}S_t) - \mu^2
\]
Prove that \(Cov(S_t, S_{t-\tau}) =Cov(S_{t+\tau}, S_t)\):
Solution
From the definition of covariance, we know that

\[
Cov(X, Y) = \mathbb{E}(XY) - \mathbb{E}(X)\,\mathbb{E}(Y)
\]

Hence, we have that

\[
Cov(S_t, S_{t-\tau}) = \mathbb{E}(S_t S_{t-\tau}) - \mu^2
\]

Due to the stationarity of the time series (invariant against time shift), we have that

\[
\mathbb{E}(S_t S_{t-\tau}) = \mathbb{E}(S_{t+\tau} S_t)
\]

Therefore, we have that

\[
Cov(S_t, S_{t-\tau}) = \mathbb{E}(S_{t+\tau} S_t) - \mu^2 = Cov(S_{t+\tau}, S_t)
\]
and \(c_{\tau} = c_{-\tau}\).
The formal autocorrelation is defined as

\[
r_{\tau} = \mathbb{E}(S_{t+\tau} S_t)
\]
and \(r_{\tau} = r_{-\tau}\).
Note
When we have a zero-mean time series, \(\mu=0\), it follows that \(c_{\tau}=r_{\tau}\), since \(c_{\tau} = \mathbb{E}(S_{t+\tau}S_t) - \mu^2 = r_{\tau} - \mu^2\).
Empirical autocovariance
The autocovariance function of a time series is not known beforehand, and hence needs to be estimated based on the actual observed values. The least-squares method or maximum likelihood method can be used to estimate this empirical (or sample) autocovariance function of a time series. Let us see how!
Least-squares estimation
For a given stationary time series \(S = [S_1,S_2,...,S_m]^T\), the least-squares estimator of the autocovariance function is given by

\[
\hat{C}_{\tau} = \frac{1}{m-\tau}\sum_{t=1}^{m-\tau}(S_{t+\tau}-\mu)(S_t-\mu)
\]
and is an unbiased estimator: \(\mathbb{E}(\hat{C}_{\tau}) = c_\tau\).
The least-squares estimator of the autocorrelation (also called empirical autocorrelation function) is then

\[
\hat{R}_{\tau} = \frac{1}{m-\tau}\sum_{t=1}^{m-\tau} S_{t+\tau} S_t
\]
Maximum likelihood estimation
The maximum likelihood estimator of the autocovariance is given by

\[
\hat{C}_{\tau} = \frac{1}{m}\sum_{t=1}^{m-\tau}(S_{t+\tau}-\mu)(S_t-\mu)
\]
Note that this is a biased estimator, \(\mathbb{E}(\hat{C}_{\tau})\neq c_{\tau}\).
Similarly, the maximum likelihood estimator of the autocorrelation follows as:

\[
\hat{R}_{\tau} = \frac{1}{m}\sum_{t=1}^{m-\tau} S_{t+\tau} S_t
\]
Note that in practice, for large \(m\) and not too large \(\tau\), there is little difference between using \(\frac{1}{m-\tau}\) and \(\frac{1}{m}\).
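As a minimal sketch (assuming NumPy; the helper name is ours, not from the text), both estimators can be computed in a few lines, with the unknown mean \(\mu\) replaced by the empirical mean as discussed in the note below:

```python
import numpy as np

def empirical_autocovariance(s, tau, adjusted=True):
    """Empirical autocovariance at lag tau.

    adjusted=True  -> least-squares estimator, factor 1/(m - tau) (unbiased)
    adjusted=False -> maximum likelihood estimator, factor 1/m (biased)
    The unknown mean mu is replaced by the empirical mean of the series.
    """
    s = np.asarray(s, dtype=float)
    m = len(s)
    d = s - s.mean()                     # subtract the empirical mean
    num = np.sum(d[tau:] * d[:m - tau])  # sum of (S_{t+tau} - mu)(S_t - mu)
    return num / (m - tau) if adjusted else num / m
```

Note that the two versions coincide at \(\tau=0\) and differ only in the normalization factor.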
Note
Here we use capitals for \(\hat{C}_{\tau}\) and \(\hat{R}_{\tau}\) since estimators are always a function of the random observables \(S_t\).
Note
If the mean \(\mu\) is not known, it can in practice be replaced by the empirical mean in the above expressions for the empirical autocovariance function.
Covariance matrix based on autocovariance
The covariance matrix of a stationary time series is symmetric (mirrored in the diagonal) and has constant values along each diagonal (a so-called Toeplitz structure); it looks like

\[
\begin{bmatrix}
c_0 & c_1 & c_2 & \cdots & c_{m-1}\\
c_1 & c_0 & c_1 & \cdots & c_{m-2}\\
c_2 & c_1 & c_0 & \cdots & c_{m-3}\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
c_{m-1} & c_{m-2} & c_{m-3} & \cdots & c_0
\end{bmatrix}
\]
There are \(m\) (co)variance components: one variance component, \(\sigma^2 = c_0\), and \(m-1\) covariance components, \(c_i\) for \(i=1,\ldots,m-1\).
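Given a set of (estimated) autocovariances \(c_0,\ldots,c_{m-1}\), this matrix can be assembled directly; a short sketch using SciPy (the numbers are hypothetical):

```python
import numpy as np
from scipy.linalg import toeplitz

# hypothetical autocovariances c_0, ..., c_4 for a series of m = 5 observations
c = np.array([2.0, 1.2, 0.6, 0.2, 0.1])

# symmetric Toeplitz matrix: entry (i, j) equals c_|i - j|
Sigma = toeplitz(c)
print(Sigma)
```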
Normalized ACF
The normalized autocovariance estimator can directly be obtained from the autocovariance estimator as

\[
\hat{\rho}_{\tau} = \frac{\hat{C}_{\tau}}{\hat{C}_0}
\]

and \(|\hat{\rho}_{\tau}| \leq 1\) for all \(\tau\) (this bound is guaranteed when using the biased ACF \(\hat{C}_{\tau}\) with factor \(\frac{1}{m}\), which is the default implemented in Python).
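A sketch of the normalized ACF in NumPy, using the biased \(\frac{1}{m}\) estimator so that the bound \(|\hat{\rho}_{\tau}| \leq 1\) holds (the cross-check assumes statsmodels is installed):

```python
import numpy as np

def normalized_acf(s, max_lag):
    """Normalized ACF rho_hat_tau = C_hat_tau / C_hat_0 for tau = 0, ..., max_lag,
    using the biased (1/m) autocovariance estimator."""
    s = np.asarray(s, dtype=float)
    m = len(s)
    d = s - s.mean()
    c = np.array([np.sum(d[tau:] * d[:m - tau]) for tau in range(max_lag + 1)]) / m
    return c / c[0]

# optional cross-check (statsmodels also uses the 1/m estimator by default):
# from statsmodels.tsa.stattools import acf
# rho = acf(s, nlags=max_lag, adjusted=False)
```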
Note
The estimated normalized autocovariance is the same as the Pearson correlation coefficient between \(S_{t+\tau}\) and \(S_t\), evaluated as a function of the lag \(\tau\).

In the literature \(\hat{\rho}_{\tau}\) is sometimes referred to as the autocorrelation (coefficient) function.
The variance of the normalized ACF can, for large \(m\), be approximated as

\[
Var(\hat{\rho}_{\tau}) \approx \frac{1}{m}\left(1 + 2\sum_{i=1}^{q}\rho_i^2\right) \quad \text{for } \tau > q
\]

for processes with a decaying autocorrelation function for which \(\rho_i \approx 0\) beyond lag \(q\) (Box et al., 2015) (Eq. (2.1.15)). The variance can be used, only for larger values of \(\tau\), as a check whether \(\rho_{\tau}\) is effectively zero beyond a certain lag. In practice one can substitute \(\hat{\rho}_i\) for \(\rho_i\).
If \(m\) is sufficiently large, \(\hat{\rho}_{\tau}\) is approximately normally distributed:

\[
\hat{\rho}_{\tau} \sim N\left(\rho_{\tau}, \sigma^2_{\hat{\rho}_{\tau}}\right)
\]
Though it is possible to compute \(\hat{\rho}_{\tau}\) up to \(\tau=m-1\), it is recommended in practice to limit \(\tau\), for instance to at most 10 percent of \(m\).
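As an illustration (the helper below is our own sketch, not part of the text), the large-sample variance can be turned into an approximate two-sided significance bound for \(\hat{\rho}_{\tau}\) with \(\tau > q\):

```python
import numpy as np
from scipy.stats import norm

def acf_significance_bound(rho_hat, q, m, alpha=0.05):
    """Bound k * sigma for testing rho_tau = 0 at lags tau > q, based on
    Var(rho_hat_tau) ~ (1 + 2 * sum_{i=1..q} rho_i^2) / m
    (Box et al., 2015, Eq. (2.1.15)); rho_hat[0] must equal 1."""
    rho_hat = np.asarray(rho_hat, dtype=float)
    var = (1.0 + 2.0 * np.sum(rho_hat[1:q + 1] ** 2)) / m
    k = norm.ppf(1.0 - alpha / 2.0)  # two-sided critical value of N(0, 1)
    return k * np.sqrt(var)
```

Estimated values with \(|\hat{\rho}_{\tau}|\) below this bound can be treated as effectively zero.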
Worked example
Let us consider a time series of \(m=100\) observations, with an estimated lag-one autocorrelation \(\hat{\rho}_1\), and \(\hat{\rho}_i \approx 0\) for \(i>1\).
We know that

\[
\hat{\rho}_1 \sim N\left(\rho_1, \sigma^2_{\hat{\rho}_1}\right)
\]

with \(\rho_1\) unknown and

\[
\sigma^2_{\hat{\rho}_1} \approx \frac{1}{m} = 0.01
\]
We will now test whether the estimated autocorrelation is significant. The null hypothesis assumes there is no correlation:
\(\mathcal{H}_0\): \(\rho_1=0\)
\(\mathcal{H}_a\): \(\rho_1 \neq 0\)
Since we know the distribution of \(\hat{\rho}_1\), a suitable test statistic would be:

\[
T = \frac{\hat{\rho}_1}{\sigma_{\hat{\rho}_1}} = \frac{\hat{\rho}_1}{\sqrt{1/m}} \sim N(0,1) \quad \text{under } \mathcal{H}_0
\]
where we would reject \(\mathcal{H}_0\) if \(|T|>k_{\alpha}\). With a false alarm rate of \(\alpha = 0.01\), the critical value can be obtained from the table of the standard normal distribution. Note that we have a two-sided critical region, hence we need to look up the value for \(0.5\alpha\), giving \(k_{\alpha} \approx 2.58\).
In this example, the computed \(|T|\) exceeds the critical value, and hence the null hypothesis is rejected, implying that the autocorrelation at lag \(\tau = 1\) is significant.
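The same test in code, as a sketch (the value of \(\hat{\rho}_1\) below is hypothetical, since the estimate from the example is not reproduced here):

```python
import numpy as np
from scipy.stats import norm

m = 100
rho_1_hat = 0.3   # hypothetical estimate of the lag-one autocorrelation
alpha = 0.01

T = rho_1_hat / np.sqrt(1.0 / m)   # test statistic, N(0, 1) under H0
k = norm.ppf(1.0 - alpha / 2.0)    # two-sided critical value (~2.58)

print(f"T = {T:.2f}, k = {k:.2f}, reject H0: {abs(T) > k}")
```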
A zero-mean stationary noise process consists of \(m=5\) observations, \(y_1,\ldots,y_5\).
What is the least-squares estimate of the normalized ACF at \(\tau=1\)? That is, compute \(\hat{\rho}_{1}\).
Solution
The normalized autocovariance function (ACF) can be estimated from the autocovariance function as:

\[
\hat{\rho}_{\tau} = \frac{\hat{\sigma}(\tau)}{\hat{\sigma}(0)}
\]

where the least-squares estimate of the autocovariance function is:

\[
\hat{\sigma}(\tau) = \frac{1}{m-\tau}\sum_{t=1}^{m-\tau}(y_{t+\tau}-\mu_y)(y_t-\mu_y)
\]
For our application we have \(\mu_y=0\), as we deal with a zero-mean process. We have to compute \(\hat{\sigma}(0)\) and \(\hat{\sigma}(1)\), given as

\[
\hat{\sigma}(0) = \frac{1}{5}\sum_{t=1}^{5} y_t^2
\]

and

\[
\hat{\sigma}(1) = \frac{1}{4}\sum_{t=1}^{4} y_{t+1} y_t
\]

giving

\[
\hat{\rho}_{1} = \frac{\hat{\sigma}(1)}{\hat{\sigma}(0)}
\]
Keep in mind that Python by default uses the maximum likelihood estimator for \(\hat{C}_{\tau}\), i.e. with \(\frac{1}{m}\) instead of \(\frac{1}{m-\tau}\).
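A quick numerical check of such an exercise (the five observations below are hypothetical, since the original values are not reproduced here; the statsmodels comparison assumes that library is available):

```python
import numpy as np
from statsmodels.tsa.stattools import acf

y = np.array([1.0, -0.8, 0.6, -0.4, 0.2])  # hypothetical sample of a zero-mean process
m = len(y)

# least-squares estimates with known mean mu = 0
sig0 = np.sum(y * y) / m                   # sigma_hat(0), factor 1/5
sig1 = np.sum(y[1:] * y[:-1]) / (m - 1)    # sigma_hat(1), factor 1/4
print("least-squares rho_1:", sig1 / sig0)

# statsmodels uses the 1/m estimator by default and subtracts the
# empirical mean, so its value differs slightly from the one above
print("statsmodels rho_1:  ", acf(y, nlags=1, adjusted=False)[1])
```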
Attribution
This chapter was written by Alireza Amiri-Simkooei, Christiaan Tiberius and Sandra Verhagen.