4.1. Time series components

A time series is a sequence of data points indexed in time, which can be used to study a phenomenon. It is a record of data collected at different points in time and typically consists of discrete-time samples of a phenomenon that is continuous in time. The data are usually collected at fixed time intervals rather than intermittently or irregularly. In the time domain, the fixed interval \(\Delta t\) is called the ‘sampling interval’; its reciprocal, the ‘sampling rate’ or ‘sampling frequency’ \(f_s\), is expressed for example in Hz:

\[ \Delta t = \frac{1}{f_s} \]

A real-valued time series is denoted as

\[Y(t) = [Y(t_1), Y(t_2), \ldots{}, Y(t_m)]^T\]

The \(Y(t_i)\) are random variables, since the data are affected by noise.

The time instants, also called epochs, are \(t_i = i \Delta t\), with \(i = 1,\ldots,m\), indicating that the samples are equally spaced in time at intervals of \(\Delta t\). Assuming a unit time interval for convenience in this chapter (i.e., \(\Delta t=1\)), we have \(t_i = i\) and can write the time series as

\[Y(t) = [Y(1), Y(2), \ldots{}, Y(m)]^T = [Y_1, Y_2, \ldots{}, Y_m]^T\]
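The relation between sampling frequency, sampling interval and epochs can be sketched in a few lines of Python. The sampling frequency of 4 Hz and the number of samples \(m = 8\) are illustrative assumptions, not values from this chapter.

```python
import numpy as np

f_s = 4.0          # assumed sampling frequency in Hz (illustrative value)
dt = 1.0 / f_s     # sampling interval in seconds
m = 8              # assumed number of samples

# Epochs t_i = i * dt for i = 1, ..., m
i = np.arange(1, m + 1)
t = i * dt

print(dt)
print(t)
```

Setting `dt = 1.0` reproduces the unit-interval convention used in the rest of this chapter, where \(t_i = i\).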

https://github.com/TUDelft-MUDE/source-files/raw/main/file/time_series.png
Fig. 4.19 Example of time series with equally spaced time interval \(\Delta t\)

A time series can commonly be decomposed as follows:

\[Y(t) = tr(t) + s(t) + o(t) + b(t) + \epsilon(t)\]

where we distinguish the following components:

\(tr(t)\) = trend, provides general behavior and change of the process.
\(s(t)\) = seasonality, shows regular seasonal (cyclic) variations.
\(o(t)\) = offset, is a discontinuity (or jump) in the data.
\(b(t)\) = irregularities and outliers (also referred to as biases), due to unexpected reasons. Irregularities will not be considered in this book.
\(\epsilon(t)\) = noise process, can be white or colored noise.

Note that not all components need to be present in a given time series. The first four components are part of the functional model of the time series; the last component, the noise, is covered by the stochastic model.

Trend

The trend is the general pattern of the time series and shows its long-term changes. The trend can be linear, but higher-order polynomials (and other functions) are also possible.

Fig. 4.20 shows a positive trend (red line) of about \(3.5\) mm/year, which in this case indicates global sea level rise. This, however, needs to be further investigated and tested statistically (see testing and also Modelling and Estimation).

Trend analysis expresses the changes of the variable of interest with respect to the independent variable, time \(t\). Different types of trend are possible; for now we mainly focus on a linear trend, i.e. the time-dependent variable \(Y(t)\) changes at a constant rate over time: \(Y(t) = y_0 + r t + \epsilon(t)\). Other trends are also possible, for example a quadratic trend, which adds a term \(c t^2\), or a log-linear trend, \(\log(Y(t)) = y_0 + r t + \epsilon(t)\).
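As a minimal sketch of how the linear trend model can be fitted, the example below simulates monthly data and estimates \(y_0\) and \(r\) with ordinary least squares. All numerical values (intercept, rate, noise level, time span) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

# Simulated data (all values are illustrative assumptions)
t = np.arange(240) / 12.0                # 20 years of monthly epochs [years]
y0_true, r_true = 10.0, 3.5              # intercept [mm] and rate [mm/year]
y = y0_true + r_true * t + rng.normal(0.0, 2.0, t.size)

# Functional model Y = A x + eps with unknowns x = [y0, r]^T
A = np.column_stack([np.ones_like(t), t])
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
y0_hat, r_hat = x_hat

print(f"estimated intercept: {y0_hat:.2f} mm, rate: {r_hat:.2f} mm/year")
```

A quadratic trend would add a column `t**2` to `A`. For strictly positive data, a log-linear trend could be fitted in the same way by replacing `y` with `np.log(y)`.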

Seasonality

Seasonal variations are regular fluctuations over a certain period of time (e.g. a year), usually caused by climate and weather conditions (e.g. temperature, rainfall), cycles of seasons, customs, traditional habits, weekends, or holidays. For example, a weekly signal is usually evident in the number of people out shopping (more people tend to shop on weekends).

From Fig. 4.20 it is also possible to see the seasonal variations: in fact sea levels are higher in summer and lower in winter. The annual warming/cooling cycle is the main contributor to these seasonal variations.

Regular seasonal variations in a time series might be handled by a sinusoidal model with one or more sinusoids with frequencies that may be known or unknown depending on the context. In Fig. 4.20, cyclical behavior with a period of 1 year can be observed. A harmonic model for seasonal variation can be of the following two equivalent forms (using that \(\cos(u+v)= \cos u \cos v - \sin u \sin v\)):

\[\begin{split} \begin{align*} Y(t) &= \sum_{i=1} ^p A_i \cos(2\pi f_i t + \theta_i) + \epsilon(t)\\ &= \sum_{i=1} ^p \left(a_i \cos(2\pi f_i t) + b_i \sin(2\pi f_i t) \right)+ \epsilon(t) \end{align*} \end{split}\]

Here the coefficients are \(a_i = A_i\cos\theta_i\) and \(b_i=-A_i\sin\theta_i\), and \(f_i\) is the frequency of the \(i\)-th harmonic of the seasonal variation, which is either fixed or determined by Spectral Analysis. More specifically, the power spectral density (PSD) can be used to determine the unknown frequencies.
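To illustrate how a dominant frequency might be read from a spectrum, the sketch below computes a simple periodogram with the FFT and picks the frequency of its largest peak. The periodogram is only one possible PSD estimate and is used here as an assumption; the simulated annual signal and its parameters are likewise illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated monthly data with an annual cycle (illustrative values)
dt = 1.0 / 12.0                          # sampling interval [years]
t = np.arange(240) * dt                  # 20 years of epochs
y = 5.0 * np.cos(2 * np.pi * 1.0 * t + 0.3) + rng.normal(0.0, 1.0, t.size)

# Periodogram of the mean-removed series, up to a constant scale factor
y_centered = y - y.mean()
Y = np.fft.rfft(y_centered)
freqs = np.fft.rfftfreq(t.size, d=dt)    # frequencies [cycles/year]
psd = np.abs(Y) ** 2 / t.size

f_peak = freqs[np.argmax(psd)]
print(f"dominant frequency: {f_peak:.2f} cycles/year")
```

The frequency resolution of this estimate is \(1/(m\Delta t)\), so a longer record allows closely spaced frequencies to be separated more easily.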

Once the \(f_i\) are set, the coefficients \(a_i\) and \(b_i\) can be determined using the least-squares method, since the equation is linear in \(a_i\) and \(b_i\). The amplitudes and phases of the original sinusoids then follow from:

\[ A_i = \sqrt{a_i^2 + b_i^2}, \hspace{1cm} \theta_i = \arctan\left(-\frac{b_i}{a_i}\right), \hspace{1cm} i = 1, \ldots{}, p \]

Note

This transformation is necessary to formulate the seasonal component without an explicit phase. With linear estimation methods, the phase of a sinusoidal function cannot be estimated directly. However, by transforming the sinusoidal function into a linear combination of a sine and a cosine function, we can (indirectly) estimate the phase \(\theta_i\) of the seasonal component.
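The procedure above can be sketched as follows for a single sinusoid with known frequency. The simulated amplitude, phase and noise level are illustrative assumptions. Note that `np.arctan2(-b, a)` is used as the four-quadrant form of \(\arctan(-b/a)\), so that the recovered phase is also correct when \(a < 0\).

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated seasonal signal (illustrative values)
t = np.arange(1, 121) / 12.0             # 10 years of monthly epochs [years]
f1 = 1.0                                 # known frequency [cycles/year]
A_true, theta_true = 4.0, 2.5            # amplitude and phase [rad]
y = A_true * np.cos(2 * np.pi * f1 * t + theta_true) + rng.normal(0.0, 0.5, t.size)

# Model that is linear in a and b: Y = a cos(2 pi f1 t) + b sin(2 pi f1 t) + eps
design = np.column_stack([np.cos(2 * np.pi * f1 * t), np.sin(2 * np.pi * f1 * t)])
(a_hat, b_hat), *_ = np.linalg.lstsq(design, y, rcond=None)

# Back-transformation to amplitude and phase
A_hat = np.hypot(a_hat, b_hat)           # sqrt(a^2 + b^2)
theta_hat = np.arctan2(-b_hat, a_hat)    # four-quadrant version of arctan(-b/a)

print(f"A = {A_hat:.2f}, theta = {theta_hat:.2f} rad")
```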

Worked example - seasonal signal

Show that the time series

\[\mathbb{E}(Y(t))=A \cos(2\pi f_1 t + \theta)\]

with given \(2\pi f_1\), can be rewritten as

\[\mathbb{E}(Y(t))=a \cos(2\pi f_1 t) + b \sin(2\pi f_1 t)\]

and derive the formulation of \(A\) and \(\theta\).

Hint: you might need the trigonometric identity \(\cos(u+v)=\cos(u)\cos(v)-\sin(u)\sin(v)\).

Solution

Using the trigonometric identity, we rewrite:

\( \mathbb{E}(Y(t))=A \cos(2\pi f_1 t + \theta) = A (\cos(2\pi f_1 t)\cos(\theta)-\sin(2\pi f_1 t)\sin(\theta)) \)

Comparing with \(a \cos(2\pi f_1 t) + b \sin(2\pi f_1 t)\) gives the expressions for \(a\) and \(b\):

\( a = A \cos(\theta) \hspace{1cm} b = -A \sin(\theta)\)

Squaring both expressions to eliminate the sine and cosine:

\( a^2 = A^2 \cos^2(\theta) \hspace{1cm} b^2 = A^2 \sin^2(\theta) \)

Adding the two expressions:

\( a^2 + b^2 = A^2 (\cos^2(\theta) + \sin^2(\theta)) \)

Using the following identity to simplify:

\( \cos^2(\theta) + \sin^2(\theta) = 1 \)

\( a^2 + b^2 = A^2 \)

Taking the square root (with \(A \geq 0\)) gives \(A\):

\( \sqrt{a^2 + b^2} = A \)

For \(\theta\) we rewrite the second expression:

\( a = A \cos(\theta) \hspace{1cm} -b = A \sin(\theta) \),

\( \frac{-b}{a} = \frac{\sin \left( \theta \right) }{\cos \left( \theta \right)} = \tan \left( \theta \right) \),

\( \theta = \arctan\left(\frac{-b}{a}\right) \)
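The result can be checked numerically. The short sketch below uses illustrative values for \(A\), \(\theta\) and \(f_1\) (assumptions, not part of the exercise) and confirms that both forms give the same signal and that \(A\) and \(\theta\) are recovered. The phase is chosen such that \(a > 0\), in which case the plain arctangent is sufficient.

```python
import numpy as np

A, theta = 2.0, 0.7              # illustrative amplitude and phase [rad]
omega = 2 * np.pi * 1.0          # given angular frequency 2*pi*f_1
t = np.linspace(0.0, 3.0, 301)

# Coefficients of the linear form
a = A * np.cos(theta)
b = -A * np.sin(theta)

# Both forms of the expectation
form_1 = A * np.cos(omega * t + theta)
form_2 = a * np.cos(omega * t) + b * np.sin(omega * t)
print(np.allclose(form_1, form_2))

# Back-transformation
A_back = np.sqrt(a**2 + b**2)
theta_back = np.arctan(-b / a)   # sufficient here because a > 0
print(A_back, theta_back)
```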


Offset (jump)

Offsets are sudden changes or shifts in a time series. They can have various underlying causes.

https://github.com/TUDelft-MUDE/source-files/raw/main/file/offset.png
Fig. 4.21 Example of time series with two offsets.

Being a deterministic sudden change, an offset can be modelled by a step function, such as the Heaviside step function, with an epoch (time instant) that may be known or unknown (to be detected), depending on the time series.

In this case the time series is written as:

\[ Y(t) = \sum_{k=1}^q o_k u_k(t)+\epsilon(t)\]

where \(q\) is the number of offsets (in Fig. 4.21 there are two offsets, hence \(q=2\)) and each of them is expressed as a Heaviside step function

\[\begin{split}u_k(t) = \left\{ \begin{array}{ll} 0 & \text{if} \hspace{0.3cm} t<t_k \\ 1 & \text{if} \hspace{0.3cm} t\geq t_k \\ \end{array} \right. \end{split}\]

Once the time instant \(t_k\) of an offset is known, its magnitude \(o_k\) can be estimated using least-squares.
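A minimal sketch of this estimation for \(q = 2\) offsets with known epochs is given below. The epochs, offset magnitudes and noise level are illustrative assumptions, and the helper `step` is a hypothetical name introduced for this example.

```python
import numpy as np

rng = np.random.default_rng(7)

t = np.arange(1, 201, dtype=float)       # epochs with unit sampling interval
t_k = np.array([60.0, 140.0])            # assumed known offset epochs
o_true = np.array([3.0, -2.0])           # illustrative offset magnitudes

def step(t, t_k):
    """Heaviside step u_k(t): 0 for t < t_k, 1 for t >= t_k."""
    return (t >= t_k).astype(float)

# Design matrix with one step function per offset (m x q)
U = np.column_stack([step(t, tk) for tk in t_k])
y = U @ o_true + rng.normal(0.0, 0.5, t.size)

# Least-squares estimate of the offset magnitudes
o_hat, *_ = np.linalg.lstsq(U, y, rcond=None)
print(o_hat)
```

If the epochs \(t_k\) are unknown, they have to be detected first; that step is not covered by this sketch.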

Noise

Noise refers to random fluctuations or variations in the time series about its typical pattern. In time series analysis we generally distinguish between white and colored noise. Until now we have only considered normally distributed, zero-mean white noise, i.e., \(\epsilon(t) \sim N(0, \sigma^2_{\epsilon})\), and we did not have to deal with time correlation. In this chapter we will also consider other types of noise and time correlation.
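As a small illustration of what the absence of time correlation means in practice, the sketch below simulates zero-mean Gaussian white noise and computes its sample mean, standard deviation and lag-1 autocorrelation. The value of \(\sigma_{\epsilon}\) and the sample size are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(3)

sigma = 2.0                                  # assumed noise standard deviation
eps = rng.normal(0.0, sigma, size=10_000)    # zero-mean Gaussian white noise

# Sample lag-1 autocorrelation; close to zero for white noise
e = eps - eps.mean()
rho_1 = np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

print(f"mean: {eps.mean():.3f}, std: {eps.std(ddof=1):.3f}, lag-1 correlation: {rho_1:.3f}")
```

For colored noise, the lag-1 correlation would in general differ clearly from zero.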

Example - time series consisting of a trend, annual cycle (seasonality), an offset and pure random noise (white noise)

Such a time series can be written as

\[Y(t) = y_0 + rt + a \cos(2\pi f_1 t) + b \sin(2\pi f_1 t) + o u_k(t) + \epsilon(t)\]

where

\(y_0\) is the intercept at \(t=0\) (e.g. in mm).
\(r\) is the (constant) rate (e.g. in mm/year).
\(a\) and \(b\) are the coefficients of the cycle (e.g. annual cycle).
\(f_1\) is the frequency of the seasonal component (e.g. 1 cycle/year).
\(o\) is the magnitude of the offset starting at time \(t_k\).
\(u_k(t)\) is the Heaviside step function.
\(\epsilon(t)\) is i.i.d. zero-mean Gaussian noise, i.e. \(\epsilon(t) \sim N(0, \sigma_{\epsilon}^2)\).
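The complete model of this example is linear in the unknowns \(y_0\), \(r\), \(a\), \(b\) and \(o\), provided that \(f_1\) and \(t_k\) are known. The sketch below simulates such a time series and estimates the five parameters with ordinary least squares. All "true" values, the monthly sampling and the noise level are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Illustrative "true" values (assumptions for this simulation only)
t = np.arange(240) / 12.0                # monthly epochs over 20 years [years]
y0, r = 5.0, 3.5                         # intercept [mm], rate [mm/year]
a, b, f1 = 8.0, -4.0, 1.0                # cycle coefficients [mm], frequency [cycles/year]
o, t_k = 15.0, 12.0                      # offset [mm] at known epoch t_k [years]
sigma = 3.0                              # white-noise standard deviation [mm]

u = (t >= t_k).astype(float)             # Heaviside step function u_k(t)

# Design matrix: one column per unknown parameter
A = np.column_stack([
    np.ones_like(t),                     # y0
    t,                                   # r
    np.cos(2 * np.pi * f1 * t),          # a
    np.sin(2 * np.pi * f1 * t),          # b
    u,                                   # o
])
x_true = np.array([y0, r, a, b, o])
y = A @ x_true + rng.normal(0.0, sigma, t.size)

# Least-squares estimate of all parameters
x_hat, *_ = np.linalg.lstsq(A, y, rcond=None)
for name, est in zip(["y0", "r", "a", "b", "o"], x_hat):
    print(f"{name}: {est:.2f}")
```

Because the step function and the trend are correlated, the estimated rate and offset are less precise than they would be if either component were estimated alone.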

Attribution

This chapter was written by Alireza Amiri-Simkooei, Christiaan Tiberius and Sandra Verhagen.