Report#
There are 9 graded tasks in this assignment, each worth 1 point; a score of 9 yields a grade of 10 for this group assignment.
These solutions are given for the De Bilt file.
Task 0.1 (not graded)#
Task 1.2#
How can we use the find_frequency function to find the frequencies of all periodic patterns in the data? In other words, how can we iteratively detect the frequencies in the data? ‘Removing’ periodic patterns is done by actually accounting for them in the functional model, through the A-matrix.
Write your answer as a bulleted list in your report, detailing the procedure.
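Step 1: Run find_frequency on the data, using the current functional model (A-matrix)
Step 2: Identify the dominant frequency in the resulting periodogram
Step 3: Add the column(s) corresponding to the new parameter(s) to the A-matrix
Step 4: Repeat (iterate) until no dominant frequency remains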
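The loop described in these steps can be sketched as follows. This is a minimal illustration on synthetic data, not the assignment's find_frequency function: the helpers `fit_and_residuals` and `dominant_frequency`, the synthetic signal, and the stopping threshold of 100 are all assumptions made for the example. In the assignment the decision to stop is made by inspecting the periodogram.

```python
import numpy as np

def fit_and_residuals(A, y):
    """Ordinary least-squares fit; returns the estimates and the residuals."""
    xhat, *_ = np.linalg.lstsq(A, y, rcond=None)
    return xhat, y - A @ xhat

def dominant_frequency(res, dt=1.0):
    """Frequency of the highest periodogram peak and its ratio to the median power."""
    power = np.abs(np.fft.rfft(res)) ** 2
    freqs = np.fft.rfftfreq(len(res), d=dt)
    k = np.argmax(power[1:]) + 1              # skip the zero frequency
    return freqs[k], power[k] / np.median(power[1:])

# Synthetic daily data (hypothetical): trend + yearly cycle + white noise
rng = np.random.default_rng(0)
t = np.arange(10 * 365)                       # time in days
y = 100 + 0.001 * t + 60 * np.cos(2 * np.pi * t / 365 + 2.8) + rng.normal(0, 10, t.size)

A = np.column_stack([np.ones_like(t, dtype=float), t])   # intercept and slope
found = []
for _ in range(5):                            # safety cap on the number of iterations
    xhat, res = fit_and_residuals(A, y)
    f, ratio = dominant_frequency(res)
    if ratio < 100:                           # ad hoc stopping threshold (assumption)
        break
    found.append(f)
    # account for the detected cycle in the functional model (two new columns)
    A = np.column_stack([A, np.cos(2 * np.pi * f * t), np.sin(2 * np.pi * f * t)])

print("detected frequencies [cycles/day]:", found)
```

Each detected frequency adds a cosine and a sine column, which is equivalent to estimating an amplitude and a phase for that cycle while keeping the model linear in its parameters.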
Task 1.3#
Include the resulting periodogram plot in your report.

Task 1.4 (not graded)#
Task 1.6#
Describe the steps taken in the previous task and the outcomes, and explain in your own words how the dominant frequencies were determined (list them in your report).
How did you decide when to stop? Which frequencies do you consider to be dominant? Include in your report the periodogram plot resulting from running find_frequency for the second time.
Is the final (detrended) time series (i.e. the final least-squares residuals) stationary? Explain. Include your answers in the report.
First run: we fit a linear model (intercept and slope) to the data and find a dominant frequency of 1 cycle per year. This frequency clearly stands out in the PSD (periodogram) of the residuals, so the residual time series is not yet stationary and the step must be repeated.
Second run: we fit a model with a linear trend and now also a yearly cycle. No significant dominant frequency is left; the spectrum (PSD) is fairly flat.
This confirms that the detrended series is stationary (no significant periodic patterns remain). The bottom line is that the functional model needs both the linear trend and the yearly cycle, which results in a stationary time series for the residuals.

Task 2.2#
Include the numerical parameter estimates in your report.
What can you say about the parameters, do the parameter estimates make sense?
What time of the year is the temperature highest (on average)?
Yes, the parameter estimates make sense. The average temperature (in the Netherlands) is about 10 degrees Celsius (the intercept, given in units of 0.1 degrees), and the slope is close to 0 (0.1 degrees Celsius per 1000 days). For the periodic signal, the estimated phase \(\theta\) of the yearly cycle is 2.8, a little less than \(\pi\), which would correspond to half a period. The cosine is therefore shifted to the left by almost half a wavelength, so the temperature is lowest in winter and highest in summer, which makes physical sense for this location in the Northern Hemisphere. The highest daily temperature occurs slightly after July 1st (somewhere mid-July).
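The timing of the maximum can be checked with a short calculation. The sketch below assumes that the yearly term has the form cos(2πt/T + θ) with a positive amplitude, that t is in days with t = 0 on January 1st, and that T = 365.25 days; none of these conventions is stated explicitly in the report, so the result is indicative only.

```python
import numpy as np
from datetime import date, timedelta

theta = 2.8          # estimated phase from the report [rad]
period = 365.25      # length of the yearly cycle [days] (assumption)

# cos(2*pi*t/period + theta) reaches its maximum where the argument equals 2*pi
t_peak = (2 * np.pi - theta) / (2 * np.pi) * period

# assumption: t = 0 corresponds to January 1st (an arbitrary non-leap year is used)
peak_date = date(2021, 1, 1) + timedelta(days=float(t_peak))
print(f"peak about {t_peak:.0f} days after the time origin, around {peak_date:%B %d}")
```

Under these assumptions the peak falls roughly three weeks after July 1st, in line with the statement that the maximum occurs somewhat after the middle of the year.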
Task 3.2#
Include the ACF plot in your report. What can you conclude from this ACF? Do the residuals originate from a white noise process?
The data (time series of residuals) are clearly autocorrelated (colored noise, definitely not white noise); the normalized autocovariance is significantly non-zero, up to about lag 10.
The residuals do not originate from a white noise process; they show some kind of ‘memory’ (over a duration of about 10 days).
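As an illustration of how such a conclusion can be drawn, the sketch below computes the normalized autocovariance of a white noise series and of a colored noise series, and compares both with an approximate 95% band for white noise. The function `normalized_acf`, the synthetic series, and the coefficient 0.8 are assumptions for the example; they are not the assignment's code or data.

```python
import numpy as np

def normalized_acf(x, max_lag):
    """Normalized autocovariance c(tau) / c(0) for tau = 0..max_lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    c0 = np.dot(x, x) / n
    return np.array([np.dot(x[: n - k], x[k:]) / n / c0 for k in range(max_lag + 1)])

rng = np.random.default_rng(1)
white = rng.normal(size=5000)

# colored noise: simple recursion with an illustrative coefficient of 0.8
colored = np.zeros_like(white)
for i in range(1, len(white)):
    colored[i] = 0.8 * colored[i - 1] + white[i]

acf_white = normalized_acf(white, 20)
acf_colored = normalized_acf(colored, 20)
bound = 1.96 / np.sqrt(len(white))        # approximate 95% band for white noise

print("lags of the white series outside the band:  ", np.sum(np.abs(acf_white[1:]) > bound))
print("lags of the colored series outside the band:", np.sum(np.abs(acf_colored[1:]) > bound))
```

For white noise only an occasional lag is expected to fall outside the band, whereas the colored series stays well outside it over many consecutive lags, which is the 'memory' referred to above.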

Task 3.5#
We see hardly any autocorrelation in the residuals of the AR(1) model: apart from the lag 0 entry, which is 1 by definition, the values are approximately zero. To a good approximation, the remaining residuals therefore appear to result from a white noise process, so the AR(1) model is quite a good fit to the residuals of Part 2. If the AR(1) model had not been a good fit, we could have tried a higher-order AR model (which is beyond the scope of MUDE).
The estimated AR(1) coefficient \(\hat{\phi}_1\) is 0.81 (for De Bilt), which can also be read from the ACF plot of Task 3.1 (at lag 1).
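The link between the AR(1) coefficient and the lag-1 value of the ACF can be illustrated with a small simulation. This is a sketch on synthetic data: the series is generated with the reported value 0.81, the coefficient is estimated by regressing each value on its predecessor, and the helper `lag1_corr` is a hypothetical name. It is not necessarily how the coefficient was estimated in the assignment.

```python
import numpy as np

rng = np.random.default_rng(42)
phi_true = 0.81                      # value reported for De Bilt, used here to simulate
n = 10000
e = np.zeros(n)
for i in range(1, n):
    e[i] = phi_true * e[i - 1] + rng.normal()

# least-squares estimate of phi: regress e[t] on e[t-1]
phi_hat = np.dot(e[:-1], e[1:]) / np.dot(e[:-1], e[:-1])

# what remains after removing the AR(1) part
eps = e[1:] - phi_hat * e[:-1]

def lag1_corr(x):
    """Normalized autocovariance at lag 1."""
    x = x - x.mean()
    return np.dot(x[:-1], x[1:]) / np.dot(x, x)

print(f"phi_hat = {phi_hat:.3f}")
print(f"lag-1 correlation before: {lag1_corr(e):.3f}, after: {lag1_corr(eps):.3f}")
```

The lag-1 correlation of the simulated series is close to the estimated coefficient, and it drops to approximately zero once the AR(1) part has been removed.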

Task 4.1#
Include the numerical predicted temperature values in your report.
y_P1 = 24.387 [0.1 deg]
y_P5 = 32.123 [0.1 deg]
Task 5.1#
Compute and plot the daily temperature averaged over the 30-year time frame from 1991 to 2020. Include the resulting plot in your report.

Task 5.2#
Compute the residuals of the daily mean temperature time series (the input from Part 0) with respect to the ‘daily temperature averaged over the 30-year time frame from 1991 to 2020’ just computed in Task 5.1.
Finally compute and plot the normalized autocovariance function (ACF) of these residuals (include the ACF plot of the residuals in your report), and comment on this plot, e.g. comparing it with the ACF from Task 3.1.
The autocorrelation of the residuals using the ‘KNMI-functional-model’ (30-year average) is very similar to what we obtained with our functional model (linear trend plus cosine) in Task 3.1.
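One way to compute such a 30-year daily average and the corresponding residuals is sketched below. The temperature series is a synthetic stand-in (the De Bilt file is not read here), and days are grouped by day-of-year index, which is a simplification: after February 29th in leap years the index is shifted by one day relative to the calendar date.

```python
import numpy as np

# Synthetic stand-in for the daily mean temperatures, 1991-2020 (units of 0.1 deg C)
dates = np.arange("1991-01-01", "2021-01-01", dtype="datetime64[D]")
doy = (dates - dates.astype("datetime64[Y]")).astype(int)    # 0 = January 1st
rng = np.random.default_rng(7)
temp = 100 + 70 * np.cos(2 * np.pi * doy / 365.25 + 2.8) + rng.normal(0, 30, dates.size)

# mean per day of the year: sum per day-of-year index divided by its number of occurrences
counts = np.bincount(doy)
climatology = np.bincount(doy, weights=temp) / counts

# residuals with respect to the 30-year daily average
residuals = temp - climatology[doy]

print("number of days:", dates.size, "| length of the daily average:", climatology.size)
```

The array `residuals` plays the role of the residuals in Task 5.2; its normalized autocovariance can then be computed and plotted in the same way as for Task 3.1.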

By Caspar Jungbacker, Delft University of Technology. CC BY 4.0; more info on the Credits page of Workbook.