Notebook exercise: playing with the weights#
In this interactive notebook you will fit several models to a time series of height observations of a point on a glacier, to assess whether it is melting.
Click –> Live Code on the top right corner of this screen and then wait until all cells are executed.
Learning objectives:
apply weighted least-squares
discuss the impact of choosing different weight matrices
You have a time series of 8 measured heights and fit a model assuming a linear trend (constant velocity).
Times of observations, observed values and number of observations are given as the variables t
, y
and m
.
numpy
and matplotlib.pyplot
are abbreviated as np
and plt
, respectively.
# this will print all float arrays with 3 decimal places
import numpy as np
float_formatter = "{:.2f}".format
np.set_printoptions(formatter={'float_kind':float_formatter})
import matplotlib.pyplot as plt
import micropip
await micropip.install("ipywidgets")
import ipywidgets as widgets
from IPython.display import display
import operator
def check_answer(variable_name, expected, comparison = operator.eq):
output = widgets.Output()
button = widgets.Button(description="Check answer")
def _inner_check(button):
with output:
if comparison(globals()[variable_name], expected):
output.outputs = [{'name': 'stdout', 'text': 'Correct!', 'output_type': 'stream'}]
else:
output.outputs = [{'name': 'stdout', 'text': 'Incorrect!', 'output_type': 'stream'}]
button.on_click(_inner_check)
display(button, output)
# times of observation [months]
t = np.arange(8)
# observed heights [m]
y = [1.39, 1.26, 1.48, 4.03, 5.89, 5.14, 6.47, 7.64]
# number of observations
m = len(t)
print(f'The observation times [months] are\t: t = {t}')
print(f'The observed heights [meters] are\t: y = {y}')
print(f'The number of observations is\t\t: m = {m}')
A_ans = np.column_stack((np.ones(m), t))
W_1_ans = np.eye(m)
w=5
W_2_ans = np.eye(m)
W_2_ans[0,0] = w
W_2_ans[1,1] = w
W_2_ans[2,2] = w
W_3_ans = np.eye(m)
W_3_ans[3,3] = w
W_3_ans[4,4] = w
xhat_1_ans = np.linalg.solve(A_ans.T @ W_1_ans @ A_ans, A_ans.T @ W_1_ans @ y)
xhat_2_ans = np.linalg.solve(A_ans.T @ W_2_ans @ A_ans, A_ans.T @ W_2_ans @ y)
xhat_3_ans = np.linalg.solve(A_ans.T @ W_3_ans @ A_ans, A_ans.T @ W_3_ans @ y)
xhat_ans = np.column_stack((xhat_1_ans,xhat_2_ans,xhat_3_ans))
The observation times [months] are : t = [0 1 2 3 4 5 6 7]
The observed heights [meters] are : y = [1.39, 1.26, 1.48, 4.03, 5.89, 5.14, 6.47, 7.64]
The number of observations is : m = 8
Fill in the correct \(\mathrm{A}\)-matrix.
A = ?
check_answer("A", A_ans, np.array_equiv)
Solution
In case you did not get the right solution, use:
A = np.column_stack((np.ones(m), t))
Case 1: Define the weight matrix for \(\mathrm{W}=I_m\)
W_1 = ?
check_answer("W_1",np.eye(m), np.array_equiv)
Case 2: Define the weight matrix with the weight of first 3 observations is five times as large as the rest
W_2 = ?
check_answer("W_2",W_2_ans, np.array_equiv)
Case 3: Define the weight matrix with the weight of 4th and 5th observation is five times as large as the rest
W_3 = ?
check_answer("W_3",W_3_ans, np.array_equiv)
Solution
In case you did not get the right solution, you can use:
W_1 = np.eye(m)
w=5
W_2 = np.eye(m)
W_2[0,0] = w
W_2[1,1] = w
W_2[2,2] = w
W_3 = np.eye(m)
W_3[3,3] = w
W_3[4,4] = w
Compute the weighted least-squares estimate for all three cases.
# compute weighted least squares estimates for all cases
xhat_1 = ?
xhat_2 = ?
xhat_3 = ?
print('Estimated parameters:')
print('Case 1:\t', xhat_1)
print('Case 2:\t', xhat_2)
print('Case 3:\t', xhat_3)
Solution
In case you did not get the right solution, you can use:
xhat_1 = np.linalg.solve(A.T @ W_1 @ A, A.T @ W_1 @ y)
Similarly for case 2 and 3.
def plot_results():
yhat_1 = A_ans @ xhat_1_ans
yhat_2 = A_ans @ xhat_2_ans
yhat_3 = A_ans @ xhat_3_ans
plt.figure()
plt.rc('font', size=14)
plt.plot(t, y, 'kx', label='observations')
plt.plot(t, yhat_1, color='r', label='case 1')
plt.plot(t, yhat_2, color='b', label='case 2')
plt.plot(t, yhat_3, color='c', label='case 3')
plt.xlim(-0.2, (m-1)+0.2)
plt.xlabel('time')
plt.ylabel('observation')
plt.legend(loc='best')
plt.show()
Running the next cell will plot the fitted models. Note that the code with the function plot_results
is hidden.
plot_results()
Solution
The ordinary least-squares fit (\(W=I_m\)) is ‘in the middle’ as expected, since all observations get equal weight. For case 2, the first 3 observations get a larger weight, so the line is pulled ‘down’ to result in smaller errors for those 3 observations. For case 3, the opposite effect occurs, since the two observations that get the largest weight are also ‘above’ the red line.
You can play with the weight factor of case 2 and 3 to see how the impact changes, depending on its size.
Attribution
This chapter was written by Sandra Verhagen. Find out more here.