Modelling surface uplift due to groundwater variations#
Part 1: Exploring the data#
Within this assignment you will work with two types of data: InSAR data and GNSS data. The cell below will load the data and visualize the observed displacements time. In this task we use the package pandas, which is really useful for handling time series. We will learn how to use it later in the quarter; for now, you only need to recognize that it imports the data as a dataframe object, which we then convert into a numpy array using the code below.
import numpy as np
import scipy as sc
import pandas as pd
import matplotlib.pyplot as plt
from scipy.stats.distributions import chi2
from scipy.stats import norm
np.set_printoptions(precision=3)
import os
from urllib.request import urlretrieve
def findfile(fname):
    if not os.path.isfile(fname):
        print(f"Downloading {fname}...")
        urlretrieve('http://files.mude.citg.tudelft.nl/'+fname, fname)
findfile('gnss_observations.csv')
findfile('insar_observations.csv')
gnss = pd.read_csv('./gnss_observations.csv')
gnss_dates = pd.to_datetime(gnss['Dates'])
gnss_doy = (gnss['Day of Year']).to_numpy() 
gnss_obs = (gnss['Observations[m]']).to_numpy()
insar = pd.read_csv('./insar_observations.csv')
insar_dates = pd.to_datetime(insar['Dates'])
insar_doy = (insar['Day of Year']).to_numpy()
insar_obs = (insar['Observations[m]']).to_numpy()
Task 1.1
Once you have used the cell above to import the data, investigate the data sets using the code cell below. Then provide some relevant summary information in a Markdown cell.
The function below gives some examples of the quantitative and qualitative ways you could have looked at the data. Feel free to use it for your imported data. It is also always a good idea to try and plot the data for inspection. Describe aspects that are relevant to our problem. We use a dictionary to easily access the different data series using their names, which are entered as the dictionary keys (also not expected of you, but it’s hopefully fun to learn useful tricks).
def print_summary(data):
    '''Summarize an array with simple print statements.'''
    print('Minimum =     ', data.min())
    print('Maximum =     ', data.max())
    print('Mean =        ', data.mean())
    print('Std dev =     ', data.std())
    print('Shape =       ', data.shape)
    print('First value = ', data[0])
    print('Last value =  ', data[-1])
    print('\n')
          
print('Summary for array: ')
print('------------------------------------------------')
### YOUR CODE HERE ###
Part 2: Constructing the design matrix and observation vector#
As a starting point you assume that the surface at the observed location is subject to a linear increasing trend with a constant rate due to the injection by the powerplant, plus a seasonal effect driven by fluctuating precipitation, temperature and evapotranspiration. The observation equations that you will work with are:
where \(d\) is the observed displacement, and \(t\) is time (known), and the phase is assumed to be known: \(\phi= \frac{\pi}{2}\).
The model has 3 unknowns:
\(d_0\), the initial displacement at \(t_0\)
\(v\) the displacement velocity
\(A\) the amplitude of the seasonal variations
Your task is to construct the functional model that is defined as \(\mathbb{E}(Y) = \mathrm{A x}\).
Task 2.1
Construct the design matrix A and the observation vector \(Y\) (for both the InSAR and GNSS observations) and show the first 5 rows of them.
# Construct design matrix and observation vector for GNSS and InSAR: A_gnss, y_gnss
A_gnss = ### YOUR CODE HERE ###
y_gnss = ### YOUR CODE HERE ###
A_insar = ### YOUR CODE HERE ###
y_insar = ### YOUR CODE HERE ###
Task 2.2
Answer the following questions:
What is the dimension of the vector with the observables \(Y\)?
What are the unknowns of the functional model?
What is the redundancy for this model?
Hint: You may want to include these details in your report
Part 3: Set-up stochastic model#
We will use the Best Linear Unbiased Estimator (BLUE) to solve for the unknown parameters. Therefore we also need a stochastic model, which is defined as
where \(\Sigma_{Y}\) is the covariance matrix of the observable vector.
Task 3.1
Construct the covariance matrix and assume that
\(\sigma_\textrm{InSAR} = 5\) mm and \(\sigma_\textrm{GNSS} = 15\) mm
the observables are normally distributed
and that the observables are independent.
# Stochastic model for GNSS: Sigma_Y_gnss
### YOUR CODE HERE ###
# Stochastic model for insar: Sigma_Y_insar
### YOUR CODE HERE ###
Task 3.2
Answer the following questions:
What information is contained in the covariance matrix?
How do you implement the assumption that all observations are independent?
What is the dimension of \(\Sigma_{Y}\)?
How do you create \(\Sigma_{Y}\)?
Hint: You may want to include the information from your answers to questions 1-3 from this task in your report
Part 4: Apply best linear unbiased estimation (BLUE)#
Task 4.1
Write a function to apply BLUE in the cell below and use the function to estimate the unknowns for the model using the data. Print the values for the estimated parameters
def BLUE(A, y, Sigma_Y):
    """Calculate the Best Linear Unbiased Estimator
    
    Write a docstring here (an explanation of your function).
    
    Function to calculate the Best Linear Unbiased Estimator
    
    Input:
        A = A matrix (mxn)
        y = vector with observations (mx1)
        Sigma_Y = Variance covariance matrix of the observations (mxm)
    
    Output:
        xhat = vector with the estimates (nx1)
        Sigma_Xhat = variance-covariance matrix of the unknown parameters (nxn)
    """
    
    ### YOUR CODE HERE ###
    # Sigma_Xhat = ### YOUR CODE HERE ###
    # xhat = ### YOUR CODE HERE ###
    
    return xhat, Sigma_Xhat  
# Estimate the unknown parameters with GNSS and InSAR: xhat_gnss, Sigma_Xhat_gnss, xhat_insra, Sigma_Xhat_insar
### YOUR CODE HERE ###
Task 4.2
Do the values that you just estimated make sense? Explain, using quantitative results.
Part 5: Evaluate the precision#
Task 5.1:
What is the precision of the final estimates? Print the full covariance matrix of your estimates, and give an interpretation of the numbers in the covariance matrix.
# Precision of GNSS and InSAR
### YOUR CODE HERE ###
Part 6: Present and reflect on estimation results#
Task 6.1
Compute the modeled displacements (\(\hat{\mathrm{y}}\)), and corresponding residuals (\(\hat{\mathrm{\epsilon}}\)). Visualize the results in two graphs and add the confidence bounds (\(t\)-versus-displacement and \(t\)-versus-residuals).
Also create a histogram of the residuals where you plot the normal distribution (which you can estimate from the histogram) as well and report the mean and sigma of the residuals.
Note that this Task gives you a lot of freedom for how you can set up your figures. For example, you could write a function to create the plots, then use it twice, once for each dataset. This Task might also take a significant amount of time, so plan accordingly.
def plot_residual(### YOUR CODE HERE ###):
    #### YOUR CODE HERE ###
# _, _ = plot_residual(### YOUR CODE HERE ###)
Task 6.2:
Answer the following questions:
When you plot the residuals vs time, do you see any systematic effect? Give your interpretation for any discrepancy between observations and the model?
What is the mean value of the residuals?
What is the standard deviation of the residuals?
What can you conclude when you compare the histogram of the data with the computed normal distribution?
Did you use quantitative results in your answers?
Hint: You may want to include these details in your report
Part 7: Compare results between different datasets#
Task 7.1:
Compare the results you found for the InSAR observations and the GNSS observations in the questions above. Discuss the differences between the results, including the precision. Be quantitative and try to explain the differences based on the differences of both datasets!
Hint: You may want to include these details in your report
Part 8: How to deal with 2 datasets?#
Data acquisition and processing comes with a price. Note that in a real situation you would not look at a time series of only one point. For Sentinel-1 data you may have to pay, collecting GNSS data at different locations also costs money.
Task 8.1
What would you advise the authorities in terms of monitoring: use GNSS, InSAR or both? And how would you change you processing strategy if you have both GNSS and InSAR data at your disposal?
Hint: You may want to include these details in your report
By Lina Hagenah and Sandra Verhagen, Delft University of Technology. CC BY 4.0, more info on the Credits page of Workbook.