Case 1: Wave impacts on a crest wall
What's the propagated uncertainty? *How large will the horizontal force be?*
In this project, you have chosen to work on the uncertainty of wave periods and wave heights in the Alboran Sea to estimate the impacts on a crest wall: a concrete element installed on top of a mound breakwater. You have hourly buoy observations of the significant wave height ($H$) and the peak wave period ($T$) spanning several years. As you know, $H$ and $T$ are hydrodynamic variables relevant for estimating wave impacts on the structure. The maximum horizontal force (exceeded by 0.1% of incoming waves) can be estimated using the following equation (USACE, 2002).
$$ F_h = \left( A_1 + A_2 \frac{H}{A_c} \right) \rho g C_h L_{0p} $$
where $A_1=-0.016$ and $A_2=0.025$ are coefficients that depend on the geometry of the structure, $A_c=3\,\mathrm{m}$ is the elevation of the frontal berm of the structure, $\rho$ is the density of water, $g$ is the gravitational acceleration, $C_h=2\,\mathrm{m}$ is the crown wall height, and $L_{0p}=\frac{gT^2}{2\pi}$ is the wavelength in deep water. Thus, the previous equation reduces to
$$ F_h = 255.4 H T^2 - 490.4 T^2 $$
The goals of this project are:
- Choose a reasonable distribution function for $H$ and $T$.
- Fit the chosen distributions to the observations of $H$ and $T$.
- Assuming $H$ and $T$ are independent, propagate their distributions to obtain the distribution of $F_h$.
- Analyze the distribution of $F_h$.
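As a quick sanity check on the reduced equation, the sketch below recomputes its two coefficients from the full formula. The text does not state the values of $\rho$ and $g$, so `rho = 1000` kg/m³ and `g = 9.81` m/s² are assumptions made here; the function names `force_full` and `force_reduced` are also illustrative only.

```python
import math

# Values stated in the text
A1, A2 = -0.016, 0.025   # geometry coefficients
A_c = 3.0                # elevation of the frontal berm (m)
C_h = 2.0                # crown wall height (m)

# Assumed values, NOT stated in the text
rho = 1000.0             # water density (kg/m^3)
g = 9.81                 # gravitational acceleration (m/s^2)

# Since L_0p = g T^2 / (2 pi), both terms share the factor rho * g * C_h * g / (2 pi)
common = rho * g * C_h * g / (2 * math.pi)
coef_HT2 = A2 / A_c * common   # multiplies H * T^2
coef_T2 = A1 * common          # multiplies T^2 (negative)


def force_full(H, T):
    """Full formula, using the assumed rho and g."""
    L0p = g * T**2 / (2 * math.pi)
    return (A1 + A2 * H / A_c) * rho * g * C_h * L0p


def force_reduced(H, T):
    """Reduced formula exactly as written in the text."""
    return 255.4 * H * T**2 - 490.4 * T**2


print(coef_HT2, coef_T2)
print(force_full(5.0, 10.0), force_reduced(5.0, 10.0))
```

Under these assumptions the recomputed coefficients agree with 255.4 and 490.4 to within roughly 0.1%. Note also that the bracket $A_1 + A_2 H / A_c$ is negative for $H < 1.92$ m, so the formula returns negative values for small wave heights.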
Importing packages
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from math import ceil, trunc
plt.rcParams.update({'font.size': 14})
1. Explore the data
The first step in the analysis is to explore the data, both visually and through its summary statistics.
# Import
_, H, T = np.genfromtxt('dataset_HT.csv', delimiter=",", unpack=True, skip_header=True)
# plot time series
fig, ax = plt.subplots(2, 1, figsize=(10, 7), layout = 'constrained')
ax[0].plot(H,'k')
ax[0].set_xlabel('Time')
ax[0].set_ylabel('Wave height, H (m)')
ax[0].grid()
ax[1].plot(T,'k')
ax[1].set_xlabel('Time')
ax[1].set_ylabel('Wave period, T (s)')
ax[1].grid()
# Statistics for H
print(stats.describe(H))
# Statistics for T
print(stats.describe(T))
Task 1:
Describe the data based on the previous statistics:
2. Empirical distribution functions
Now, we are going to compute and plot the empirical PDF and CDF for each variable. Note that you have the pseudo-code for the empirical CDF in the reader.
Task 2:
Define a function to compute the empirical CDF. Plot your empirical PDF and CDF.
def ecdf(YOUR_INPUTS):
    # your code
    return YOUR_OUTPUT
# Your plot here
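One possible shape for an empirical CDF function is sketched below. It assumes the plotting position $i/(n+1)$ for the $i$-th sorted value, which may differ from the pseudo-code in the reader, and it runs on synthetic stand-in data rather than the buoy observations. The name `ecdf_sketch` is hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt


def ecdf_sketch(values):
    """Return the sorted values and their non-exceedance probabilities i / (n + 1)."""
    x = np.sort(np.asarray(values, dtype=float))
    n = x.size
    p = np.arange(1, n + 1) / (n + 1)
    return x, p


# Synthetic stand-in sample, NOT the observations from dataset_HT.csv
rng = np.random.default_rng(42)
sample = rng.exponential(scale=1.0, size=500)

x, p = ecdf_sketch(sample)

fig, ax = plt.subplots(1, 2, figsize=(10, 4), layout='constrained')
# Empirical PDF approximated by a density-normalised histogram
ax[0].hist(sample, bins=30, density=True, edgecolor='k')
ax[0].set_xlabel('x')
ax[0].set_ylabel('Empirical PDF')
# Empirical CDF drawn as a step function
ax[1].step(x, p, where='post', color='k')
ax[1].set_xlabel('x')
ax[1].set_ylabel('Empirical CDF')
ax[1].grid()
```

Dividing by $n+1$ rather than $n$ keeps the largest observation below a probability of 1, which is convenient when the exceedance probability is later plotted on a log scale.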
Task 3:
Based on the results of Task 1 and the empirical PDF and CDF, select one distribution to fit to each variable. For $H$, choose between the Exponential and Gaussian distributions; for $T$, choose between the Uniform and Gumbel distributions.
3. Fitting a distribution
Task 4:
Fit the selected distributions to the observations using MLE.
Hint: Use SciPy's built-in functions (pay attention to how the parameters are defined!).
#Your code here
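To illustrate the parameter-definition caveat in the hint, here is a minimal sketch of MLE fitting with `scipy.stats`. It uses synthetic samples with hypothetical parameters, and the Exponential and Gumbel choices are only examples, not the expected answer to Task 3.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-in samples with hypothetical parameters, NOT the observations
h_sample = stats.expon.rvs(loc=0.0, scale=1.2, size=2000, random_state=rng)
t_sample = stats.gumbel_r.rvs(loc=5.0, scale=1.5, size=2000, random_state=rng)

# .fit() returns the MLE estimates ordered as (shape parameters..., loc, scale).
# For the exponential, scale = 1 / lambda; floc=0 fixes the location at zero,
# which is a modelling choice made here for illustration.
loc_h, scale_h = stats.expon.fit(h_sample, floc=0)

# For the Gumbel, loc is the location (mode) and scale the spread parameter
loc_t, scale_t = stats.gumbel_r.fit(t_sample)

print(f"Exponential: loc={loc_h:.3f}, scale={scale_h:.3f}")
print(f"Gumbel:      loc={loc_t:.3f}, scale={scale_t:.3f}")
```

With the location fixed at zero, the fitted exponential scale equals the sample mean. If the location is left free, SciPy estimates it as well, so the returned scale no longer equals the mean.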
4. Assessing goodness of fit
Task 5:
Assess the goodness of fit of the selected distribution using:
Hint: The Kolmogorov-Smirnov test is implemented in SciPy.
#Your code here
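A minimal sketch of the Kolmogorov-Smirnov test with `scipy.stats.kstest` follows. It again uses a synthetic Gumbel sample with hypothetical parameters, so the numbers it prints say nothing about the buoy data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-in sample with hypothetical parameters, NOT the observations
t_sample = stats.gumbel_r.rvs(loc=5.0, scale=1.5, size=1000, random_state=rng)

# Fit by MLE, then compare the sample against the fitted CDF
params = stats.gumbel_r.fit(t_sample)
ks = stats.kstest(t_sample, stats.gumbel_r.cdf, args=params)

# statistic: largest vertical distance between empirical and fitted CDF
# pvalue: probability of a distance at least this large under the null hypothesis
print(f"KS statistic = {ks.statistic:.4f}, p-value = {ks.pvalue:.4f}")
```

Keep in mind that the standard p-value assumes the distribution parameters were specified in advance. When they are estimated from the same sample, as here, the p-value tends to be optimistic, so it is best read alongside graphical checks.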
Task 6:
Interpret the results of the GOF techniques. How does the selected parametric distribution perform?
5. Propagating the uncertainty
Using the fitted distributions, we are going to propagate the uncertainty from $H$ and $T$ to $F_h$ assuming that $H$ and $T$ are independent.
Task 7:
Draw 10,000 random samples from the fitted distribution functions for $H$ and $T$.
Compute $F_h$ for each pair of samples.
Compute $F_h$ for the observations.
Plot the PDF and the exceedance curve (on a logarithmic scale) of $F_h$ computed from both the simulations and the observations.
# Use the distributions fitted in Task 4
# Draw random samples
rs_H = #your code here
rs_T = #your code here
#Compute Fh
rs_Fh = #your code here
#repeat for observations
Fh = #your code here
#plot the PDF and the CDF
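The Monte Carlo propagation described in Task 7 can be sketched as follows. The distribution families and parameter values are hypothetical placeholders, and the `demo_` names are used so that the sketch does not overwrite the notebook's own `rs_H`, `rs_T`, and `rs_Fh`.

```python
import numpy as np
from scipy import stats


def horizontal_force(H, T):
    """Reduced force equation from the introduction."""
    return 255.4 * H * T**2 - 490.4 * T**2


rng = np.random.default_rng(2)
n = 10_000

# Hypothetical fitted distributions, drawn independently of each other
demo_H = stats.expon.rvs(loc=0.0, scale=1.2, size=n, random_state=rng)
demo_T = stats.gumbel_r.rvs(loc=5.0, scale=1.5, size=n, random_state=rng)

# Propagate: evaluate the force for every simulated (H, T) pair
demo_Fh = horizontal_force(demo_H, demo_T)

# Empirical exceedance probability P(F_h > f), using the i / (n + 1) plotting position
fh_sorted = np.sort(demo_Fh)
exceedance = 1.0 - np.arange(1, n + 1) / (n + 1)

# Example read-out: the simulated force exceeded with probability of about 1%
idx = np.searchsorted(-exceedance, -0.01)
print(f"F_h exceeded with ~1% probability: {fh_sorted[idx]:.0f}")
```

Plotting `exceedance` against `fh_sorted` with a logarithmic vertical axis gives the exceedance curve. The same two lines applied to the force computed from the observations allow a direct comparison.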
Task 8:
Interpret the figures above, answering the following questions:
- Are there differences between the two computed distributions for $F_h$?
- What are the advantages and disadvantages of using the simulations?
If you run the code in the cell below, you will obtain a scatter plot of both variables. Explore the relationship between both variables and answer the following questions:
Task 9:
Observe the plot below. What differences do you observe between the generated samples and the observations?
Compute the correlation between $H$ and $T$ for the samples and for the observations. Are there differences?
What could you improve in the previous analysis? Do you have any ideas on how to implement those improvements?
fig, axes = plt.subplots(1, 1, figsize=(7, 7))
axes.scatter(rs_H, rs_T, s=40, c='k', label='Simulations')
axes.scatter(H, T, s=40, c='r', marker='x', label='Observations')
axes.set_xlabel('Wave height, H (m)')
axes.set_ylabel('Wave period, T (s)')
axes.legend()
axes.grid()
#Correlation coefficient calculation here
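As an illustration of the correlation calculation, the sketch below compares Pearson's coefficient for an independently sampled pair with that of an artificially dependent pair. Both data sets are synthetic, and the dependence model is invented purely for demonstration; it is not a claim about how $H$ and $T$ relate in the Alboran Sea.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 5000

# Independent pair: mimics sampling H and T separately from fitted marginals
indep_H = rng.exponential(1.2, n)
indep_T = rng.gumbel(5.0, 1.5, n)

# Dependent pair: T is built from H plus noise (invented relationship)
dep_H = rng.exponential(1.2, n)
dep_T = 4.0 + 1.5 * np.sqrt(dep_H) + rng.normal(0.0, 0.5, n)

# pearsonr returns the coefficient and a p-value; only the coefficient is kept
r_indep, _ = stats.pearsonr(indep_H, indep_T)
r_dep, _ = stats.pearsonr(dep_H, dep_T)

print(f"Independent samples: r = {r_indep:.3f}")
print(f"Dependent samples:   r = {r_dep:.3f}")
```

Independently drawn samples give a coefficient near zero by construction, whatever the true dependence in the observations. This is the kind of difference the scatter plot is meant to reveal.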
End of notebook.