Case 3: Discharges on a structure¶
What's the propagated uncertainty? *How large will be the discharge?*
In this project, you have chosen to work on the uncertainty of water depths ($h$) and water velocities ($u$) on top of a hydraulic structure to estimate the discharge. You have observations from physical experiments of waves impacting a breakwater during a wave storm scaled up to prototype scale. You can further read on the dataset here. Remember that the discharge can be computed as
$$ q = u S $$where $S$ is the section the flow crosses. Thus, assuming a discharge width of 1m, we can simplify the previous equation as
$$ q = u h $$The goal of this project is:
- Choose a reasonable distribution function for $d$ and $h$.
- Fit the chosen distributions to the observations of $d$ and $h$.
- Assuming $d$ and $h$ are independent, propagate their distributions to obtain the distribution of $q$.
- Analyze the distribution of $q$.
Importing packages¶
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats
from math import ceil, trunc
plt.rcParams.update({'font.size': 14})
1. Explore the data¶
First step in the analysis is exploring the data, visually and through its statistics.
# Import
h, u = np.genfromtxt('dataset_hu.csv', delimiter=",", unpack=True, skip_header=True)
# plot time series
fig, ax = plt.subplots(2, 1, figsize=(10, 7), layout = 'constrained')
ax[0].plot(h,'k')
ax[0].set_xlabel('Time')
ax[0].set_ylabel('Water depth, h (m)')
ax[0].grid()
ax[1].plot(u,'k')
ax[1].set_xlabel('Time')
ax[1].set_ylabel('Water velocity, u (m/s)')
ax[1].grid()
# Statistics for h
print(stats.describe(h))
# Statistics for u
print(stats.describe(u))
Task 1:
Describe the data based on the previous statistics:
2. Empirical distribution functions¶
Now, we are going to compute and plot the empirical PDF and CDF for each variable. Note that you have the pseudo-code for the empirical CDF in the reader.
Task 2:
Define a function to compute the empirical CDF.
def ecdf(YOUR_INPUT:
#Your code
return YOUR_OUTPUT
#Your plot
Task 3:
Based on the results of Task 1 and the empirical PDF and CDF, select one distribution to fit to each variable. For $h$, select between Uniform or Gaussian distribution, while for $u$ choose between Exponential or Gumbel.
3. Fitting a distribution¶
Task 4:
Fit the selected distributions to the observations using MLE.
Hint: Use Scipy built in functions (watch out with the parameters definition!).
#Your code here
4. Assessing goodness of fit¶
Task 5:
Assess the goodness of fit of the selected distribution using:
Hint: You have Kolmogorov-Smirnov test implemented in Scipy.
#Your code here
Task 6:
Interpret the results of the GOF techniques. How does the selected parametric distribution perform?
5. Propagating the uncertainty¶
Using the fitted distributions, we are going to propagate the uncertainty from $h$ and $u$ to $q$ assuming that $h$ and $u$ are independent.
Task 7:
Draw 10,000 random samples from the fitted distribution functions for $h$ and $u$.
Compute $q$ for each pair of samples.
Compute $q$ for the observations.
Plot the PDF and exceedance curve in logscale of $q$ computed using both the simulations and the observations.
# Here, the solution is shown for the Lognormal distribution
# Draw random samples
rs_h = #Your code here
rs_u = #Your code here
#Compute Fh
rs_q = #Your code here
#repeat for observations
q = #Your code here
#plot the PDF and the CDF
Task 8:
Interpret the figures above, answering the following questions:
- Are there differences between the two computed distributions for $q$?
- What are the advantages and disadvantages of using the simulations?
If you run the code in the cell below, you will obtain a scatter plot of both variables. Explore the relationship between both variables and answer the following questions:
Task 9:
Observe the plot below. What differences do you observe between the generated samples and the observations?
Compute the correlation between $h$ and $u$ for the samples and for the observartions. Are there differences?
What can you improve into the previous analysis? Do you have any ideas/suggestions on how to implement those suggestions?
fig, axes = plt.subplots(1, 1, figsize=(7, 7))
axes.scatter(rs_h, rs_u, 40, 'k', label = 'Simulations')
axes.scatter(h, u, 40, 'r','x', label = 'Observations')
axes.set_xlabel('Wave height, H (m)')
axes.set_ylabel('Wave period, T (s)')
axes.legend()
axes.grid()
#Correlation coefficient calculation here
End of notebook.