Group Assignment 1.7: Distribution Fitting¶


CEGM1000 MUDE: Week 7, Friday Oct 18, 2024.

Case 1: Wave impacts on a crest wall¶

What's the propagated uncertainty? *How large will the horizontal force be?*

In this project, you have chosen to work on the uncertainty of wave periods and wave heights in the Alboran Sea to estimate the impacts on a crest wall: a concrete element installed on top of a mound breakwater. You have hourly buoy observations of the significant wave height ($H$) and the peak wave period ($T$) spanning several years. As you know, $H$ and $T$ are hydrodynamic variables relevant for estimating wave impacts on the structure. The maximum horizontal force (exceeded by 0.1% of incoming waves) can be estimated using the following equation (USACE, 2002).

$$ F_h = \left( A_1 + A_2 \frac{H}{A_c} \right) \rho g C_h L_{0p} $$

where $A_1=-0.016$ and $A_2=0.025$ are coefficients that depend on the geometry of the structure, $A_c=3\,\mathrm{m}$ is the elevation of the frontal berm of the structure, $\rho$ is the density of water, $g$ is the gravitational acceleration, $C_h=2\,\mathrm{m}$ is the crown wall height, and $L_{0p}=\frac{gT^2}{2\pi}$ is the deep-water wavelength. Substituting these values, the previous equation reduces to

$$ F_h = 255.4 H T^2 -490.4 T^2 $$
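
As a quick sanity check, the two numerical coefficients can be approximately reproduced from the constants above. This is only a sketch: the values `rho = 1000` kg/m³ and `g = 9.81` m/s² are assumptions (the text does not state them), and the helper name `horizontal_force` is hypothetical.

```python
import math

# Constants given in the text
A1, A2 = -0.016, 0.025
A_c = 3.0   # elevation of the frontal berm (m)
C_h = 2.0   # crown wall height (m)

# Assumed values, not stated in the text
rho = 1000.0  # density of water (kg/m^3)
g = 9.81      # gravitational acceleration (m/s^2)

# F_h = (A1 + A2*H/A_c) * rho*g*C_h * g*T^2/(2*pi) = (c_HT*H + c_T) * T^2
common = rho * g * C_h * g / (2 * math.pi)
c_HT = A2 / A_c * common   # coefficient of H*T^2
c_T = A1 * common          # coefficient of T^2

print(f"c_HT = {c_HT:.1f}, c_T = {c_T:.1f}")


def horizontal_force(H, T):
    """Reduced expression for F_h, using the coefficients quoted in the text."""
    return 255.4 * H * T**2 - 490.4 * T**2
```

Under these assumptions the coefficients agree with the quoted ones to within rounding. Note that the expression is negative whenever $H < -A_1 A_c / A_2 = 1.92$ m, which is worth keeping in mind when you interpret the distribution of $F_h$.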

The goals of this project are to:

  1. Choose a reasonable distribution function for $H$ and $T$.
  2. Fit the chosen distributions to the observations of $H$ and $T$.
  3. Assuming $H$ and $T$ are independent, propagate their distributions to obtain the distribution of $F_h$.
  4. Analyze the distribution of $F_h$.

Importing packages¶

In [ ]:
import numpy as np
import matplotlib.pyplot as plt

from scipy import stats 
from math import ceil, trunc

plt.rcParams.update({'font.size': 14})

1. Explore the data¶

The first step in the analysis is to explore the data, both visually and through its statistics.

In [ ]:
# Import
_, H, T = np.genfromtxt('dataset_HT.csv', delimiter=",", unpack=True, skip_header=True)

# plot time series
fig, ax = plt.subplots(2, 1, figsize=(10, 7), layout = 'constrained')
ax[0].plot(H,'k')
ax[0].set_xlabel('Time')
ax[0].set_ylabel('Wave height, H (m)')
ax[0].grid()

ax[1].plot(T,'k')
ax[1].set_xlabel('Time')
ax[1].set_ylabel('Wave period, T (s)')
ax[1].grid()
In [ ]:
# Statistics for H

print(stats.describe(H))
In [ ]:
# Statistics for T

print(stats.describe(T))

Task 1:

Describe the data based on the previous statistics:

  • Which variable presents a higher variability?
  • What does the skewness coefficient mean? Which kinds of distribution functions should we consider for fitting them?
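
To compare variability between variables with different units and magnitudes, one option is the coefficient of variation (standard deviation divided by the mean). The sketch below illustrates this on synthetic samples, not on the buoy data. The helper name `summary` and the two sample distributions are assumptions made for illustration only.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# Synthetic stand-ins (NOT the buoy data): one right-skewed, one symmetric sample
x = rng.exponential(scale=2.0, size=5000)
y = rng.normal(loc=10.0, scale=1.0, size=5000)


def summary(sample):
    """Return (coefficient of variation, skewness) of a sample."""
    res = stats.describe(sample)
    cv = np.sqrt(res.variance) / res.mean
    return cv, res.skewness


cv_x, skew_x = summary(x)
cv_y, skew_y = summary(y)
print(f"x: CV = {cv_x:.2f}, skewness = {skew_x:.2f}")
print(f"y: CV = {cv_y:.2f}, skewness = {skew_y:.2f}")
```

A clearly positive skewness indicates a longer right tail, while a value near zero indicates a roughly symmetric sample.
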

2. Empirical distribution functions¶

    Now, we are going to compute and plot the empirical PDF and CDF for each variable. Note that you have the pseudo-code for the empirical CDF in the reader.

    Task 2:

    Define a function to compute the empirical CDF. Plot your empirical PDF and CDF.
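
As a starting point, here is a minimal sketch of one common way to build an empirical CDF. It assumes the plotting position $i/(n+1)$; the pseudo-code in the reader may use a different convention, so check it. The name `ecdf_sketch` and the toy sample are hypothetical.

```python
import numpy as np


def ecdf_sketch(sample):
    """Return sorted values and their non-exceedance probabilities i/(n+1)."""
    x = np.sort(np.asarray(sample, dtype=float))
    n = x.size
    p = np.arange(1, n + 1) / (n + 1)
    return x, p


# Toy sample, for illustration only
x, p = ecdf_sketch([3.0, 1.0, 2.0])
print(x, p)

# An empirical PDF can be approximated with a normalized histogram
density, bin_edges = np.histogram([3.0, 1.0, 2.0, 2.5], bins=2, density=True)
```

Plotting `p` against `x` (for example with `plt.step`) gives the empirical CDF, and `plt.hist(..., density=True)` draws the normalized histogram directly.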

    In [ ]:
    def ecdf(YOUR_INPUTS):
        #your code
        return YOUR_OUTPUT
    
    In [ ]:
    # Your plot here
    

    Task 3:

    Based on the results of Task 1 and the empirical PDF and CDF, select one distribution to fit to each variable. For $H$, choose between the Exponential and Gaussian distributions; for $T$, choose between the Uniform and Gumbel distributions.

    3. Fitting a distribution¶

    Task 4:

    Fit the selected distributions to the observations using MLE.

    Hint: Use SciPy's built-in functions (watch out for how the parameters are defined!).
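
To illustrate the point about parameter definitions, the sketch below fits two distributions to synthetic samples with the `fit` method of `scipy.stats`, which performs maximum likelihood estimation by default. The samples and their parameters are assumptions for illustration only; they are not the buoy data and do not suggest which distribution to choose.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic stand-ins (NOT the buoy data)
gumbel_sample = rng.gumbel(loc=5.0, scale=1.5, size=2000)
exp_sample = rng.exponential(scale=2.0, size=2000)

# For gumbel_r, fit returns (loc, scale)
loc_hat, scale_hat = stats.gumbel_r.fit(gumbel_sample)

# For expon, SciPy uses loc and scale, where scale = 1/lambda.
# By default loc is also estimated; floc=0 fixes it at zero.
exp_loc, exp_scale = stats.expon.fit(exp_sample)
_, exp_scale_fixed = stats.expon.fit(exp_sample, floc=0)
rate_hat = 1.0 / exp_scale_fixed

print(loc_hat, scale_hat)
print(exp_loc, exp_scale, exp_scale_fixed, rate_hat)
```

With `floc=0`, the fitted scale of the exponential equals the sample mean, so the rate parameter is its reciprocal.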

    In [ ]:
    #Your code here
    

    4. Assessing goodness of fit¶

    Task 5:

    Assess the goodness of fit of the selected distribution using:

    • One graphical method: QQ plot or log-scale plot. Choose one.
    • The Kolmogorov-Smirnov test.

    Hint: The Kolmogorov-Smirnov test is implemented in SciPy.
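
The two techniques can be sketched as follows on a synthetic sample. The Gaussian sample and the plotting position $i/(n+1)$ are assumptions for illustration only, so adapt them to your own fitted distribution and data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Synthetic stand-in (NOT the buoy data)
sample = rng.normal(loc=2.0, scale=0.5, size=1000)

# Fit by MLE and freeze the fitted distribution
params = stats.norm.fit(sample)
fitted = stats.norm(*params)

# QQ plot ingredients: empirical quantiles vs. theoretical quantiles
x_sorted = np.sort(sample)
p_emp = np.arange(1, x_sorted.size + 1) / (x_sorted.size + 1)
q_theory = fitted.ppf(p_emp)

# Kolmogorov-Smirnov test against the fitted CDF
ks_stat, p_value = stats.kstest(sample, fitted.cdf)
print(f"KS statistic = {ks_stat:.3f}, p-value = {p_value:.3f}")
```

Plotting `x_sorted` against `q_theory` together with the 1:1 line gives the QQ plot. Keep in mind that the p-value tends to be optimistic when the parameters are estimated from the same sample that is tested.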

    In [ ]:
    #Your code here
    

    Task 6:

    Interpret the results of the GOF techniques. How does the selected parametric distribution perform?

    5. Propagating the uncertainty¶

    Using the fitted distributions, we are going to propagate the uncertainty from $H$ and $T$ to $F_h$ assuming that $H$ and $T$ are independent.

    Task 7:

    1. Draw 10,000 random samples from the fitted distribution functions for $H$ and $T$.

    2. Compute $F_h$ for each pair of samples.

    3. Compute $F_h$ for the observations.

    4. Plot the PDF and the exceedance curve (on a log scale) of $F_h$, computed using both the simulations and the observations.
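
The overall workflow can be sketched as below. The two frozen distributions and their parameters are hypothetical placeholders, not fitted values; replace them with your own results from Task 4. The `_demo` names are chosen so they do not clash with the variables in the cell that follows.

```python
import numpy as np
from scipy import stats

# Hypothetical placeholders for the fitted distributions (NOT fitted values)
dist_H = stats.expon(loc=0.0, scale=1.0)
dist_T = stats.gumbel_r(loc=5.0, scale=1.0)

# Independent random samples from each marginal distribution
n = 10_000
rs_H_demo = dist_H.rvs(size=n, random_state=1)
rs_T_demo = dist_T.rvs(size=n, random_state=2)

# Propagate through the reduced equation for F_h
rs_Fh_demo = 255.4 * rs_H_demo * rs_T_demo**2 - 490.4 * rs_T_demo**2

# Empirical exceedance probabilities, 1 - i/(n+1), for the sorted values
Fh_sorted = np.sort(rs_Fh_demo)
exceedance = 1.0 - np.arange(1, n + 1) / (n + 1)
```

Plotting `exceedance` against `Fh_sorted` with a logarithmic vertical axis (for example `ax.set_yscale('log')`) gives the exceedance curve. The same two steps applied to the observed $H$ and $T$ give the curve for the observations.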

    In [ ]:
    # Use the distributions fitted in Task 4
    
    # Draw random samples
    rs_H = None  # your code here
    rs_T = None  # your code here
    
    # Compute Fh
    rs_Fh = None  # your code here
    
    # Repeat for observations
    Fh = None  # your code here
    
    # Plot the PDF and the exceedance curve
    

    Task 8:

    Interpret the figures above, answering the following questions:

    • Are there differences between the two computed distributions for $F_h$?
    • What are the advantages and disadvantages of using the simulations?

    If you run the code in the cell below, you will obtain a scatter plot of both variables. Explore the relationship between the two variables and answer the following questions:

    Task 9:

    1. Observe the plot below. What differences do you observe between the generated samples and the observations?

    2. Compute the correlation between $H$ and $T$ for the samples and for the observations. Are there differences?

    3. What could be improved in the previous analysis? Do you have any ideas or suggestions on how to implement those improvements?
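
Computing a correlation coefficient can be sketched as follows with `scipy.stats.pearsonr`. Both pairs of samples are synthetic and constructed for illustration only (one pair is dependent by design, the other independent); they say nothing about the actual buoy data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# Synthetic pair that is dependent by construction (NOT the buoy data)
h_dep = rng.exponential(1.0, 2000)
t_dep = 4.0 + 1.5 * h_dep + rng.normal(0.0, 0.5, 2000)

# Synthetic pair drawn independently, as in the propagation step
h_ind = rng.exponential(1.0, 2000)
t_ind = rng.normal(5.5, 1.6, 2000)

# Pearson correlation coefficient (the second value is the p-value)
r_dep, _ = stats.pearsonr(h_dep, t_dep)
r_ind, _ = stats.pearsonr(h_ind, t_ind)
print(f"dependent pair: r = {r_dep:.2f}, independent pair: r = {r_ind:.2f}")
```

Independently drawn samples give a coefficient close to zero regardless of how the original observations are related, which is exactly what the independence assumption implies.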

    In [ ]:
    fig, axes = plt.subplots(1, 1, figsize=(7, 7))
    axes.scatter(rs_H, rs_T, s=40, c='k', label='Simulations')
    axes.scatter(H, T, s=40, c='r', marker='x', label='Observations')
    axes.set_xlabel('Wave height, H (m)')
    axes.set_ylabel('Wave period, T (s)')
    axes.legend()
    axes.grid()
    
    In [ ]:
    #Correlation coefficient calculation here
    

    End of notebook.


    © Copyright 2023 MUDE Teaching Team TU Delft. This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.