scipy.stats
and 3D plots#
Part 1: Introduction and set up of the model#
In this programming assignment, you will work in creating plots in 3D and the projections of those plots in 2D using contours. These plotting tools are useful, for instance, when displaying the results of bivariate distribution.
Imagine you are a scientist interested in the atmospheric processes that lead to precipitation. You have been studying the relationship between the rain rate (\(RR\), mm/h) and the ice content in the clouds (\(IC\), kg/m2). You have determined that the bivariate distribution \(F(RR, IC)\) is well described using a bivariate Gaussian distribution with the following parameters: \(\mu_{RR} = 10\) mm/h and \(\sigma_{RR}=2\) mm/h, \(\mu_{IC} = 700\) kg/m2 and \(\sigma_{IC}=300\) kg/m2, and \(Cov(RR, IC)=300 \ mm/h \times kg/m2\).
import pandas as pd
import numpy as np
from scipy.stats import multivariate_normal
import matplotlib.pyplot as plt
\(\text{Task 3.1:}\)
Define a bivariate Gaussian distribution between \(RR\) and \(IC\) given the above information. Draw 100 random samples from the defined distribution and plot them.
Follow the following steps:
Define the vector of means
Define the covariance matrix
Define the bivariate Gaussian distribution with
scipy.stats.multivariate_normal
and draw 100 samples using thervs
method
# Define the vector of means
mu1 = ### YOUR CODE HERE ###
mu2 = ### YOUR CODE HERE ###
mu = [mu1, mu2] #vector of means
# Define the covariance matrix
s1 = ### YOUR CODE HERE ###
s2 = ### YOUR CODE HERE ###
covariance = ### YOUR CODE HERE ###
sigma = ### YOUR CODE HERE ###
# Draw 100 samples from a bivariate Gaussian distribution
samples = multivariate_normal(mean=mu, cov=sigma).rvs(size=100)
# Scatter plot against observations
### YOUR CODE LINES HERE
Part 2. 3D plots#
In order to create 3D plots, we usually need to evaluate a function over a 2D grid. To do so, we can use the function np.meshgrid
which takes two 1D arrays of values (usually representing axes x and y) and turns them into 2D grids, so you can evaluate functions over a grid of (x, y) points.
Let’s see it with an example!
Imagine that I want to calculate \(Z\) as function of \(A\) and \(B\), \(Z = A^2+B\), with uniformly distributed values of \(A\) and \(B\) and plot it spatially. Thus, \(A\) and \(B\) are the axis x and y, while \(Z\) is the z-axis. We need to generate a grid with values of \(A\) and \(B\) to evaluate \(Z\). We can do it as follows:
# Define the range of values for A and B
n = 5 #number of values or size of the mesh
values_A = np.linspace(0,30,n)
values_B = np.linspace(-5, 5,n)
# Define the grid
A2,B2 = np.meshgrid(values_A,values_B)
print(f"The values to evaluate A are: {values_A}")
print(f"The values to evaluate B are: {values_B}\n")
print(f"The grid defined for A is: {A2}")
print(f"The shape of A2 is: {A2.shape}\n")
print(f"The grid defined for B is: {B2}")
print(f"The shape of B2 is: {B2.shape}")
In the above prints, you can see that the values where we wanted to evaluate A have been repeated 5 times in rows while the values where we wanted to evaluate B have been repeated 5 times in columns to create a grid. This allows to cover all the combinations of values of A and B.
In order to move back from that grid to a 2d array (required for evaluating the function \(Z\)) with all the combinations of \(A\) and \(B\), we can use np.concatenate
to transform both A2 and B2 into a 1D array. Afterwards, it is possible to join them in a single array.
A2_1d = np.concatenate(A2.T)
print(f"The grid defined as a 1D-array for A is: {A2_1d}")
print(f"The shape of A2_1d is: {A2_1d.shape}\n")
B2_1d = np.concatenate(B2.T)
print(f"The grid defined as a 1D-array for B is: {B2_1d}")
print(f"The shape of B2_1d is: {B2_1d.shape}")
A2B2 = np.array([A2_1d, B2_1d]).T
print(f"The grid defined as a 2D-array for A and B is: \n{A2B2}")
print(f"The shape of A2B2 is: {A2B2.shape}")
Now we’ve converted our mesh into a 2D-array, we can evaluate the function \(Z = A^2+B\).
Z = A2B2[:,0]**2 + A2B2[:,1]
print(f"The values of Z evaluated at (A,B) points are: \n {Z}")
print(f"Shape of Z: {Z.shape}\n")
Once we have calculated \(Z\), we need the shape of the original meshgrid again for plotting! Luckily the .reshape
method comes to the rescue:
mesh_plot = Z.reshape(A2.shape).T
print(f"The grid defined as a mesh for plotting is: \n{mesh_plot}")
print(f"The shape of the mesh is: {mesh_plot.shape}")
\(\text{Task 3.3:}\)
Using the defined bivariate Gaussian distribution, but now compute and plot the bivariate PDF (using method pdf
) as a 3D plot where the x- and y-axis are the values of \(RR\) and \(IC\) and the z-axis are the densities.
\(\text{Tip:}\)
Hint 1: Use the function meshgrid of numpy to define a grid to evaluate the PDF.
Hint 2: use plot_surface of matplotlib to display the 3D plot.
# Define the mesh of values where we want to evaluate the random variables
n = 200 #size of the mesh
values_RR = np.linspace(### YOUR CODE HERE ###)
values_IC = np.linspace(### YOUR CODE HERE ###)
# Define the grid
X1,X2 = np.meshgrid(### YOUR CODE HERE ###)
X = ### YOUR CODE HERE ###
# Evaluate the PDF
Z = multivariate_normal(mean=mu, cov=sigma).pdf(X)
# Create the 3D plot
### YOUR CODE LINES HERE
\(\text{Task 3.4:}\)
Using the defined bivariate Gaussian distribution, compute and plot the bivariate PDF as contours where densities are projected. This is, the x- and y-axis are the values of \(RR\) and \(IC\) and the contours in the plot represent values of densities.
\(\text{Tip:}\)
Hint 1: you can reuse the grid defined in the previous task.
Hint 2: use contour of matplotlib to draw the contours.
### YOUR CODE LINES HERE
By Tom van Woudenberg and Patricia Mares Nasarre, Delft University of Technology. CC BY 4.0, more info on the Credits page of Workbook.