Sparse matrices with scipy.sparse

2.2. Sparse matrices with scipy.sparse#

Matrices for PDE simulations are usually large, up to the level that computers may run out of memory when trying to store the matrices for very large problems. For a problem with one million degrees of freedom, the matrix contains \(10^6\times10^6=10^{12}\) numbers. Storing \(10^{12}\) floating point numbers with 8 bytes each would blow up the memory requirement to 8 TB. However, these large matrices typically contain mostly zeros. For a large finite element problem, the number of nonzero values in one million by one million matrix is typically of the order \(10^7\) to \(10^8\). No matter where you are in that range, practically all of the 8 TB would be dedicated to storing zeros.

Fortunately, there are more efficient ways of storing large matrices with many zeros, or sparse matrices and these are an essential ingredient of PDE simulation software. In fact, sparse matrix storage is not only beneficial for memory requirements but also allows for much more efficient algorithms for linear algebra operations with those matrices. Again thinking of the \(10^6\times10^6\) matrix, performing a matrix vector multiplication with a vector of length \(10^6\) requires \(10^{12}\) multiplications. Quite some work, also for a fast computer, and a huge waste of effort if the matrix is sparse because the multiplications with zero anyway do not contribute to the result. Of course it is algorithmically a bit more complicated to store only the non-zero values of a matrix (to have all the information about the matrix you also need to know the position of the nonzero values!) and to perform linear algebra operations, but efficient libraries are available that do the heavy lifting for you. In this notebook, we will explore scipy.sparse.

Storage types: COO vs CSR and CSC#

There are different ways to store the data in a sparse matrix. The most intuitive one is the coordinate list or COO storage. This one is convenient for constructing matrices, but it is not the most efficient one for matrix operations. The COO format stores for every non-zero entry the row-index, the column-index and the value, i.e. 2 integers and 1 floating point number. By convention, the COO format allows for duplicate entries, that are meant to be summed together.

A more optimized alternative is the CSR (Compressed Sparse Row) format. In this format, non-zero values are stored in a vector after they have been grouped by row, another vector of equal length stores corresponding column indices, while a third vector stores pointers to the first non-zero entry of each row. The size of the third vector is therefore equal to the number of rows rather than to the number of non-zero entries. This generally offers a small reduction in required memory, but the main gain from this format is that it is very efficient to extract data from the matrix row-by-row, which is precisely what is needed for efficient matrix-vector product evaluation. A similar format is the CSC (Compressed Sparse Column) format which stores row indices and column pointers instead.

In the example below, we use the COO and CSR formats to store a matrix similar to the one that was used in the page on assembly and numpy._ix. Note that this is not a very representative example of a sparse matrix, it contains very few zeros, but here we are just illustrating how to use scipy.sparse and what data is stored.

import numpy as np
import scipy.sparse as sp

# Connectivity matrix
elements = np.array([ 
    [0, 1, 4], 
    [1, 2, 4], 
    [2, 3, 4], 
    ])

# Element matrix (simplified)
K_elem = np.array([
        [2.0, -1.0, -1.0],
        [-1.0, 2.0, -1.0],
        [-1.0, -1.0, 2.0]
    ])

n_nodes = np.max(elements)+1

# Create empty lists
row_indices = []
col_indices = []
data = []

for elem_nodes in elements:
    for i in range(3):
        for j in range(3):
            row_indices.append(elem_nodes[i])
            col_indices.append(elem_nodes[j])
            data.append(K_elem[i, j])

# Create sparse matrix using COO or CSR format
K_coo = sp.coo_array((data, (row_indices, col_indices)), 
                        shape=(n_nodes, n_nodes))
K_csr = sp.csr_array((data, (row_indices, col_indices)), 
                        shape=(n_nodes, n_nodes))

print("Dense matrix representation:")
print(K_coo.todense())

print(f"\n   size of COO matrix: {K_coo.size}\n   size of CSR matrix: {K_csr.size}")

print("\nData in the COO matrix contains row indices, column indices, and values:")
print(K_coo.row)
print(K_coo.col)
print(K_coo.data)

print("\nData in the CSR matrix contains column indices, row pointers, and values:")
print(K_csr.indices)
print(K_csr.indptr)
print(K_csr.data)

Exercise

Run the cell above to inspect the size of the COO and CSR matrices (i.e. how many different values are stored for the matrix). There are 27 values in the COO matrix and 19 values in the CSR matrix. What is the origin of these numbers (why 27, why 19)?

Solution

The size of the COO matrix is equal to the number values that have been defined for the matrix. We have performed assembly over three element, each time adding a \(3\times3\) matrix to the global one. That is in total \(3\cdot9=27\) values. Note that this is larger than the number of values (including zeros) in the dense matrix. This is because the matrix of this example does not have many zeros and duplicate entries are not summed in the COO format.

The size of the CSR matrix is equal to the number of nonzero entries in the matrix. The \(5\times5\) matrix has 6 zeros, so there are 25-6=19 nonzero entries.

Row manipulation with the CSR matrix#

When dealing with sparse matrix, you have to be careful not to accidentally make them denser. Generally, values that are set to 0 are not automatically removed. For example, indexing an entire row with [irow, :] is supported but may modify the sparsity.

By merit of how the data is stored in the CSR format, the csr_array allows for an efficient way for overwriting the data in a row. This is convenient for applying boundary conditions with the finite element method. The csc_array which stores the data by column is efficient for accessing data by column. Yet another format, the lil_array (from List Of Lists) does support intuitive slicing and even automatically updates the sparsity if values are set to zero. Temporarily converting to lil_array can be beneficial, even if you need to convert back to csr_array for further operations.

In the code below, a 5-by-5 identity matrix is constructed and then row 2 is set entirely to 0 in three different ways. Although the resulting matrix is mathematically equivalent, the efficiency of the storage is different.

I = np.eye(5)
I_csr = sp.csr_array(I)

I_csr_mod = I_csr.copy()
I_csr_mod[2, :] = 0.0   

I_csr_alt = I_csr.copy()
start, end = I_csr_alt.indptr[2], I_csr_alt.indptr[3]   # beginning and end of row 2
I_csr_alt.indices[start:end] = 0.0

I_lil = I_csr.copy().tolil()
I_lil[2, :] = 0.0   
I_csr_lil = I_lil.tocsr()

print('Size of I_csr:', I_csr.size)
print('Size of I_csr_mod:', I_csr_mod.size)
print('Size of I_csr_alt:', I_csr_alt.size)
print('Size of I_csr_lil:', I_csr_lil.size)

The code cell above performs the same operation in three different ways. To show what is the difference here, we visualize the matrices below, making the distinction between an explicit 0 (shown as \(0\)) and a default 0 (shown as an empty entry in the matrix).

In the first one, I_csr_mod, the entire second row is assigned explicit zero values:

\[\begin{split} \mathbf{I_\mathrm{csr,mod}} = \begin{bmatrix} 1 & & & & \\ & 1 & & & \\ 0 & 0 & 0 & 0 & 0 \\ & & & 1 & \\ & & & & 1 \end{bmatrix} \end{split}\]

In the second one, I_csr_alt, only the existing value in the second row is assigned as zero:

\[\begin{split} \mathbf{I_\mathrm{csr,alt}} = \begin{bmatrix} 1 & & & & \\ & 1 & & & \\ & & 0 & & \\ & & & 1 & \\ & & & & 1 \end{bmatrix} \end{split}\]

In the third one, I_csr_lil, in the code the entire row is zeroed, but here zero values, even the formerly existing entry, are removed from storage

\[\begin{split} \mathbf{I_\mathrm{csr,lil}} = \begin{bmatrix} 1 & & & & \\ & 1 & & & \\ & & & & \\ & & & 1 & \\ & & & & 1 \end{bmatrix} \end{split}\]

Of these three, the first should be avoided, even though for this variant the code looks most clean. The second and third can both be optimal, where it depends whether the user wants to optimize for memory requirements or for runtime, converting to LIL leads to the smallest matrix, but comes with some overhead.

Attribution

This chapter is written by Frans van der Meer. Find out more here.

Sparse matrices with scipy.sparse

Contents

2.2. Sparse matrices with scipy.sparse#

Storage types: COO vs CSR and CSC#

Row manipulation with the CSR matrix#