Tweag

Enter the matrix, Haskell style

31 August 2017 — by Manuel M T Chakravarty

This is the second post in a series about array programming in Haskell — you might be interested in the previous post, too.

Matrices are the bread and butter of most scientific and numeric computing. It is not surprising then that there is a range of standard libraries and interfaces, first and foremost BLAS (Basic Linear Algebra Subprograms) and LAPACK — Linear Algebra PACKage, which have been around since the FORTRAN days, and more recently also the GSL (GNU Scientific Library).

In Haskell, hmatrix provides a uniform interface to much of the functionality of these three libraries. The functionality is split over four packages, hmatrix (linear algebra), hmatrix-gsl (common numeric computations), hmatrix-gsl-stats(GSL statistics), and hmatrix-special (the ”special” functions of GSL). Due to the popularity of hmatrix, there exists a whole ecosystem of packages on Hackage that either build on hmatrix , provide bindings to other standard C libraries by extending the hmatrix interface, and implement adaptors to interoperate with other array libraries (such as hmatrix-repa).

Vectors and matrices

At the core of hmatrix are the Vector and Matrix data types,

data Vector e
data Matrix e

The Vector type is based on the Storable instances of the widely used vector package and Matrix adopts a matrix representation that enables the efficient use of BLAS routines. Values of both types can be easily created from lists and come with the expected basic vector and matrix functions. For example, we have

hmatrix> vector [1,2,3] * vector [3,0,-2]
[3.0,0.0,-6.0]

Although, the result is displayed in list syntax, it is indeed a Vector.

hmatrix> :t vector [1,2,3] * vector [3,0,-2]
vector [1,2,3] * vector [3,0,-2] :: Vector R

The element type R is one of several type synonyms used in some of the hmatrix interface:

type R = Double
type C = Complex Double
type I = CInt
type Z = Int64

Similarly, for matrices, we have

hmatrix> matrix 3 [1..9] * ident 3
(3><3)
 [ 1.0, 0.0, 0.0
 , 0.0, 5.0, 0.0
 , 0.0, 0.0, 9.0 ]
hmatrix> :t matrix 3 [1..9] * ident 3
matrix 3 [1..9] * ident 3 :: Matrix R

where matrix 3 turns a list into a matrix with 3 columns and ident produces the identity matrix of a given size.

Generalised matrices

In addition to the dense Vector and Matrix types, hmatrix also provides a ”general” matrix type GMatrix that provides optimised representations for dense, sparse, diagonal, banded, and constant matrices of Doubles (aka R). For example, the following sparse 2x2000 matrix with two non-zero elements is represented as follows in the compressed sparse row (CSR) format:

hmatrix> mkSparse [((0,999),1.0),((1,1999),2.0)]
SparseR
{ gmCSR = CSR
          { csrVals = [1.0,2.0]
          , csrCols = [1000,2000]
          , csrRows = [1,2,3]
          , csrNRows = 2
          , csrNCols = 2000
          }
, nRows = 2
, nCols = 2000
}

The package includes an implementation of the conjugate gradient method for sparse linear systems.

Functionality

In addition to the basic operations on vectors and matrices, the core hmatrix package provides solvers for linear systems, computes inverses, determinants, singular value decomposition, eigenvalues & eigenvectors, QR, Cholesky & LU factorisation, and some other common matrix operations.

On top of that hmatrix-gsl covers integration, differentiation, FFT, solving general polynomial equations, minimization of a multidimensional functions, multidimensional root finding, ordinary differential equations, nonlinear least-squares fitting, and interpolation routines. Moreover, hmatrix-gsl-stats includes random distribution functions, linear regression functions, histograms, permutations, and common statistics functions (mean, variance, standard deviation, and so on).

Finally, hmatrix-special provides Airy, Bessel, Clausen, Coulomb wave, coupling coefficient, Dawson, Debye, dilogarithm, elliptic integral, Jacobian elliptic, Fermi-Dirac integral, Gegenbauer, hypergeometric, Laguerre, Lambert W, synchrotron, digamma, trigamma, and transport, Riemann zeta functions as well as Gamma distributions, Legendre polynomials, and common trigonometric and exponential functions.

An example: minima of arbitrary multidimensional functions

For example, to find the minimum of a function

f [x,y] = 10*(x-1)^2 + 20*(y-2)^2 + 30

without providing a gradient, we can define

minimizeS :: ([Double] -> Double) -> [Double] -> ([Double], Matrix Double)
minimizeS f xi
  = minimize NMSimplex2 1E-2 100 (replicate (length xi) 1) f xi

using the minimize function of hmatrix-gsl’s Numeric.GSL.Minimization. It provides us with the minimum as well as the path taken by the algorithm to reach that solution. Using gnuplot by way of hmatrix’ undocumented Graphics.Plot interface, to visualise the path, we get

Minimise Plot

For the full example code, see minimize.hs.

Type safety

A common mistake in array programming is to apply an operation to one or more matrices whose dimensions violate a precondition of the operation. For example, for matrix multiplication to be well-defined for two matrices a and b, we require a to be ixj if b is jxk; i.e., the number of a’s columns needs to coincide with the number of b’s rows. Like most array libraries in Haskell, we cannot express such a constraint in hmatrix’ standard interface, where the type of a matrix is independent of the size of its dimensions.

The static interface of matrix provides some of hmatrix’ functionality with an alternative API, based on GHC’s type literals extension, which allows matrix size constraints to be expressed. We will discuss this interface in more detail in a future blog post.

Context

If you are coming from Python, then hmatrix will be the closest to numpy and scipy that you will find in the Haskell ecosystem. The packages in both languages are realised by wrapping highly optimised standard routines written in low-level languages (such as BLAS and LAPACK) as libraries. However, the functionality they provide is not directly comparable as, despite a strong overlap, both provide functionality that is absent in the other. While numpy and scipy shine with maturity and very widespread use, Haskell offers increased safety and the potential to provide additional high-performance functionality and fusion of computational kernels in Haskell itself (i.e., without the need to drop down to C or C++) by combining hmatrix with some of the array libraries that we will discuss in the forthcoming posts in this series.

About the authors
Manuel M T Chakravarty

If you enjoyed this article, you might be interested in joining the Tweag team.

This article is licensed under a Creative Commons Attribution 4.0 International license.

Company

AboutOpen SourceCareersContact Us

Connect with us

© 2024 Modus Create, LLC

Privacy PolicySitemap