a tutorialMaking two garbage collectors be good neighbours

(using linear types)Parallelising your array code

27 September 2017

|*This is the third post in a series about array programming in Haskell — you might be interested in the first and second, too.*

In the previous post of this series, we discussed commonly used vector and matrix routines, which are available in highly-optimised implementations in most programming languages. However, often we need to implement our own custom array algorithms. To this end, the Haskell standard (Haskell 2010) already comes with a simple array API in the form of the `Data.Array`

standard module. These arrays are *immutable*, *boxed*, and *non-strict*. This allows for the elegant, high-level description of many array algorithms, but it is suboptimal for compute-intensive applications as boxing and non-strictness, especially in combination with the reliance on association lists for array construction, lead to code that is very difficult for the Haskell compiler to optimise, resulting in rather limited performance.

The extra expressiveness granted by non-strictness is nicely displayed by the following example, courtesy of the classic A Gentle Introduction to Haskell. The `wavefront`

function defines an *n*x*n* array whose leftmost column and topmost row are populated with 1s (by the first two list comprehensions below). All other array elements are defined as the sum of the three elements to their left, top, and top-left (by the third list comprehension in the definition).

```
wavefront :: Int -> Array (Int,Int) Int
wavefront n = arr
where
arr = array ((1,1),(n,n))
([((1,j), 1) | j <- [1..n]] ++
[((i,1), 1) | i <- [2..n]] ++
[((i,j), arr!(i,j-1) + arr!(i-1,j-1) + arr!(i-1,j))
| i <- [2..n], j <- [2..n]])
```

The `!`

infix operator implements array indexing:

```
(!) :: Ix i => Array i e -> i -> e
```

Moreover, the function `array`

constructs an array from an association list mapping indexes to values:

```
array :: Ix i => (i, i) -> [(i, e)] -> Array i e
```

The elegance of `wavefront`

is in the recursive definition of the array `arr`

. In the expression `arr!(i,j-1) + arr!(i-1,j-1) + arr!(i-1,j)`

, we access the elements to the left, top, and top-left of the current one by appropriate indexing of the very array that we are currently in the process of defining. Such a recursive dependency is only valid for a non-strict data structure.

Unfortunately, the expressiveness of non-strict arrays comes at a price, especially if the array elements are simple numbers. Instead of being able to store those numeric elements in-place in the array, non-strict arrays require a *boxed* representation, where the elements are pointers to heap objects containing the numeric values. This additional indirection requires extra memory and drastically reduces the efficiency of array access, especially in tight loops. The layout difference between an unboxed (left) and a boxed (right) representation is illustrated below.

While both strict and non-strict data structures admit boxed representations, non-strict structures typically require boxing. To provide an alternative to the standard non-strict arrays, the `array`

package provides strict, unboxed arrays of type `Data.Array.Unboxed.UArray i e`

. By way of overloading via the type class `Data.Array.IArray.IArray`

, they provide the same API as the standard non-strict, boxed arrays. However, the element type is restricted to basic types that can be stored unboxed, such as integral and floating-point numeric types.

Unfortunately, array construction based on association lists, such as the `array`

function, still severely limits the performance of immutable `UArray`

s. Nevertheless, at least, array access by way of `(!)`

is efficient for unboxed arrays.

While immutable arrays —i.e., arrays that cannot directly be in-place updated— are semantically simpler, it turns out that indexed-based array construction is drastically more efficient for mutable arrays. Hence, computationally demanding Haskell array code typically adopts a two-phase array life cycle: (1) arrays are allocated as mutable arrays and initialised using in-place array update; once initialised, (2) they are *frozen* by making them immutable.

This usage pattern is supported by the interface provided by the `Data.Array.MArray.MArray`

class and we use it in the following example function `generate`

to initialise an array of `Double`

s with an index-based generator function `gen`

:

```
generate :: Ix i => (i, i) -> (i -> Double) -> UArray i Double
generate bnds gen
= runSTUArray $ do
arr <- newArray_ bnds
mapM_ (\i -> writeArray arr i (gen i)) (range bnds)
return arr
```

Mutable arrays come in various flavours that are, in particularly, distinguished by the monad in which the array operations take place. Usually, this is either `IO`

or `ST`

, and the `array`

package provides both boxed and unboxed variants for both monads. We have the boxed `IOArray`

and the unboxed `IOUArray`

as well as the boxed `STArray`

and the unboxed `STUArray`

. The above definition of `generate`

uses `STUArray`

to initialise the array, and then, freezes it into a `UArray`

, which is returned. The choice of `STUArray`

is implicit in the use of `runSTUArray`

, which executes the code in the state transformer monad `ST`

and freezes the `STUArray`

into a `UArray`

.

The function `newArray_`

provides a fresh *uninitialised* array:

```
newArray_ :: (MArray a e m, Ix i) => (i, i) -> m (a i e)
```

We can read and write a mutable array with

```
readArray :: (MArray a e m, Ix i) => a i e -> i -> m e
writeArray :: (MArray a e m, Ix i) => a i e -> i -> e -> m ()
```

In the definition of `generate`

, we use `writeArray`

once on each index in the range `range bnds`

of the mutable array to initialise the value at index `i`

with the value of the generator function at that index, `gen i`

.

Generally, we can freeze a mutable array, obtaining an immutable array, with

```
freeze :: (Ix i, MArray a e m, IArray b e) => a i e -> m (b i e)
```

There is also the unsafe variant `unsafeFreeze`

that avoids copying the array during freezing, but puts the onus on the programmer to ensure that the mutable argument is subsequently not updated anymore. In the code for `generate`

above, we indirectly use `unsafeFreeze`

by way of `runSTUArray`

. As `runSTUArray`

makes it impossible to use the mutable array after freezing, this encapsulated use of `unsafeFreeze`

is always safe.

An expression, such as, `generate (1,100) ((^2) . fromIntegral)`

produces an unboxed array with the first one hundred square numbers. Internally, it is based on the initialisation of a mutable array, but this is completely abstracted over by the definition of `generate`

. While there is no inbuilt need to ever freeze a mutable array, good functional programming style requires to keep mutable code as localised as possible and to avoid passing mutable data structures around.

Well written code based on unboxed arrays and using the discussed pattern to create arrays by initialising a mutable version, which is subsequently frozen, can achieve performance comparable to low-level C code. In fact, the collection-oriented high-performance array frameworks that we will discuss in subsequent blog posts work exactly in this manner.