Two main approaches to functional arrays may be discerned: incremental and monolithic definition. In the incremental case, we have a function that produces an empty array of a given size and another that takes an array, an index, and a value, producing a new array that differs from the old one only at the given index. Obviously, a naive implementation of such an array semantics would be intolerably inefficient, either requiring a new copy of an array for each incremental redefinition, or taking linear time for array lookup; thus, serious attempts at using this approach employ sophisticated static analysis and clever run-time devices to avoid excessive copying. The monolithic approach, on the other hand, constructs an array all at once, without reference to intermediate array values. Although Haskell has an incremental array update operator, the main thrust of the array facility is monolithic.
Arrays are not part of the Standard Prelude---the standard library contains the array operators. Any module using arrays must import the Array module.
The range operation takes a bounds pair and produces the list of indices lying between those bounds, in index order. For example,
range (0,4) => [0,1,2,3,4]
range ((0,0),(1,2)) => [(0,0), (0,1), (0,2), (1,0), (1,1), (1,2)]
The inRange predicate determines whether an index lies between a given pair of bounds. (For a tuple type, this test is performed component-wise.) Finally, the index operation allows a particular element of an array to be addressed: given a bounds pair and an in-range index, the operation yields the zero-origin ordinal of the index within the range; for example:
index (1,9) 2 => 1
index ((0,0),(1,2)) (1,1) => 4
Array subscripting is performed with the infix operator !, and the bounds of an array can be extracted with the function bounds:
squares!7 => 49
bounds squares => (1,100)
We might generalize this example by parameterizing the bounds and the
function to be applied to each index:
mkArray :: (Ix a) => (a -> b) -> (a,a) -> Array a b
mkArray f bnds = array bnds [(i, f i) | i <- range bnds]
Thus,
we could define squares as
mkArray (\i -> i * i) (1,100).
Many arrays are defined recursively; that is, with the values of some
elements depending on the values of others. Here, for example, we have a
function returning an array of Fibonacci numbers:
fibs :: Int -> Array Int Int
fibs n = a where a = array (0,n) ([(0, 1), (1, 1)] ++
[(i, a!(i-2) + a!(i-1)) | i <- [2..n]])
Another
example of such a recurrence is the n by n wavefront
matrix, in which elements of the first row and first column all have the
value 1 and other elements are sums of their neighbors to the west,
northwest, and north:
wavefront :: Int -> Array (Int,Int) Int
wavefront n = a where
a = array ((1,1),(n,n))
([((1,j), 1) | j <- [1..n]] ++
[((i,1), 1) | i <- [2..n]] ++
[((i,j), a!(i,j-1) + a!(i-1,j-1) + a!(i-1,j))
| i <- [2..n], j <- [2..n]])
The
wavefront matrix is so called because in a parallel implementation, the
recurrence dictates that the computation can begin with the first row and column
in parallel and proceed as a wedge-shaped wave, traveling from northwest to
southeast. It is important to note, however, that no order of computation is
specified by the association list.
In each of our examples so far, we have given a unique association for each index of the array and only for the indices within the bounds of the array, and indeed, we must do this in general for an array be fully defined. An association with an out-of-bounds index results in an error; if an index is missing or appears more than once, however, there is no immediate error, but the value of the array at that index is then undefined, so that subscripting the array with such an index yields an error.
We complete our introduction to Haskell arrays with the familiar example of
matrix multiplication, taking advantage of overloading to define a fairly
general function. Since only multiplication and addition on the element type of
the matrices is involved, we get a function that multiplies matrices of any
numeric type unless we try hard not to. Additionally, if we are careful to apply
only (!) and the operations of Ix to indices, we get
genericity over index types, and in fact, the four row and column index types
need not all be the same. For simplicity, however, we require that the left
column indices and right row indices be of the same type, and moreover, that the
bounds be equal:
matMult :: (Ix a, Ix b, Ix c, Num d) =>
Array (a,b) d -> Array (b,c) d -> Array (a,c) d
matMult x y = array resultBounds
[((i,j), sum [x!(i,k) * y!(k,j) | k <- range (lj,uj)])
| i <- range (li,ui),
j <- range (lj',uj') ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "matMult: incompatible bounds"
As
an aside, we can also define matMult using accumArray,
resulting in a presentation that more closely resembles the usual formulation in
an imperative language:
matMult x y = accumArray (+) 0 resultBounds
[((i,j), x!(i,k) * y!(k,j))
| i <- range (li,ui),
j <- range (lj',uj')
k <- range (lj,uj) ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "matMult: incompatible bounds"
We can generalize further by making the function higher-order, simply
replacing sum and (*) by functional parameters:
genMatMult :: (Ix a, Ix b, Ix c) =>
([f] -> g) -> (d -> e -> f) ->
Array (a,b) d -> Array (b,c) e -> Array (a,c) g
genMatMult sum' star x y =
array resultBounds
[((i,j), sum' [x!(i,k) `star` y!(k,j) | k <- range (lj,uj)])
| i <- range (li,ui),
j <- range (lj',uj') ]
where ((li,lj),(ui,uj)) = bounds x
((li',lj'),(ui',uj')) = bounds y
resultBounds
| (lj,uj)==(li',ui') = ((li,lj'),(ui,uj'))
| otherwise = error "matMult: incompatible bounds"
APL
fans will recognize the usefulness of functions like the following:
genMatMult maximum (-)
genMatMult and (==)
With
the first of these, the arguments are numeric matrices, and the (i,j)-th
element of the result is the maximum difference between corresponding elements
of the i-th row and j-th column of the inputs. In the second case,
the arguments are matrices of any equality type, and the result is a Boolean
matrix in which element (i,j) is True if and only if the
i-th row of the first argument and j-th column of the second are
equal as vectors.
Notice that the element types of genMatMult need not be the same, but merely appropriate for the function parameter star. We could generalize still further by dropping the requirement that the first column index and second row index types be the same; clearly, two matrices could be considered conformable as long as the lengths of the columns of the first and the rows of the second are equal. The reader may wish to derive this still more general version. (Hint: Use the index operation to determine the lengths.)