or: How I learned to Stop Worrying and Love Data and Control

A Bayesian Neural NetworkOn linear types and

exceptionsTweag Fellowships:

Funding for Open Source ContributorsSafe memory management in inline-java

using linear typesLocating Performance Bottlenecks in

Large Haskell codebases

16 January 2020 |

|Haskell's `Data`

and `Control`

module
hierarchies have always bugged me. They feel arbitrary. There's `Data.Functor`

*and*
`Control.Monad`

—why? Monads are, after all, functors. They should
belong to the same hierarchy!

I'm not that person anymore. Now, I understand that the intuition
behind the Data/Control separation is rooted in a deep
technical justification. But—you rightly insist—monads are
*still* functors! So what's happening here? Well, the truth is that there
*are* two different kinds of functors. But you could never tell them
apart because they coincide in regular Haskell.

But they are different—so let's split them into two kinds: *data* functors and *control*
functors. We can use linear-types
to show why they are different. Let's get started.

If you haven't read about linear types, you may want to check out Tweag's other posts on the topic.
Notwithstanding, here's a quick summary: linear
types introduce a new type `a ⊸ b`

of *linear functions*. A linear
function is a function that, roughly, uses its argument exactly
once.

With that in mind, let's consider a prototypical functor: lists.

```
instance Functor [] where
fmap :: (a -> b) -> [a] -> [b]
fmap f [] = []
fmap f (a:l) = (f a) : (fmap f l)
```

How could we give it a linear type?

- Surely, it's ok to take a linear function as an argument (if
`fmap`

works on any function, it will work on functions which happen to be linear). - The
`f`

function is, on the other hand, not used linearly: it's used once per element of a list (of which there can be many!). So the second arrow must be a regular arrow. - However, we are calling
`f`

on each element of the list exactly once. So it makes sense to make the rightmost arrow linear—exactly once.

So we get the following alternative type for list's `fmap`

:

```
fmap :: (a ⊸ b) -> [a] ⊸ [b]
```

List is a functor because it is a container of data. It is a *data functor*.

```
class Data.Functor f where
fmap :: (a ⊸ b) -> [a] ⊸ [b]
```

Some data functors can be extended to applicatives:

```
class Data.Applicative f where
pure :: a -> f a
(<*>) :: f (a ⊸ b) ⊸ f a ⊸ f b
```

That means that containers of type `f a`

can be zipped together. It
also constrains the type of `pure`

: I typically need more than one
occurrence of my element to make a container that can be
zipped with something else. Therefore `pure`

can't be linear.

As an example, vectors of size 2 are data applicatives:

```
data V2 a = V2 a a
instance Data.Functor f where
fmap f (V2 x y) = V2 (f x) (f y)
instance Data.Applicative f where
pure x = V2 x x
(V2 f g) <*> (V2 x y) = V2 (f x) (g y)
```

Lists would almost work, too, but there is no linear way to zip
together two lists of different sizes. Note: such an instance would correspond to
the `Applicative`

instance of `ZipList`

in `base`

, the `Applicative`

instance for `[]`

, in `base`

is definitely not linear (left as an
exercise to the reader).

The story takes an interesting turn when considering monads. There is only one reasonable type for a linear monadic bind:

```
(>>=) :: m a ⊸ (a ⊸ m b) ⊸ m b
```

Any other choice of linearization and you will either get no linear
values at all (if the continuation is given type `a -> m b`

), or you
can't use linear values anywhere (if the two other arrows are
non-linear). In short: if you want the do-notation to work, you need
monads to have this precise type.

Now, you may remember that, from `(>>=)`

alone, it is possible to
derive `fmap`

:

```
fmap :: (a ⊸ b) ⊸ m a ⊸ m b
fmap f x = x >>= (\a -> return (f a))
```

But wait! Something happened here: all the arrows are linear! We've
just discovered a new kind of functor! Rather than containing data,
we see them as
wrapping a result value with an effect. They are *control functors*.

```
class Control.Functor m where
fmap :: (a ⊸ b) ⊸ m a ⊸ m b
class Control.Applicative m where
pure :: a ⊸ m a -- notice how pure is linear, but Data.pure wasn't
(<*>) :: m (a ⊸ b) ⊸ m a ⊸ m b
class Control.Monad m where
(>>=) :: (>>=) :: m a ⊸ (a ⊸ m b) ⊸ m b
```

Lists are *not* one of these. Why? Because you cannot map over a list with
a single use of the function! (Neither is `Maybe`

because you may drop
the function altogether, which is not permitted either.)

The prototypical example of a control functor is linear `State`

```
newtype State s a = State (s ⊸ (s, a))
instance Control.Functor (State s) where
fmap f (State act) = \s -> on2 f (act s)
where
on2 :: (a ⊸ b) ⊸ (s, a) ⊸ (s, b)
on2 g (s, a) = (s, g b)
```

There you have it. There indeed are two kinds of functors: *data* and *control*.

**Data functors**are containers: they contain many values; some are data applicatives that let you zip containers together.**Control functors**contain a single value and are all about effects; some are monads that the do-notation can chain.

That is all you need to know. Really.

But if you want to delve deeper, follow me to the next section because there is, actually, a solid mathematical foundation behind it all. It involves a branch of category theory called enriched category theory.

Either way, I hope you enjoyed the post and learned lots. Thanks for reading!

Briefly, in a category, you have a collection of objects and sets of morphisms between them. Then, the game of category theory is to replace sets in some part of mathematics, by objects in some category. For example, one can substitute “set” in the definition of group by topological space (topological group) or by smooth manifold (Lie group).

Enriched category theory is about playing this game on the definition of categories itself: a category enriched in \(\mathcal{C}\) has a collection of objects and objects-of-\(\mathcal{C}\) of morphisms between them.

For instance, we can consider categories enriched in abelian groups: between each pair of objects there is an abelian group of morphisms. In particular, there is at least one morphism, 0, between each pair of objects. The category of vector spaces over a given field (and, more generally, of modules over a given ring) is enriched in abelian groups. Categories enriched in abelian groups are relevant, for instance, to homology theory.

There is a theorem that all symmetric monoidal closed categories (of which the category of abelian groups is an example) are enriched in themselves. Therefore, the category of abelian groups itself is another example of a category enriched in abelian groups. Crucially for us, the category of types and linear functions is also symmetric monoidal closed. Hence is enriched in itself!

Functors can either respect this enrichment (in which case we say that they are enriched functors) or not. In the category Hask (seen as a proxy for the category of sets), this theorem is just saying that all functors are enriched because “Set-enriched functor” means the same as “regular functor”. That's why Haskell without linear types doesn't need a separate enriched functor type class.

In the category of abelian groups, the functor which maps \(A\) to \(A\otimes A\) is an example of a functor which is not enriched: the map from \(A → B\) to \(A\otimes A → B\otimes B\), which maps \(f\) to \(f\otimes f\) is not a group morphism. But the functor from \(A\) to \(A\oplus A\) is.

Control functors are the enriched functors of the category of linear functions, while data functors are the regular functors.

Here's the last bit of insight: why isn't there a `Data.Monad`

? The
mathematical notion of a monad does apply perfectly well to data
functors—it just wouldn't be especially useful in Haskell. We need the monad to be
strong for things like the do-notation
to work correctly. But, as it happens, a strong functor is
the same as an enriched functor, so data monads aren't
strong. Except in Hask, of course, where data monads and control
monads, being the same, are, in particular, strong.