During my internship at Tweag, I got the opportunity to work on the GHC Haskell compiler, under the mentorship of Richard Eisenberg. For the first third of my internship, I tackled the implementation of Proposal 99. This proposal introduces additional syntax to the language, allowing programmers to manually annotate type variables with their specificity.

In this blog post, I will describe specificity, the proposal features, and why it could prove useful to you as a developer. In order to tackle this first question, let us look at the title of the proposal: "Explicit specificity in type variable binders".

Specificity?

In order to explain what specificity means, we first have to take a step back and look at type applications.

Type Applications

The TypeApplications language extension was introduced in GHC 8.0.1, in 2016, and is currently used by over 900 packages on Hackage. The original paper by Richard Eisenberg et al. does a great job of motivating and explaining the feature, but it boils down to this (example taken from the paper):

Imagine we want to write a function normalize which parses a String and then pretty-prints the result. The function could for example, remove any redundant brackets in expressions. Writing this function however, is not trivial. A first version could look like this, using the predefined read :: forall a. Read a => String -> a and show :: forall a. Show a => a -> String:

normalize :: forall a. (Show a, Read a) => String -> String
normalize s = show (read s)


However, this code is (rightly) rejected, as GHC can't infer the output type of read, making the code ambiguous. We can solve this by manually instantiating the polymorphic type of read as follows:

normalize :: forall a. (Show a, Read a) => String -> String
normalize s = show (read @a s)


This instantiates the return type of read to be the type variable a as declared in the type signature of normalize. Note that as the type has yet to be instantiated, visible type application is required at the call site.

Specificity

So now that we know what visible type application looks like, let's give it a shot for ourselves. We'll just enable the language flag, write the first simple polymorphic function that comes to mind, and try instantiating it:

{-# LANGUAGE TypeApplications #-}

module Main where

id' x = x

id_int = id' @Int

main = putStrLn "Hi there!"


Let's load it up in GHCi:

/home/gertjan/Desktop/TypeApp.hs:7:10: error:
• Cannot apply expression of type ‘p0 -> p0’
to a visible type argument ‘Int’
• In the expression: id' @Int
In an equation for ‘id_int’: id_int = id' @Int
|
7 | id_int = id' @Int
|          ^^^^^^^^


Hmm, that's a bit strange. Instantiating the predefined id function works fine though:

id_int = id @Int


So what is really going on here? Well, let's ask GHCi to clarify by showing the types, setting -fprint-explicit-foralls so that we can see the type variable binders.

*Main> :set -fprint-explicit-foralls
*Main> :info id
id :: forall a. a -> a
*Main> :info id'
id' :: forall {p}. p -> p


What's happening here is that, because I did not provide a type annotation for my id' function, GHC effectively treats its type fundamentally differently from the type of the (annotated) predefined id function, defined as follows in base:

id :: a -> a
id x = x


Since the predefined version of id has a type signature which explicitly abstracts over the type variable a, this variable is marked as specified. On the other hand, our id' function does not have a type signature, making its type variable p inferred, as shown by the braces in the above example. So why make this distinction? Because the lack of a type signature makes inferred variable binders inherently a bit unstable. After all, who is to say that the next update of GHC won't alter the order of the inferred foralls, for example? For this reason, type application is limited to specified variables only: inferred type variables cannot be manually instantiated.

How do other programming languages handle this issue of unstable inferred type variables? Languages like Agda, Java and C++ do feature type instantiation, but as none of them automatically generalises functions without user-defined type signatures, generics or templates, respectively: they do not have inferred type variables. On the other hand, languages like ML do infer polymorphic types (with inferred type variables as a consequence), but do not have syntax for visible type application. Finally, Idris is the only language that I know of that both generalises and features type instantiation. As such, Idris takes a very similar approach to GHC's: while it does not show this distinction to the programmer, the compiler differentiates between types variables that arise from a user annotation and those that do not, and only allows instantiation of the former. You can find an example of this here.

Explicit Specificity?

To recap: writing a type signature makes the type variables specified, and thus available for instantiation. While the rule seems sensible, this is not always what we might want.

Consider the following type signature:

{-# LANGUAGE PolyKinds, KindSignatures #-}

import Type.Reflection

typeRep :: Typeable (a :: k) => TypeRep (a :: k)


The example is adapted from A Reflection on Types by Peyton Jones et al. For the purposes of this blog post, it is not important to go into too much detail on typeRep, besides looking at its type variables. While writing this function, the programmer wants to annotate it as kind polymorphic, by explicitly mentioning the k kind variable. Unfortunately, the full type of typeRep now becomes forall k (a :: k). Typeable a => TypeRep a. The k variable has become specified, which means that in order to instantiate a, users of our function always have to instantiate k first. Inferring the kind is trivial, which makes having to write something like typeRep @Type @Int or even typeRep @_ @Int in order to instantiate a quite silly.

The new explicit specificity extension allows us to manually annotate type variables we want to act as inferred variables. We do this by placing the braces around the variables we want to be inferred:

typeRep' :: forall {k} (a :: k). Typeable a => TypeRep a


In summary

Writing a type signature for your functions is always a good idea, as it serves both as a check of your code and as documentation. Writing a type signature is also necessary for -XScopedTypeVariables. However, writing a type signature, in current GHC, forces all the variables occurring in the signature to be specified. This alters the function's interface! Explicit specificity remedies this and lets you write exactly the type signature you want.

The original proposal features additional use cases. I encourage anyone who might be interested to have a look at the proposal.

How can I use this?

The code is currently under review and will be released in a future GHC version. If you want to try some examples out for yourself, we have a Docker container available with this development build of GHC. Just clone the container with docker pull gertjanb/explicit-specificity-ghc.

Once the container has finished downloading, you can launch your new development GHC build using docker run -it gertjanb/explicit-specificity-ghc, and start experimenting. For example, a good way to start is by defining a simple function (don't forget to enable the -XRankNTypes extension):

Prelude> :set -XRankNTypes
Prelude> let foo :: forall a {b}. a -> b -> b ; foo x y = y


Play with the type signature, see what type GHCi assigns to foo, and try instantiating the type variables. Have a look at the GHC proposal to find out more, and see where the new syntax is and isn't allowed. Have fun!

Will my code break?

For regular Haskell code, the new feature is entirely backwards compatible with existing code bases, since it only introduces new syntax. The update won't change the behaviour of your code in any way, unless you decide to utilise the new syntax.

The same can't be said for Template Haskell code, unfortunately. As this new syntax is available in Template Haskell as well, the language AST had to be altered to allow for passing this additional information around. Concretely, TyVarBndrs are now annotated with a flag to store additional information. In the case of forall-types and data constructors, they are annotated with their Specificity which is either SpecifiedSpec or InferredSpec. Updating your code should thus be as simple as following the type-checker and updating the types you wrote with the correct flag.

Closing Remarks

I'm grateful to Tweag for supporting this internship, for encouraging me to work on exciting projects like this and for introducing me to interesting people who share my passion for functional programming. I'd also like to thank Richard Eisenberg for his mentorship, his insights and his enthusiasm for the topic. Finally, I hope you, the reader, enjoyed reading this post. Expect a follow-up post to be arriving soon! :)