For a while now, Tweag has committed in improving various aspects of Liquid Haskell (LH), a tool that gives the Haskell programmer both the ability to express properties about programs and to verify that they meet these expectations.

In this post we present a specific improvement that I integrated recently, which cuts down the maintenance cost to use LH when introducing assumptions about functions coming from large or multiple packages.

## Problem: How to reason about dependencies

Let us suppose that we are trying to verify that the following function yields a non-empty list.

```
module M where
{-@ consPlus1 :: Int -> [Int] -> { v:[Int] | len v > 0 } @-}
consPlus1 :: Int -> [Int] -> [Int]
consPlus1 x xs = map (+1) (x:xs)
```

The definition starts with a comment dedicated to LH, delimited by `{-@ ... @-}`

.
It contains a type signature for `consPlus1`

with refinement types.
A refinement type denotes a subtype of another type,
characterized by a given predicate, like `len v > 0`

.
In this case the predicate of the result uses a function `len`

that
yields the length of a list in the logic. This predicate
states that `consPlus1`

yields a list with at least one element.

This function depends on the `map`

function from the `base`

library.
LH can infer that `consPlus1`

yields a non-empty list, if it knows that `map`

yields a non-empty list. Proving that `map`

yields a
non-empty list requires inspecting its definition. However, the
definition of `map`

is not in the current package, it comes from the `base`

dependency.

One option to reach a complete proof would be to add LH annotations to the
`base`

package, proving there that `map`

yields non-empty lists
when fed non-empty lists. This would require either convincing the
maintainers of `base`

to add these annotations, or creating a copy
of `base`

with the LH annotations and proofs that we need. Both
options are problematic.

If the maintainers of `base`

accept our LH annotations, they will be
expected to keep them up to date, as `base`

and LH evolve.
Moreover, they will need to figure out the most relevant set of
annotations and proofs worth maintaining among user inquiries.

If we copy the `base`

package and our project starts depending on
a modified dependency, we will be responsible for updating our copy
when the original `base`

package is modified. Moreover, if other
projects start using their own copy of `base`

, then we will end up
with a bunch of copies with different annotations each, which then
makes composing the projects a challenge.

The rest of the post is devoted to explain what LH did initially to improve upon this situation, and how I further improved it later on.

## Idea: Separating annotations and definitions

Proving that `map`

yields some property of interest does often
require its definition. But if we are satisfied with just
assuming that the property holds *without an actual proof*, then
we could tell so to LH at the place where we need to use such
an assumption, without any need to modify the dependency itself.

Here’s how we could write our assumption about `map`

.

`{-@ assume map :: (a -> b) -> xs:[a] -> { v:[a] | len xs = len v } @-}`

The assumption starts with the `assume`

keyword, indicating that
the predicate in the return type is not to be proved. Then
follows an assumed type signature for `map`

with refinement types.
The parameters of the function can have a name, like `xs`

, that
can be used in the predicates of later arguments and in the
result. The predicate of the return type states that the input and
the output have the same length.

This works fine if we need to use the assumption only once, but if the assumption is needed at multiple places, it is inconvenient to copy the assumption at each usage site.

Could we put the assumption at some place where LH could find
it when needed?
This is a problem similar to the one in Python, where authors might
want to add type signatures to the interfaces of
preexisting libraries in dependencies.
The first answer to this question was known as
*wrapper packages* and they were introduced at the time LH
became a GHC plugin in 2020.

## Old solution: Wrapper packages

A wrapper package is a package that reexports all the definitions
from an existing package, and adds assumptions
about them. In the case of `base`

, we would
have a wrapper package `liquid-base`

that contained modules using
the same names of modules in `base`

. A minimal wrapper package for our running example would be this:

```
{-# LANGUAGE QualifiedImports #-}
module GHC.Base(module X) where
import "base" GHC.Base as X
{-@ assume map :: (a -> b) -> { xs:[a] | len xs > 0 } -> { v:[a] | len v > 0 } @-}
```

The module `liquid-base:GHC.Base`

is a drop-in replacement for
`base:GHC.Base`

that can introduce assumptions about functions in
`GHC.Base`

. More generally, the `liquid-base`

package is a
drop-in replacement for the `base`

package, because it would
define similar reexporting modules for every module in `base`

.

When we are verifying a program that depends on `base`

, we remove `base`

from
the dependencies and use `liquid-base`

instead. What’s good about this approach is
that we do not need to change the import declarations in our program
to get the new assumptions, and the relevant assumptions are in
scope at the call site of every function. Thus our dependencies in
a hypothetical file `our_package.cabal`

would be:

```
...
build-depends:
liquid-base
```

A disadvantage of this approach, however, is that the
programmer must manually track that the right version
of `liquid-base`

is used to replace `base`

, and so on for every
dependency on which assumptions need to be introduced.

Another disadvantage is that `liquid-base`

must really reexport
definitions from all modules in `base`

even if we only need to
add a single assumption for `map`

. Otherwise, `liquid-base`

could not be a replacement for `base`

. This imposes a maintenance
overhead that is proportional to the number of dependencies and their sizes.
This overhead needs to be paid when first creating a wrapper
package and then again when updating it to accommodate changes
in the wrapped package.

Lastly, the wrapper packages pollute and obfuscate the dependencies of
a project. A package that depends on `base`

doesn’t show it
explicitly since it depends instead on `liquid-base`

. The reader
has to know somehow or trust that `liquid-base`

really exports
the same Haskell definitions as `base`

.

In order to address the above problems, I proposed and implemented an alternative mechanism to share assumptions.

## New solution: Packages with assumptions

Ideally, LH would be able to find assumptions without defining wrapper packages and without changing the original source code.

Since LH already allowed to put assumptions anywhere, the remaining problem was how to make LH find these assumptions. The user could straightforwardly add import declarations of the modules containing the assumptions, but it would be preferable to avoid modifying the original source code as much as possible. The new mechanism relies on a naming convention for modules that hold assumptions.

If the module we are verifying imports `GHC.Base`

, then with the
new mechanism LH will
look for assumptions in a module named `GHC.Base_LHAssumptions`

,
if it exists.

```
module GHC.Base_LHAssumptions where
import GHC.Base
{-@ assume GHC.Base.map :: (a -> b) -> { xs:[a] | len xs > 0 } -> { v:[a] | len v > 0 } @-}
```

And so on
for the rest of the imports: if module `X`

is imported, then extra
assumptions will be read from modules named `X_LHAssumptions`

if they
exists in any dependency of the current package.

We can thus provide a package `liquid-base-assumptions`

, which essentially
contains modules with assumptions. E.g.

```
GHC.Base_LHAssumptions
GHC.List_LHAssumptions
Prelude_LHAssumptions
...
```

Now, when we want to verify a package that depends on `base`

, we add a dependency on `liquid-base-assumptions`

to allow LH to find the `_LHAssumptions`

modules that it contains.
Dependencies could look as follows in `our_package.cabal`

:

```
...
build-depends:
base,
liquid-base-assumptions
```

If a module doesn’t export a function that needs assumptions, there
is no need to define a module `_LHAssumptions`

, and certainly no
need to define modules that just reexport definitions from elsewhere,
which is precisely the overhead we wanted to avoid.

An additional simplification is that we can save the user the
trouble of picking a version of the assumptions that matches the
`base`

version in use. This is because the version of `base`

is mostly dictated by the version of GHC that the project uses,
and LH knows which GHC version is in use when it is built.
More generally, LH can guess the assumptions to use for every
package whose version is tied to the compiler (i.e. it is a boot
package). Because of this, the `_LHAssumptions`

modules
corresponding to boot packages now come included in the
`liquidhaskell`

package. Then the dependencies in
`our_package.cabal`

become again

```
...
build-depends:
base
```

## Final remarks

The mechanism based on the naming convention was integrated recently in LH and can be used from the latest version in the github repository. Because dozens of reexporting modules were eliminated, the time needed to build and test LH has nearly halved.

The implementation passes the test suite of LH, and now it is time
to experiment a bit with larger examples.
For instance, it is plausible that situations will arise where
assumptions need to be modified, and provisions will need to be
taken to inhibit the implicit loading of `_LHAssumptions`

modules or
override some of the imported assumptions. The solution in Python,
which is similar to the one in LH, does anticipate cases like these.

Special thanks to Niki Vazou and Ranjit Jhala for helping me navigate the sometimes tricky design space to arrive at the current solution. Suggestions to improve the management of assumptions are welcome, so please, don’t hesitate to reach out and share your thoughts.