Verifying five Haskell libraries with Agda2Hs

Talk at AIM XXXIV

Jesper Cockx

9 June 2021

Introduction

Motivation

With Agda, we can write verified functional programs (hooray!)

However, not everyone uses Agda (booh!)

How can we give the benefits of verified code to those poor people??

Enter Agda2Hs

Agda2Hs is an Agda backend that maps a subset of Agda to readable Haskell code.

How to use Agda2Hs:

  1. Write library in Agda
  2. Verify library in Agda
  3. Translate verified code to Haskell
  4. ???
  5. Profit

Our research question

How feasible is it to verify Haskell libraries using Agda2Hs?

Sub-questions: how hard is it to…

  1. port Haskell code to Agda?
  2. state the properties and invariants?
  3. actually prove them?

Research project at TU Delft

3rd year Bsc students work on research project for 10 weeks in groups of 5.

Each student works individually on a sub-question

In this case: Each student works on porting and verifying one existing Haskell library with Agda2Hs.

Students had basic knowledge of Haskell + Agda from FP course, but no other previous experience.

The five libraries

  1. Data.Map: maps implemented as binary search trees
  2. Data.Sequence: lists implemented as finger trees
  3. Data.Graph: algebraic graphs implemented as patricia trees
  4. QuadTree: regions of 2D space
  5. RangedSet: unions of continuous ranges of numbers

Where to get your properties?

There are plenty of sources for properties to prove:

  • Preconditions for making partial functions total
  • Pre- and post-conditions stated in comments
  • Laws of type classes that are implemented
  • Internal invariants documented in Haskell source
  • QuickCheck properties in the source

Some highlights

Data.Map library

Large and commonly used library, so it has lots of functions and low-level optimizations.

Sequence library

Finger trees use heterogeneous recursion: The type of elements changes in each step.

Algebraic graph library

The agl library defines both an abstract interface for working with algebraic graphs, and an implementation of this interface using Patricia trees.

QuadTree

This library relies heavily on lenses: functional getter/setter pairs which should satisfy a bunch of laws.

Ranged sets

Many functions on ranged sets do not actually preserve all invariants, so the result needs to be normalised first.

Lessons to be learned

Common issue #1: Boolean blindness

  • In the FP course, I introduced the predicate IsTrue : Bool -> Set.
  • This looks like a nice thing since you can reuse boolean functions.
  • … but it’s actually a bit of a trap: you know that a boolean is true but you forget why it is true.

Common issue #2: sticking too close to the Haskell way

  • It is sometimes very tricky to preserve the same type signature.
  • Especially for partial functions preserving the type signature is not possible or not desirable.
  • Dependent types sometimes actually drive us towards more elegant type signatures!

Common issue #3: lack of abstraction in proofs

  • Proofs by the students are often very long, and contain a lot of repeated expressions and applications of cong.
  • It would be better to identify simple abstract properties and use those to simplify the proof.
  • But this probably requires some experience to identify good auxiliary properties.

Conclusion

What can we conclude so far?

  • Project is still ongoing for four more weeks, but most implementation work is done
  • Students impressed me with their productivity
  • ‘Simple’ techniques often do not scale
  • Standard library is difficult to navigate
  • Need for ‘design patterns’ for verifying code