Posted by Jesper on October 4, 2019
I quite often hear from people that they are interested in learning Agda, and that’s great! However, I often feel there are not enough examples out there of how to use the different features of Agda to implement programs, enforce invariants in their types, and prove their properties. So in this blog post I hope to address exactly that problem.
The main goal of this example is to show off the dual purpose of Agda as a strongly typed programming language and as a proof assistant for formalizing mathematics. Since both purposes are part of the same language, each of them reinforces the other: we can prove properties of our programs, and we can write programs that produce proofs. In this example, we will see this power in action by (1) defining the mathematical structure of partial orders, (2) implementing a generic binary search trees and insertion of new elements into them, and (3) proving that this function is implemented correctly.
This blog post was based on a talk I gave at the Initial Types Club at Chalmers.
Preliminaries
For this post we keep the dependencies to a minimum so we don’t rely on the standard library. Instead, we import some of the built-in modules of Agda directly.
open import Agda.Primitive open import Agda.Builtin.Bool open import Agda.Builtin.Nat open import Agda.Builtin.Equality
A variable
declaration
(since Agda 2.6.0) allows us to use variables without binding
them explicitly. This means they are implicitly universally quantified
in the types where they occur.
variable A B C : Set x y z : A k l m n : Nat
In the code that follows, we will use instance
arguments
to automatically construct some proofs. When working with instance
arguments, the it
function below is often very useful. All it does
is ask Agda to please fill in the current argument by using a
definition that is marked as an instance
. (More about instance
arguments later).
it : {{x : A}} → A it {{x}} = x
(Unary) natural numbers are defined as the datatype Nat
with two
constructors zero : Nat
and suc : Nat → Nat
. We use the ones
imported from Agda.Builtin.Nat
because they allow us to write
literal numerals as well as constructor forms.
_ : Nat _ = zero + 7 * (suc 3 - 1)
(Definitions that are named _
are typechecked by Agda but cannot be
used later on. This is often used to define examples or test cases).
We can define parametrized datatypes
and functions by pattern matching
on them. For example, here is the equivalent of Haskell’s Maybe
type.
data Maybe (A : Set) : Set where just : A → Maybe A nothing : Maybe A mapMaybe : (A → B) → (Maybe A → Maybe B) mapMaybe f (just x) = just (f x) mapMaybe f nothing = nothing
Note how A
and B
are implicitly quantified in the type of
mapMaybe
!
Quick recap on the Curry-Howard correspondence
The Curry-Howard correspondence is the core idea that allows us to use Agda as both a programming language and a proof assistant. Under the Curry-Howard correspondence, we can interpret logical propositions (A ∧ B, ¬A, A ⇒ B, …) as the types of all their possible proofs.
A proof of ‘A and B’ is a pair (x , y) of a proof x : A
and an
proof y : B
.
record _×_ (A B : Set) : Set where constructor _,_ field fst : A snd : B open _×_
A proof of ‘A or B’ is either inl x
for a proof x : A
or inr y
for a proof y : B
.
data _⊎_ (A B : Set) : Set where inl : A → A ⊎ B inr : B → A ⊎ B mapInl : (A → B) → A ⊎ C → B ⊎ C mapInl f (inl x) = inl (f x) mapInl f (inr y) = inr y mapInr : (B → C) → A ⊎ B → A ⊎ C mapInr f (inl x) = inl x mapInr f (inr y) = inr (f y)
A proof of ‘A implies B’ is a transformation from proofs x : A
to
proofs of B
, i.e. a function of type A → B
.
‘true’ has exactly one proof tt : ⊤
. We could define this as a
datatype with a single constructor tt
, but here we define it as a
record type instead. This has the advantage that Agda will use
eta-equality for elements of ⊤
, i.e. x = y
for any two variables
x
and y
of type ⊤
.
record ⊤ : Set where constructor tt -- no fields
‘false’ has no proofs.
data ⊥ : Set where -- no constructor
‘not A’ can be defined as ‘A implies false’.
¬_ : Set → Set ¬ A = A → ⊥
Examples
-- “If A then B implies A” ex₁ : A → (B → A) ex₁ = λ z _ → z -- “If A and true then A or false” ex₂ : (A × ⊤) → (A ⊎ ⊥) ex₂ = λ z → inl (fst z) -- “If A implies B and B implies C then A implies C” ex₃ : (A → B) → (B → C) → (A → C) ex₃ = λ f g z → g (f z) -- “It is not the case that not (either A or not A)” ex₄ : ¬ (¬ (A ⊎ (¬ A))) ex₄ = λ f → f (inr (λ x → f (inl x)))
Since Agda’s logic is constructive, it is not possible to prove the
direct version of ex₄
(A ⊎ (¬ A)
).
Equality
To state many properties of our programs, we need the notion of
equality. In Agda, equality is defined as the datatype _≡_
with one
constructor refl : x ≡ x
(imported from Agda.Builtin.Equality
).
_ : x ≡ x _ = refl sym : x ≡ y → y ≡ x sym refl = refl trans : x ≡ y → y ≡ z → x ≡ z trans refl refl = refl cong : (f : A → B) → x ≡ y → f x ≡ f y cong f refl = refl subst : (P : A → Set) → x ≡ y → P x → P y subst P refl p = p
Ordering natural numbers
The standard ordering on natural numbers can be defined as an
indexed datatype
with two indices of type Nat
:
module Nat-≤ where data _≤_ : Nat → Nat → Set where ≤-zero : zero ≤ n ≤-suc : m ≤ n → suc m ≤ suc n ≤-refl : n ≤ n ≤-refl {n = zero} = ≤-zero ≤-refl {n = suc k} = ≤-suc ≤-refl ≤-trans : k ≤ l → l ≤ m → k ≤ m ≤-trans ≤-zero l≤m = ≤-zero ≤-trans (≤-suc k≤l) (≤-suc l≤m) = ≤-suc (≤-trans k≤l l≤m) ≤-antisym : m ≤ n → n ≤ m → m ≡ n ≤-antisym ≤-zero ≤-zero = refl ≤-antisym (≤-suc m≤n) (≤-suc n≤m) = cong suc (≤-antisym m≤n n≤m)
Now we can prove statements like 3 ≤ 5
as follows:
_ : 3 ≤ 5 _ = ≤-suc (≤-suc (≤-suc ≤-zero))
However, to prove an inequality like 9000 ≤ 9001
we would have to
write 9000 ≤-suc
constructors, which would get very
tedious. Instead, we can use Agda’s instance arguments
to automatically construct these kind of proofs.
To do this, we define an ‘instance’ that automatically constructs a
proof of m ≤ n
when m and n are natural number literals. A
definition inst : A
that is marked as an ‘instance’ will be used to
automatically construct the implicit argument to functions with a type
of the form {{x : A}} → B
.
For efficiency reasons, we don’t mark the constructors ≤-zero
and
≤-suc
as instances directly. Instead, we make use of the efficient
boolean comparison _<_
(imported from Agda.Builtin.Nat
) to
construct the instance when the precondition So (m < suc n)
is
satisfied.
So : Bool → Set So false = ⊥ So true = ⊤ instance ≤-dec : {p : So (m < suc n)} → m ≤ n ≤-dec {m = zero} {n = n} = ≤-zero ≤-dec {m = suc m} {n = suc n} {p = p} = ≤-suc (≤-dec {p = p}) _ : 9000 ≤ 9001 _ = it
Partial orders
We’d like to talk not just about orderings on concrete types like
Nat
, but also about the general concept of a ‘partial order’. For
this purpose, we define a typeclass Ord
that contains the type _≤_
and proofs of its properties.
record Ord (A : Set) : Set₁ where field _≤_ : A → A → Set ≤-refl : x ≤ x ≤-trans : x ≤ y → y ≤ z → x ≤ z ≤-antisym : x ≤ y → y ≤ x → x ≡ y _≥_ : A → A → Set x ≥ y = y ≤ x
Unlike in Haskell, typeclasses are not a primitive concept in
Agda. Instead, we use the special syntax open Ord {{...}}
to bring
the fields of the record in scope as instance functions with a type of
the form {A : Set}{{r : Ord A}} → ...
. Instance search will then
kick in to find the right implementation of the typeclass
automatically.
open Ord {{...}}
We now define some concrete instances of the typeclass using copattern matching
instance Ord-Nat : Ord Nat _≤_ {{Ord-Nat}} = Nat-≤._≤_ ≤-refl {{Ord-Nat}} = Nat-≤.≤-refl ≤-trans {{Ord-Nat}} = Nat-≤.≤-trans ≤-antisym {{Ord-Nat}} = Nat-≤.≤-antisym instance Ord-⊤ : Ord ⊤ _≤_ {{Ord-⊤}} = λ _ _ → ⊤ ≤-refl {{Ord-⊤}} = tt ≤-trans {{Ord-⊤}} = λ _ _ → tt ≤-antisym {{Ord-⊤}} = λ _ _ → refl module Example (A : Set) {{A-≤ : Ord A}} where example : {x y z : A} → x ≤ y → y ≤ z → z ≤ x → x ≡ y example x≤y y≤z z≤x = ≤-antisym {A = A} x≤y (≤-trans {A = A} y≤z z≤x)
For working with binary search trees, we need to be able to decide for any two elements which is the bigger one, i.e. we need a total, decidable order.
data Tri {{_ : Ord A}} : A → A → Set where less : {{x≤y : x ≤ y}} → Tri x y equal : {{x≡y : x ≡ y}} → Tri x y greater : {{x≥y : x ≥ y}} → Tri x y record TDO (A : Set) : Set₁ where field {{Ord-A}} : Ord A -- superclass Ord tri : (x y : A) → Tri x y open TDO {{...}} public triNat : (x y : Nat) → Tri x y triNat zero zero = equal triNat zero (suc y) = less triNat (suc x) zero = greater triNat (suc x) (suc y) with triNat x y ... | less {{x≤y}} = less {{x≤y = Nat-≤.≤-suc x≤y}} ... | equal {{x≡y}} = equal {{x≡y = cong suc x≡y}} ... | greater {{x≥y}} = greater {{x≥y = Nat-≤.≤-suc x≥y}} instance TDO-Nat : TDO Nat Ord-A {{TDO-Nat}} = Ord-Nat tri {{TDO-Nat}} = triNat
Binary search trees
In a dependently typed language, we can encode invariants of our data structures by using indexed datatypes. In this example, we will implement binary search trees by a lower and upper bound to the elements they contain (see How to Keep Your Neighbours in Order by Conor McBride).
Since the lower bound may be -∞ and the upper bound may be +∞, we start by providing a generic way to extend a partially ordered set with those two elements.
data [_]∞ (A : Set) : Set where -∞ : [ A ]∞ [_] : A → [ A ]∞ +∞ : [ A ]∞ variable lower upper : [ A ]∞ module Ord-[]∞ {A : Set} {{ A-≤ : Ord A}} where data _≤∞_ : [ A ]∞ → [ A ]∞ → Set where -∞-≤ : -∞ ≤∞ y []-≤ : x ≤ y → [ x ] ≤∞ [ y ] +∞-≤ : x ≤∞ +∞ []∞-refl : x ≤∞ x []∞-refl { -∞} = -∞-≤ []∞-refl {[ x ]} = []-≤ (≤-refl {A = A}) []∞-refl { +∞} = +∞-≤ []∞-trans : x ≤∞ y → y ≤∞ z → x ≤∞ z []∞-trans -∞-≤ _ = -∞-≤ []∞-trans ([]-≤ x≤y) ([]-≤ y≤z) = []-≤ (≤-trans {A = A} x≤y y≤z) []∞-trans _ +∞-≤ = +∞-≤ []∞-antisym : x ≤∞ y → y ≤∞ x → x ≡ y []∞-antisym -∞-≤ -∞-≤ = refl []∞-antisym ([]-≤ x≤y) ([]-≤ y≤x) = cong [_] (≤-antisym x≤y y≤x) []∞-antisym +∞-≤ +∞-≤ = refl instance Ord-[]∞ : {{_ : Ord A}} → Ord [ A ]∞ _≤_ {{Ord-[]∞}} = _≤∞_ ≤-refl {{Ord-[]∞}} = []∞-refl ≤-trans {{Ord-[]∞}} = []∞-trans ≤-antisym {{Ord-[]∞}} = []∞-antisym open Ord-[]∞ public
We define some instances to automatically construct inequality proofs.
module _ {{_ : Ord A}} where instance -∞-≤-I : {y : [ A ]∞} → -∞ ≤ y -∞-≤-I = -∞-≤ +∞-≤-I : {x : [ A ]∞} → x ≤ +∞ +∞-≤-I = +∞-≤ []-≤-I : {x y : A} {{x≤y : x ≤ y}} → [ x ] ≤ [ y ] []-≤-I {{x≤y = x≤y}} = []-≤ x≤y
Now we are (finally) ready to define binary search trees.
data BST (A : Set) {{_ : Ord A}} (lower upper : [ A ]∞) : Set where leaf : {{l≤u : lower ≤ upper}} → BST A lower upper node : (x : A) → BST A lower [ x ] → BST A [ x ] upper → BST A lower upper _ : BST Nat -∞ +∞ _ = node 42 (node 6 leaf leaf) (node 9000 leaf leaf)
Note how instances help by automatically filling in the proofs that the bounds are satisfied! Somewhat more explicitly, the tree looks as follows:
_ : BST Nat -∞ +∞ _ = node 42 (node 6 (leaf {{l≤u = -∞≤6}}) (leaf {{l≤u = 6≤42}})) (node 9000 (leaf {{l≤u = 42≤9000}}) (leaf {{l≤u = 9000≤+∞}})) where -∞≤6 : -∞ ≤ [ 6 ] -∞≤6 = it 6≤42 : [ 6 ] ≤ [ 42 ] 6≤42 = it 42≤9000 : [ 42 ] ≤ [ 9000 ] 42≤9000 = it 9000≤+∞ : [ 9000 ] ≤ +∞ 9000≤+∞ = it
Next up: defining a lookup function. The result of this function is
not just a boolean true/false, but a proof that the element is
indeed in the tree. A proof that x
is in the tree t
is either a
proof that it is here
, a proof that it is in the left
subtree, or
a proof that it is in the right
subtree.
module Lookup {{_ : TDO A}} where data _∈_ {lower} {upper} (x : A) : (t : BST A lower upper) → Set where here : ∀ {t₁ t₂} → x ≡ y → x ∈ node y t₁ t₂ left : ∀ {t₁ t₂} → x ∈ t₁ → x ∈ node y t₁ t₂ right : ∀ {t₁ t₂} → x ∈ t₂ → x ∈ node y t₁ t₂
The definition of lookup
makes use of
with
-abstraction
to inspect the result of the tri
function on x
and y
.
lookup : ∀ {lower} {upper} → (x : A) (t : BST A lower upper) → Maybe (x ∈ t) lookup x leaf = nothing lookup x (node y t₁ t₂) with tri x y ... | less = mapMaybe left (lookup x t₁) ... | equal = just (here it) ... | greater = mapMaybe right (lookup x t₂)
Similarly, we can define an insertion function. Here, we need to enforce the precondition that the element we want to insert is between the bounds (alternatively, we could have updated the bounds in the return type to ensure they include the inserted element).
module Insert {{_ : TDO A}} where insert : (x : A) (t : BST A lower upper) → {{l≤x : lower ≤ [ x ]}} {{x≤u : [ x ] ≤ upper}} → BST A lower upper insert x leaf = node x leaf leaf insert x (node y t₁ t₂) with tri x y ... | less = node y (insert x t₁) t₂ ... | equal = node y t₁ t₂ ... | greater = node y t₁ (insert x t₂)
To prove correctness of insertion, we have to show that y ∈ insert x t
is equivalent to x ≡ y ⊎ y ∈ t
. The proofs insert-sound₂
and
insert-complete
are a bit long because there are two elements x
and y
that can both independently be here
, in the left subtree, or
in the right subtree, so we have to distinguish 9 distinct cases. Let
me know if you manage to find a shorter proof!
open Lookup insert-sound : (x : A) (t : BST A lower upper) → {{_ : lower ≤ [ x ]}} {{_ : [ x ] ≤ upper}} → (x ≡ y) ⊎ (y ∈ t) → y ∈ insert x t insert-sound x t (inl refl) = insert-sound₁ x t where insert-sound₁ : (x : A) (t : BST A lower upper) → {{_ : lower ≤ [ x ]}} {{_ : [ x ] ≤ upper}} → x ∈ insert x t insert-sound₁ x leaf = here refl insert-sound₁ x (node y t₁ t₂) with tri x y insert-sound₁ x (node y t₁ t₂) | less = left (insert-sound₁ x t₁) insert-sound₁ x (node y t₁ t₂) | equal = here it insert-sound₁ x (node y t₁ t₂) | greater = right (insert-sound₁ x t₂) insert-sound x t (inr y∈t) = insert-sound₂ x t y∈t where insert-sound₂ : (x : A) (t : BST A lower upper) → {{_ : lower ≤ [ x ]}} {{_ : [ x ] ≤ upper}} → y ∈ t → y ∈ insert x t insert-sound₂ x (node y t₁ t₂) (here refl) with tri x y ... | less = here refl ... | equal = here refl ... | greater = here refl insert-sound₂ x (node y t₁ t₂) (left y∈t₁) with tri x y ... | less = left (insert-sound₂ x t₁ y∈t₁) ... | equal = left y∈t₁ ... | greater = left y∈t₁ insert-sound₂ x (node y t₁ t₂) (right y∈t₂) with tri x y ... | less = right y∈t₂ ... | equal = right y∈t₂ ... | greater = right (insert-sound₂ x t₂ y∈t₂) insert-complete : (x : A) (t : BST A lower upper) → {{_ : lower ≤ [ x ]}} {{_ : [ x ] ≤ upper}} → y ∈ insert x t → (x ≡ y) ⊎ (y ∈ t) insert-complete x leaf (here refl) = inl refl insert-complete x (node y t₁ t₂) y∈t' with tri x y insert-complete x (node y t₁ t₂) (here refl) | less = inr (here refl) insert-complete x (node y t₁ t₂) (here refl) | equal = inl it insert-complete x (node y t₁ t₂) (here refl) | greater = inr (here refl) insert-complete x (node y t₁ t₂) (left y∈t₁') | less = mapInr left (insert-complete x t₁ y∈t₁') insert-complete x (node y t₁ t₂) (left y∈t₁) | equal = inr (left y∈t₁) insert-complete x (node y t₁ t₂) (left y∈t₁) | greater = inr (left y∈t₁) insert-complete x (node y t₁ t₂) (right y∈t₂) | less = inr (right y∈t₂) insert-complete x (node y t₁ t₂) (right y∈t₂) | equal = inr (right y∈t₂) insert-complete x (node y t₁ t₂) (right y∈t₂') | greater = mapInr right (insert-complete x t₂ y∈t₂')
Of course, there are many more functions on search trees we could want to implement and prove correct: deletion, merging, flattening … Likewise, there are other invariants we might want to enforce in the type, such as being well-balanced. I strongly recommend to read Conor McBride’s paper on the topic, or try it yourself!