Jesper Cockx - The Agda's New Sorts

Posted by Jesper on May 3, 2018

In the last few weeks, Sandro Stucki has given a couple of excellent presentations on pure type systems (pts’s) at the initial types club at Chalmers. Since I’ve been working on the implementation of Agda’s sorts system, and the new implementation is closely based on the theory of pure type systems, I thought it would be interesting to talk a bit about the implementation of these concepts used by Agda.

Prerequisites: a bit of experience with using Agda, basic knowledge of pure type systems (see section 5.2 of Barendregt’s classic).

Note: this post started as my personal notes to prepare for a talk at the initial types club, but then I realized it could make a nice blog post. Since this is kind of an experiment, please let me know if there’s something I can improve or if you’re just eager to have more posts about the development of Agda!

Agda’s universe hierarchy

Sorts (aka universes) are types whose members themselves are again types. They are a central element of Martin-Löf’s type theory and languages based on that theory, such as Agda. The prototypical example of a sort in Agda is Set, the universe of small types. Why do we need any sort other than Set, you ask? Well, to avoid inconsistency and undecidable typechecking we cannot have Set : Set (known as the type-in-type rule): Martin-Löf’s original type theory had a rule like this, but Girard showed that this is inconsistent (this is called Girard’s paradox). While Girard’s original paradox is quite complex, it’s easier to get a contradiction in Agda by (ab)using datatypes and structural recursion (example taken from a post by Andreas Abel on the Agda mailing list):

{-# OPTIONS --type-in-type #-}

data ⊥ : Set where

data D : Set where
  c : (f : (A : Set) → A → A) → D

empty : D → ⊥
empty (c f) = empty (f D (c f))

absurd : ⊥
absurd = empty (c λ A x → x)

See also this post by Liam O’Connor for a direct encoding of a similar inconsistency with type-in-type, Russell’s paradox (aka the Barber’s paradox).

To avoid these inconsistencies, Agda has an infinite hierarchy of sorts Set0 (= Set), Set1, Set2, … such that Set0 : Set1, Set1 : Set2, … Thus if we restrict Agda’s type system to just the sorts SetN and dependent function types (x : A) → B, we can think of it as a pure type system with the sorts given by natural numbers, axioms (n, n+1) and rules (m, n, m ⊔ n) (where ⊔ indicates the maximum). Unlike Coq, Agda has no cumulativity, so we have Set0 : Set1 and Set1 : Set2 but not Set0 : Set2. I’ll come back to the point of cumulativity at the end of this post.

Universe polymorphism

In practice, of course, things aren’t that simple. It turns out that defining the same functions and datatypes at different universe levels gets really old really quickly. To fix this, Agda has a feature called universe polymorphism. This feature exposes a new type Level : Set to the user, as well as primitive functions on levels lzero : Level, lsuc : Level → Level, and _⊔_ : Level → Level → Level. For each l : Level we get a new universe Set l, with the obvious rules (Set lzero = Set0, Set (lsuc lzero) = Set1, …) This allows us to define functions and datatypes for all universe levels at once, e.g. the polymorphic identity function:

id : (l : Level)(A : Set l)(x : A) → A
id l A x = x

This seems like a useful yet unexciting feature and indeed, if you’ve ever seen some non-trivial Agda code you have probably encountered it already. However, it has some big implications for Agda as a pure type system: it means that the sort of the codomain of a function type can now also depend on the value of the argument. This definitely does not fit into the framework of pure type systems anymore, so we’re entering uncharted lands. It also raises a number of practical questions about Agda’s sort system, such as:

Levels are now no longer just natural numbers but they can be arbitrary terms, in particular a level can be neutral (i.e. contain free variables). So how do we determine when two levels are equal?
The type of id is (l : Level) (A : Set l) (x : A) → A, but what is the sort of this type?

To answer the former question, Agda has a special-purpose solver to solve equalities and inequalities between arbitrary expressions of type Level, but I won’t say more about it here. The second question is what interests us here: since the sort of this type cannot be Set l for any level l, Agda introduces a new sort Setω which is different from any Set l, and it tells us that (l : Level) (A : Set l) (x : A) → A : Setω. So Setω can be thought of as a sort that’s bigger than Set l for any level l.

To extend the theory of pure type systems to universe polymorphism, we now need to consider ‘dependent pts rules’ of the form (x : _ : s1, s2, s3) where s2 can depend on the variable x. We do not care about the type of x, but only about the fact that this type should be in s1. Concretely, the new dependent pts rule for Setω is (x : _ : s , s' , Setω) whenever s' is dependent on x. On the other hand, when s' is not dependent on x we fall back to the regular ‘non-dependent’ pts rules.

Beyond the universe hierarchy

By introducing universe polymorphism, we also have a first example of a sort which is not in the universe hierarchy of Set l: Setω. It turns out that considering sorts different from any Set l can actually be very useful: we can encode some additional properties of a type in its sort, or we can restrict what the user is allowed to do with types of a certain sort by restricting the pts rules. Here are a few examples which I think would be very nice to have in Agda:

We could add a Prop sort to Agda which consists of definitionally proof-irrelevant types (in contrast to Prop in Coq, which is at best propositionally proof-irrelevant). See PropRezz.agda for a quick demo of the possibilities.
Sized types make use of a type Size to prove termination in a modular way, but there are some operations that are not permitted on function types ending in Size. These properties can be enforced by giving Size its own universe SizeUniv.
We could have a new option --omega-in-omega which makes Agda inconsistent by adding the axiom Setω : Setω, similar to --type-in-type but without collapsing the whole universe hierarchy. This should make it easier to do some ‘unsafe’ things which require --type-in-type without destroying compatibility with standard universe-polymorphic Agda code.
Similarly to Prop, we could also add a universe SSet of strict sets, i.e. types for which the identity type is a Prop. In particular, these types should satisfy the definitional K rule stating that any proof of x ≡ x is definitionally equal to refl.
Finally (and most ambitiously) it would be very interesting to add a sort of non-fibrant types to make Agda into a two-level type theory like Voevodsky’s Homotopy Type System (HTS). The basic idea is that the non-fibrant types serve as an internalization of the metatheory of the fibrant types, which allows us to do all kinds of crazy meta-level things in the language itself. In particular, the first level may satisfy univalence and the second level may satisfy UIP; by separating the sorts we can make sure that this doesn’t lead to inconsistencies. You can read more about two-level type systems in Formalisations Using Two-Level Type Theory by Danil Annenkov, Paolo Capriotti, and Nicolai Kraus.

Making room in Agda’s architecture for all of these new sorts was the main motivation for me to start working on a new implementation of Agda’s sort system.

Meta levels and meta sorts

When typechecking an Agda program, we frequently have to check that some expression is a valid type without knowing what sort it has. Up until recently, Agda always assumed that any unknown sort was of the form Set _l for some metavariable _l : Level. You may have noticed that Setω is already not of this form, so there used to be some hacks to make this assumption work anyway (specifically, the metavariable _l was instantiated with the ill-typed solution Setω and there was a computation rule Set Setω ---> Setω. It was horrible.)

To get rid of this assumption that any sort is of the form Set _, I added a new constructor to Agda’s internal representation of sorts for a sort metavariable or sort meta for short, representing an as of yet unknown sort. These sort metas can then be solved by the constraint solver just like regular (term) metas. Nice and tidy.

However, there’s a complication: we often have to determine the sort of a function type with domain in sort s1 and codomain in sort s2, when s1 and/or s2 are still unknown. For example, what’s the sort of (l : Level) → Set (_1 l) where _1 is a metavariable of type Level → Level? It could be e.g. Setω if _1 is the identity function, or Set if _1 is the constant function lzero. Similarly, we may want to get the sort ‘one level up’ from a given sort s, but s is still unknown. Introducing more new sort metas in those cases is not ideal, as we actually use those operations a lot, so we would get insane numbers of metavariables. Also, it would be quite tricky to keep track of all the dependencies between the different sort metas.

Instead, I opted to introduce two extra constructors PiSort s1 s2 and UnivSort s for the sort of a function type and the sort of another sort respectively (for the connoisseurs: PiSort is similar to the old DLub sort). These constructors do not represent new sorts but instead they compute to the right sort once their arguments are known. For example, PiSort Level (\l. Set l) evaluates to Setω, PiSort (Set l) (\_. Set l') evaluates to Set (l ⊔ l') and UnivSort (Set l) evaluates to Set (lsuc l). Not every PiSort or UnivSort is well-defined, for example Setω does not have a UnivSort since there is no bigger sort than Setω. So the PiSort and UnivSort constructors are accompanied by two new constraints HasBiggerSort s and HasPTSRule s1 s2. Constraints are Agda’s way of handling postponement of certain problems, such as the ValueCmp constraint which enforces two terms are equal. These two new constraints in particular ensure that no occurrence of PiSort or UnivSort will be stuck forever but each will compute to a proper sort eventually (or else Agda will report ‘unsolved constraints’).

So that’s it: the design of Agda’s new sort system. It has already been merged into the main Agda repository, so if you are still hungry for more details you can take a look there. Specifically:

Agda.Syntax.Internal: the Sort datatype used for the internal representation of sorts
Agda.TypeChecking.Substitute: basic functions defining the axioms and rules of Agda’s pts
Agda.TypeChecking.Conversion: solving equalities and inequalities between sorts
Agda.TypeChecking.Monad.Base: two new constraints HasBiggerSort and HasPTSRule were added to the Constraint datatype
Agda.TypeChecking.Sort: functions for solving these constraints

At the moment I’m working on implementing Prop for Agda and hopefully more new sorts will follow after that, so stay tuned!

Towards cumulativity for Agda?

One of the most requested features for Agda is cumulativity, i.e. the subtyping rule Set i <: Set j for i < j, so it would be amiss to talk about sorts in Agda without mentioning it. And indeed, there are many cases where cumulativity would make writing Agda code a lot easier and less frustrating! However, as far as I can tell no-one has really figured out how to combine Agda-style universe polymorphism with cumulativity, so it is still an open problem.

The closest working example is Coq, which sports both a cumulative hierarchy of universes and universe polymorphism (see this paper by Matthieu Sozeau and Nicolas Tabareau). But there are some limitations compared to what we would want to have in Agda:

Coq only allows top-level definitions to be quantified over levels, and thus it doesn’t require a sort like Setω. But in Agda we can quantify over universe-polymorphic types which are in Setω! See this code by Nils Anders Danielsson for an example where this is actually used.
Coq doesn’t guarantee the uniqueness of metavariable solutions, but this uniqueness is a core design principle of Agda. So we expect that simply adding cumulativity to Agda will result in lots and lots of unsolved metavariables, unless we have some way to mitigate this.

While my refactoring of the sort system certainly does not enable cumulativity directly, it allows for some new possibilities to experiment with. I’m planning to do some of this experimentation during the next Agda meeting in Göteborg 4 – 9 June 2018. So if you’re interested to work on this (or on any other part of Agda) you are very welcome to come and join us!