---
title: "Exponential blowup when using unbundled typeclasses to model algebraic hierarchies"
categories: research coq
forum: https://coq.discourse.group/t/blog-post-exponential-blowup-when-using-unbundled-typeclasses-to-model-algebraic-hierarchies/289
---

When formalizing a proof in an interactive theorem prover like [Coq](https://coq.inria.fr/), one reoccurring issue is the handling of algebraic hierarchies.
Such hierarchies are everywhere:  some operations are associative, while others commute; some types have an equivalence relation, while others also have a (pre-)order or maybe even a well-ordering; and so on.
So the question arises:  What is the best way to actually encode these hierarchies in Coq?
Coq offers two mechanisms that are suited to solve this task: typeclasses and canonical structures.
Both can be instrumented in different ways to obtain a (more or less) convenient-to-use algebraic hierarchy.
A common approach using typeclasses is the ["unbundled" approach by Bas Spitters and Eelis van der Weegen](http://www.eelis.net/research/math-classes/mscs.pdf).
However as we learned the hard way in the Coq formalization of the [original Iris paper](https://iris-project.org/pdfs/2015-popl-iris1-final.pdf), this approach quickly leads to terms that seem to be exponential in size.
In this post, I will explain the cause of this exponential blowup.

<!-- MORE -->

I should note that this observation is not new, it already occurs in [François Garillot's PhD thesis](https://pastel.archives-ouvertes.fr/pastel-00649586), a [paper by the same author](https://hal.inria.fr/inria-00368403v2/document) and even the authors of the "unbundled" paper already note having performance problems when scaling up.
The goal of this post is to provide a more self-contained presentation, not requiring all the context set up in that thesis and going into more details than the brief remarks the two papers.

## Unbundled groups

A formalization of the algebraic hierarchy for groups using the unbundled approach could look as follows:
{% highlight coq %}
Require Import Lia ZArith.

(** The operations as typeclasses for overloading, without any axioms. *)
Class Op (A: Type) := op : A -> A -> A.
Infix "+" := op.

Class Unit (A: Type) := unit : A.

Class Inverse (A: Type) := inv : A -> A.

(** The algebraic classes adding the axioms, taking the operations as indices. *)
Class Semigroup (A: Type) `{Op A} := {
  assoc a b c : (a + b) + c = a + (b + c);
}.

Class Monoid (A: Type) `{Semigroup A, Unit A} := {
  id_r a : a + unit = a;
  id_l a : unit + a = a;
}.

Class Group (A: Type) `{Monoid A, Inverse A} := {
  inv_r a : a + inv a = unit;
  inv_l a : inv a + a = unit;
}.
{% endhighlight %}

As usual for the unbundled approach, we end up with lots of tiny typeclasses.
`Op`, `Unit` and `Inverse` serve as operators that can be overloaded.
They each just take a type `A` as argument (we say they are *indexed* by the type: we can do typeclass search that, given the type, finds the instance).
`Semigroup`, `Monoid` and `Group` supply the laws for these operators; they are indexed by the operators they involve and the typeclasses from which they "inherit".
Taking the superclasses as indices instead of embedding them as fields simplifies typeclass search and is necessary to support diamonds in the hierarchy; see [the original paper](http://www.eelis.net/research/math-classes/mscs.pdf) for more details.

We use Coq's feature for anonymous implicit generalization: `` `{Semigroup A} `` adds not only a parameter of type `Semigroup A` that we don't have to name; furthermore all typeclass arguments of `Semigroup` (in this case, `Op A`) are also implicitly added as parameter.
This avoids lots of repetition that the unbundled style would usually incur, [but the semantics of implicit generalization are quirky and poorly documented](https://coq.discourse.group/t/documentation-of-foo/129).

Next, we can show that the integers `Z` form a group:
{% highlight coq %}
Instance Z_op: Op Z := Z.add.
Instance Z_unit: Unit Z := 0%Z.
Instance Z_inv: Inverse Z := Z.opp.

Instance Z_semigroup: Semigroup Z.
Proof. split. intros. unfold op, Z_op. lia. Qed.
Instance Z_monoid: Monoid Z.
Proof. split; intros; unfold op, Z_op, unit, Z_unit; lia. Qed.
Instance Z_group: Group Z.
Proof. split; intros; unfold op, Z_op, unit, Z_unit, inv, Z_inv; lia. Qed.
{% endhighlight %}
We let `lia` do most of the proving, but we have to unfold some definitions first.

We can also show that a product of two {semigroups, monoids, groups} is itself a {semigroup, monoid, group}:
{% highlight coq %}
Section prod.
  Context {A B: Type}.

  Global Instance prod_op `{Op A, Op B}: Op (A * B) :=
    fun '(a1, b1) '(a2, b2) => (a1 + a2, b1 + b2).
  Global Instance prod_unit `{Unit A, Unit B}: Unit (A * B) :=
    (unit, unit).
  Global Instance prod_inv `{Inverse A, Inverse B}: Inverse (A * B) :=
    fun '(a, b) => (inv a, inv b).

  Global Instance prod_semigroup `{Semigroup A, Semigroup B}: Semigroup (A * B).
  Proof.
    split. intros [a1 b1] [a2 b2] [a3 b3]. unfold op, prod_op.
    rewrite !assoc. reflexivity.
  Qed.
  Global Instance prod_monoid `{Monoid A, Monoid B}: Monoid (A * B).
  Proof.
    split; intros [a b]; unfold op, prod_op, unit, prod_unit;
      rewrite ?id_l, ?id_r; reflexivity.
  Qed.
  Global Instance prod_group `{Group A, Group B}: Group (A * B).
  Proof.
    split; intros [a b]; unfold op, prod_op, unit, prod_unit, inv, prod_inv;
      rewrite ?inv_l, ?inv_r; reflexivity.
  Qed.
End prod.
{% endhighlight %}

## Unexpected complexities

Seems reasonable?
Unfortunately, we have set ourselves up for some unexpectedly large terms:
the size of an instance of something like `Group (Z*Z*Z*Z*Z*Z)` grows with the fifth power as the number of `Z` increase!

{% highlight coq %}
Definition test: Group (Z*Z*Z*Z*Z*Z) := _.
Set Printing All.
Print test.
(* Lots of output. *)
{% endhighlight %}

What is happening here?
We can see that when we try to construct the proof terms more explicitly.
We start by defining `Zn`, which is basically `Z^(n+1)`: a product of `n+1` times `Z`.
{% highlight coq %}
Fixpoint Zn (n: nat): Type :=
  match n with
  | O => Z
  | S n => @prod (Zn n) Z
  end.
{% endhighlight %}
Unsurprisingly, this term has size linear in `n`: the recursive case adds `@prod • Z` around the term produced for `n-1` (represented by the `•`).
This means the term grows by a constant when `n` increases by 1, leading to linear complexity overall.

Next, we define `Op (Zn n)`:
{% highlight coq %}
Fixpoint Zn_op (n: nat): Op (Zn n) :=
  match n with
  | O => Z_op
  | S n => @prod_op (Zn n) Z (Zn_op n) Z_op
  end.
{% endhighlight %}
In the recursive case, we use `prod_op` as one might expect.
Coq could infer all the arguments here, but let us take a closer look at what it would infer: what is the size of `Zn_op` as `n` grows?
In the recursive case, we add `@prod_op (Zn n) Z • Z_op` around the term produced for `n-1`.
This term includes `Zn n`, which we established is linear in size.
When `n` increases by 1, the term grows by a linear amount.
Thus, the overall term size is quadratic in `n`!
Ouch.
I don't know about you, but I found this surprising when I realized it for the first time.

The index basically repeats all the information that is already present in the recursive call:
to extend a `Zn_op` by one more element, we have to repeat the full type for which we already constructed an `Op`.

But, you might say, quadratic complexity is not so bad.
Computers are fast, we can handle that.
Unfortunately, I got bad news: things get worse as our indices start nesting.
Let us look at the instance for `Semigroup (Zn n)`:
{% highlight coq %}
Fixpoint Zn_semigroup (n: nat): Semigroup (Zn n) :=
  match n return @Semigroup (Zn n) (Zn_op n) with
  | O => Z_semigroup
  | S n => @prod_semigroup (Zn n) Z (Zn_op n) (Zn_semigroup n) Z_op Z_semigroup
  end.
{% endhighlight %}
How big is this term?
Well, the term added as `n` increases by 1 is `@prod_semigroup (Zn n) Z (Zn_op n) • Z_op Z_semigroup`.
This includes `Zn_op n`, which is quadratic -- so `Zn_semigroup` is cubic!

`Zn_semigroup` has an `Op (Zn n)` as index, and again the full nested term describing the operator of both semigroups that we take the product of is getting repeated each time we add just one more `Z`.
So, while `Op (Zn n)` has an index that makes it grow quadratically, `Semigroup (Zn n)` *has an index that has an index*, and grows cubically.

In general, to compute the complexity of our terms, we have to figure out *how nested* our indices are: with no indices (such as `Zn n` itself), the term is linear; for each extra nested index the power increases by 1.[^1]
If we look at `Group`, we see that it has a `Monoid` index which has a `Semigroup` index which has an `Op` index which has a `Type` index.
That's 4 nested indices, and hence the terms have size `O(n^5)`.
More generally speaking, the term size is *exponential* in the height of our hierarchy.
That hurts.

[^1]: Note that the *number* of indices does not matter, like the fact that `Monoid` has both an `Op` and a `Unit` index. That just affects the constant factor. It's the nesting of indices that gets us here.

## Conclusion

We have seen that the number of nested indices of a typeclass has exponential impact on the size of proof terms that arise when deriving typeclass instances for compound types.
This effect has grinded the original Iris coq development back in 2015 to a halt: compiling the correctness proof of the logic itself took around 3.5 hours and needed more than the 4GB of RAM than my laptop had back then.

In my view, this basically rules out the unbundled approach for any serious formalization effort.
But what are the alternatives?

Instead of making superclasses indices, we could make them fields, as in:
{% highlight coq %}
Class Monoid (A: Type) `{Op A, Unit A} := {
  monoid_semigroup :> Semigroup A;
  id_r a : a + unit = a;
  id_l a : unit + a = a;
}.
{% endhighlight %}

This limits the "index depth" to 2, and thus limits term complexity to `O(n^3)`.
Still not great, but at least it doesn't grow further as we add more subclasses.
However, this scheme does not support diamonds in the hierarchy as instances will be duplicated.
Also, the instance we implicitly added by writing `:>`, namely that `Monoid A -> Semigroup A`, means we'll check all `Monoid` instances any time we just want a `Semigroup`.
This leads to exponential complexity when performing backtracking typeclass search: a `Semigroup (A*A)` can be obtained either by searching for a `Monoid (A*A)` or via `prod_semigroup`, so in case of backtracking Coq will go through a combinatorial explosion when trying all possible ways to derive `Semigroup (A*A*A*...)`.[^2]

[^2]: In genaral, such combinatorial blowups happen very easily with backtracking, and they are hard to detect in advance. This is one reason why I believe that backtracking-per-default is a bad choice for proof search.

Maybe there is a way to make the unbundled approach viable with some help from the proof assistant.
Some way to eliminate indices would help (similar to primitive projections, but also covering operations such as `prod_op`).
Or maybe the proof assistant could exploit the huge amounts of sharing that are present in these large terms.
If all subterms are properly shared, I think the overall term size remains linear.
Coq deduplicates term representation in memory, but (to my knowledge) does not exploit sharing when traversing the term (for unification or type-checking or any other purpose).
Defeating the exponential complexity here would require exploiting the sharing in *every* pass over the term.
An alternative to opportunistically discovering and exploiting sharing might be to [bake it more deeply into metavariable inference](https://eutypes.cs.ru.nl/eutypes_pmwiki/uploads/Meetings/Kovacs_slides.pdf).

In Iris, we instead went with an adaptation of [this approach based on canonical structures](https://hal.inria.fr/hal-00816703v1/document).
This fully bundled approach avoids indices entirely and thus maintains a linear term size.
On the other hand, it comes with a lot of boilerplate code, and mixing canonical structures with typeclasses can be tricky to debug. (Unfortunately we cannot avoid typeclasses entirely as we need to have `Proper` instances in our algebraic classes.)
When composing types, Iris users have to write things like `prodC (listC natC) (optionC fracC)` instead of `(list nat) * (option frac)`, which is an additional barrier of entry to an already complicated development.
Also, I have to say I don't really have an intuitive grasp of canonical structure resolution, so when something does not work I don't even know how to start debugging.
On the other hand, the correctness proof of the logic *and* the general proof mode infrastructure *and* a few examples verified in the logic take less than five minutes to compile in our latest Coq formalization.
That's quite some speedup!

All in all, I think the question of how to best represent algebraic hierarchies in a proof assistant is still wide open.
I am curious which ideas people will come up with!

#### Footnotes