Machine Learning–Assisted Lemma Recommendation for UniMath
Master Project
1. Introduction
Interactive theorem provers such as Rocq enable the development of fully verified mathematical libraries and formally certified proofs. One of the largest formal mathematics libraries based on univalent foundations is UniMath, which contains extensive developments in category theory, algebra, type theory, and related areas.
A central challenge in large proof libraries is proof navigation: users frequently spend substantial time searching for relevant lemmas, definitions, and proof patterns. Modern machine learning methods have recently shown promising results in theorem proving environments by assisting users with tactic prediction, proof search, and semantic retrieval. However, most existing systems focus on generic Coq or Lean environments, while UniMath remains comparatively unexplored.
This project investigates whether machine learning techniques can provide effective lemma recommendation for UniMath proofs. The goal is to develop a prototype system that predicts useful lemmas from the current proof state and evaluates its usefulness for interactive proof development.
2. Objectives
The primary objective is to design and evaluate a machine learning–based lemma recommendation system for UniMath.
The project aims to:
- Extract structured proof data from the UniMath library.
- Construct a dataset linking proof goals to subsequently used lemmas.
- Develop representations of proof states suitable for machine learning.
- Train and evaluate a recommendation model.
- Analyze the structural properties of UniMath proofs and their impact on retrieval quality.
Optional objectives include:
- semantic theorem search,
- IDE integration,
- embedding visualization,
- or comparison with generic retrieval baselines.
3. Research Questions
Possible research questions include:
-
Can machine learning models accurately predict useful lemmas in UniMath proofs?
-
Do the structural and categorical regularities of UniMath improve retrieval performance compared to generic Coq corpora?
-
Which representations of proof states are most effective for lemma recommendation?
- token-based,
- syntax-tree–based,
- or embedding-based?
-
How useful is semantic retrieval compared to traditional keyword search?
4. Scope of the Project
The project focuses on lemma recommendation rather than full proof automation.
Given:
- a proof goal,
- local hypotheses,
- and optionally previous proof steps,
the system should rank potentially useful UniMath lemmas.
Example:
```coq id=”e9m8bh” Goal: forall f : A -> B, isweq f -> …
Possible recommendations:
```coq id="2g0m91"
isweq_iso
funextsec
subtypePath
The project does not attempt:
- autonomous theorem proving,
- complete proof synthesis,
- or large-scale language model training.
5. Expected Contributions
The project is expected to contribute:
A machine learning dataset derived from UniMath proofs. A prototype lemma recommendation system for UniMath. An empirical evaluation of retrieval methods for formal mathematics. Insights into the structure of proofs in univalent foundations.
Potentially reusable software infrastructure may also be produced.
5. Related Work
Relevant prior systems include:
These systems study machine learning for theorem proving, but little work currently focuses specifically on UniMath and univalent foundations.
Useful references:
Contacts for the Project
- Benedikt Ahrens (TU Delft, Software Technology)
Supervisor(s): Benedikt Ahrens
Posted: May 12, 2026