How We Used Agents to Produce a 6,000-Line TLAPS Proof

We recently made progress in the formal verification of distributed protocols. Building on our prior work on automatically inferring inductive invariants, we used an agentic harness to automate the generation of TLAPS (TLA+ Proof System) proof scripts. This post focuses on that proof-generation step.

More specifically, for a protocol at the scale of MongoLoglessDynamicRaft, we now have the following end-to-end pipeline:

Automatically infer inductive invariants -> automatically write TLAPS proofs -> prove safety properties

Most importantly, the TLAPS proof phase receives no protocol-specific prompts or manually written lemmas. The agent sees the protocol, the inductive invariants, the proof goal, and general proof feedback; by repeatedly working through the open proof obligations, it eventually produces a complete proof.

The resulting proof draft contains 6,308 lines of TLAPS code, all generated by the agent.

Proof file:

MongoLoglessDynamicRaft_IndAuto_ProofDraft.tla

Background: Why This Is Difficult

Safety proofs for distributed protocols usually face two hurdles. The first is finding sufficiently strong inductive invariants. This is difficult because a protocol designer must identify conditions that hold across many states and execution paths, are strong enough to imply the safety property, and are preserved by every protocol action. The second is turning these invariants into mechanically checked TLAPS proofs, which is often the more time-consuming task: the proof must be decomposed into many fine-grained proof obligations, each supplied with the necessary assumptions, definition unfoldings, and intermediate reasoning. Historically, formally proving the safety of a realistic protocol has required substantial manual effort on both tasks. Our pipeline automates both: it first infers the inductive invariants automatically and then generates TLAPS-checkable proof scripts from them.
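To make the first hurdle concrete, here is a minimal Python sketch on a toy transition system. The system (a counter that steps by 2 modulo 8) and every name in the code are illustrative assumptions and have nothing to do with MongoLoglessDynamicRaft. It brute-forces the three conditions an inductive invariant must satisfy and shows that the safety property alone is not inductive, while a strengthened invariant is.

```python
from typing import Callable, List

# Toy system (an assumption for illustration): a counter over 0..7.
STATES = range(8)

def init(x: int) -> bool:
    return x == 0

def next_states(x: int) -> List[int]:
    # The only action: step by 2, wrapping around at 8.
    return [(x + 2) % 8]

def safety(x: int) -> bool:
    # Target property: the counter never equals 5.
    return x != 5

def strengthened(x: int) -> bool:
    # Candidate inductive invariant: the counter is always even.
    return x % 2 == 0

def is_inductive(inv: Callable[[int], bool],
                 goal: Callable[[int], bool]) -> bool:
    # Initiation: every initial state satisfies inv.
    initiation = all(inv(s) for s in STATES if init(s))
    # Consecution: inv is preserved by every step from ANY state
    # satisfying inv, reachable or not.
    consecution = all(inv(t) for s in STATES if inv(s)
                      for t in next_states(s))
    # Implication: inv is strong enough to imply the goal.
    implication = all(goal(s) for s in STATES if inv(s))
    return initiation and consecution and implication

# Safety alone fails consecution: state 3 satisfies it but steps to 5.
print(is_inductive(safety, safety))        # False
# "x is even" passes all three checks and implies x != 5.
print(is_inductive(strengthened, safety))  # True
```

A real protocol has far too many states for this kind of enumeration, which is why the invariant must be found by inference and the preservation argument must be written as a proof.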

What We Did

The target protocol in this experiment is MongoLoglessDynamicRaft. It is not a toy protocol: it includes dynamic configurations, terms, server states, and configuration propagation, making it structurally complex enough to reflect realistic distributed-protocol proofs.

We first used our prior work to automatically obtain inductive invariants for MongoLoglessDynamicRaft. We then provided the protocol, invariants, and proof goal to an agentic harness, which let the agent write the TLAPS proof automatically.

The TLAPS proof-generation process can be summarized in three steps:

Premise introduction: Given the current proof obligation, the agent identifies the protocol definitions, type constraints, action premises, and known invariants that must be introduced explicitly.
Proof-structure construction: The agent builds a proof structure for each action and clause, decomposing the inductive proof into local goals that can be advanced step by step.
Obligation discharge: The agent checks the current proof with TLAPS, reads the failure output, and adds finer-grained intermediate steps for unfinished obligations until the proof passes.
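The obligation-discharge step above is essentially a check-and-repair loop. The following Python sketch shows the control flow only, under stated assumptions: `Obligation`, `discharge`, `check`, and `refine` are hypothetical names, the checker and the agent are passed in as plain callables, and nothing here reflects the actual harness code or the TLAPS command-line interface.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

@dataclass
class Obligation:
    name: str
    # Intermediate proof steps added so far (hypothetical representation).
    steps: List[str] = field(default_factory=list)

# check(ob) returns a failure message, or None if the obligation is proved.
Checker = Callable[[Obligation], Optional[str]]
# refine(ob, failure) returns one additional intermediate step to try.
Refiner = Callable[[Obligation, str], str]

def discharge(obligations: List[Obligation], check: Checker,
              refine: Refiner, max_rounds: int = 10) -> List[str]:
    """Return the names of obligations still open after the repair budget."""
    still_open = []
    for ob in obligations:
        failure = check(ob)
        rounds = 0
        while failure is not None and rounds < max_rounds:
            # Feed the checker's failure output back and add a finer step.
            ob.steps.append(refine(ob, failure))
            rounds += 1
            failure = check(ob)
        if failure is not None:
            still_open.append(ob.name)
    return still_open

# Toy stand-ins for the checker and the agent: an obligation counts as
# "proved" once it has accumulated enough intermediate steps.
needed = {"TypeOK": 0, "Inv1_ActionA": 2, "Inv2_ActionA": 5}

def toy_check(ob: Obligation) -> Optional[str]:
    if len(ob.steps) >= needed[ob.name]:
        return None
    return f"{ob.name}: obligation failed"

def toy_refine(ob: Obligation, failure: str) -> str:
    return f"step {len(ob.steps) + 1} for {ob.name}"

obligations = [Obligation(name) for name in needed]
print(discharge(obligations, toy_check, toy_refine, max_rounds=3))
# ['Inv2_ActionA']
```

In this sketch the protocol-specific content lives entirely in the obligations and the failure messages; the loop itself is generic, which matches the claim that the harness carries no protocol knowledge.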

Importantly, this is not a hand-written proof strategy tailored to MongoLoglessDynamicRaft. The harness contains no special knowledge of the protocol, nor does it receive manual hints such as “use a protocol-specific lemma here.” The agent relies on a general proof-feedback loop to complete the proof step by step.

Results

The resulting proof file is a complete TLAPS proof draft:

MongoLoglessDynamicRaft_IndAuto_ProofDraft.tla

The file contains 6,308 lines. The proof is split into many small steps: it first establishes basic facts about the protocol's variables, configurations, and state transitions; it then proves, for each protocol action and invariant clause, that the automatically inferred inductive invariants remain true after every step; finally, it shows that these invariants imply the target safety property.
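One way to see why the file grows so large is to count the top-level proof goals: there is one preservation goal per pair of action and invariant clause. The Python sketch below enumerates such a skeleton as plain strings. The action and clause names are placeholders, `IndAuto` is assumed (from the file name) to denote the conjunction of the inferred clauses, and the initiation goals are the standard base case of an inductive argument rather than something described in this post.

```python
from itertools import product
from typing import List

def proof_skeleton(actions: List[str], clauses: List[str],
                   safety: str, inv: str = "IndAuto") -> List[str]:
    # Base case: each clause holds in every initial state.
    goals = [f"Init => {c}" for c in clauses]
    # Inductive step: one goal per (action, clause) pair.
    goals += [f"{inv} /\\ {a} => {c}'"
              for a, c in product(actions, clauses)]
    # Final step: the invariant implies the safety property.
    goals.append(f"{inv} => {safety}")
    return goals

# Placeholder names, not taken from the actual specification.
goals = proof_skeleton(["ActionA", "ActionB"],
                       ["Clause1", "Clause2", "Clause3"],
                       "Safety")
for g in goals:
    print(g)
print(len(goals))  # 3 + 2*3 + 1 = 10
```

Each of these goals typically expands into many intermediate steps once definitions are unfolded, so the line count grows much faster than the number of goals.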

The key point is not the complexity of any individual proof step, but the agent's ability to continuously generate and repair a large number of proof steps, organizing them into a complete, checkable TLAPS proof file.

What Remains Unsolved

So far, we have demonstrated the pipeline from automatic inductive-invariant inference to automatic TLAPS proof generation for a protocol at the scale of MongoLoglessDynamicRaft. Other protocols can differ in state space, action structure, invariant form, and proof style. This method still needs to be tested for stability across more protocols, especially those with more complex log structures, safety invariants, and compositional reasoning requirements.

From an engineering perspective, three practical issues remain: token consumption is still high, proof states need more complete persistence and reuse, and proof closure must become faster. Next, we plan to improve the harness scripts and the associated persistence so that the agent can continue from a structured proof state instead of rebuilding its full understanding of the context in every round, and can identify and discharge open proof obligations more quickly.
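As a rough illustration of what a structured, persistent proof state could look like, here is a minimal Python sketch that stores per-obligation status in a JSON file and lists what is still open when a new round starts. The schema, the file name, and the function names are all hypothetical; this is one possible design, not the planned implementation.

```python
import json
from pathlib import Path
from typing import Dict, List

# Hypothetical schema: obligation name -> {"status": ..., "last_error": ...}
ProofState = Dict[str, Dict[str, str]]

def save_state(path: str, state: ProofState) -> None:
    # Sorted keys keep the file stable across rounds and easy to diff.
    Path(path).write_text(json.dumps(state, indent=2, sort_keys=True))

def load_state(path: str) -> ProofState:
    # A missing file means this is the first round.
    p = Path(path)
    return json.loads(p.read_text()) if p.exists() else {}

def open_obligations(state: ProofState) -> List[str]:
    # Everything not marked "proved" still needs work.
    return sorted(name for name, info in state.items()
                  if info.get("status") != "proved")

state = {
    "Inv1_ActionA": {"status": "proved", "last_error": ""},
    "Inv2_ActionA": {"status": "open", "last_error": "obligation failed"},
}
save_state("proof_state.json", state)
print(open_obligations(load_state("proof_state.json")))
# ['Inv2_ActionA']
```

With a file like this, a new round could start from the list of open obligations and their last failure messages rather than from the full proof text.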