atp-lptp-mcp-server

v0.9.0

Published

2 days ago

MCP server for LPTP (Logic Program Theorem Prover) — proof verification, tactic application, and definition querying

0High
0Medium
0Low

thierrymarianne

mcp lptp theorem-prover prolog formal-verification

atp-lptp-mcp-server

An MCP (Model Context Protocol) server that exposes LPTP (Logic Program Theorem Prover) operations as tools — proof verification, tactic application, definition querying, ground representation compilation, and proof-file analysis. Any MCP-compatible client (Claude Code, Claude Desktop, etc.) can connect and use these tools to interactively construct and verify formal proofs of Prolog programs.

Prerequisites

Node.js >= 18
LPTP source checkout — the directory containing src/lptp.pl and the lib/ hierarchy
At least one Prolog backend on PATH:
- SWI-Prolog (swipl) — recommended, best resource-limit support
- Scryer Prolog (scryer-prolog) — ISO-compliant alternative
- GNU Prolog (bin/lptp-gnu-prolog) — pre-compiled binary with LPTP baked in
E Theorem Prover (eprover) — optional, needed for apply_eprover
Vampire (vampire) — optional, needed for apply_vampire

Installation

npm install atp-lptp-mcp-server

Or install from a local checkout:

cd etc/atp-lptp-mcp-server
npm install
npm run build

Note: The build step also copies the Prolog helper scripts (src/prolog/) into build/prolog/.

Configuration

Claude Code

Add to your .mcp.json (project or global):

{
  "mcpServers": {
    "atp-lptp-mcp-server": {
      "type": "stdio",
      "command": "npx",
      "args": ["atp-lptp-mcp-server"],
      "env": {
        "LPTP_ROOT_DIR": "/path/to/your/lptp/checkout"
      }
    }
  }
}

Environment Variables

| Variable | Required | Description | |----------|----------|-------------| | LPTP_ROOT_DIR | Yes* | Path to the LPTP checkout root (directory containing src/lptp.pl). *If the server is installed inside the LPTP tree, it can auto-detect. | | LPTP_PROLOG_BACKEND | No | Force a specific backend: swipl, scryer, or gprolog. Default: auto-detect in that order. |

Available Tools

Proof Verification

| Tool | Description | |------|-------------| | verify_lemma | Validates a snippet of LPTP proof code (lemma/3, theorem/3, etc.) by running it through SWI-Prolog. Supports injecting external context or axioms. | | check_proof | Runs LPTP's native check/1 on an existing .pr proof file on disk. Returns all verification errors or confirms the proof is valid. |

Tactics

| Tool | Description | |------|-------------| | apply_tactic | Applies any LPTP tactic (e.g. [ind], [auto(5)], completion) to a formula and returns the expanded derivation tree. | | tactic_auto | Attempts automatic proof with configurable search depth and resource limits (inference_limit, time_limit_seconds). | | tactic_case | Case splitting on disjunctions or pattern matches. | | tactic_comp | Completion formula (fixed-point semantics of program clauses). | | tactic_elim | Predicate definition elimination. | | tactic_ex | Existential quantifier instantiation. | | tactic_fact | Apply a known fact (lemma, theorem, corollary, or axiom). | | tactic_ind | Structural induction on an inductively defined predicate. | | tactic_indqf | Quantifier-free induction (induction without universally quantified hypothesis). | | tactic_tot | Totality (termination) case splitting. | | tactic_unfold | Unfold an abbreviation or definition. | | tactic_debug | Show available formulas and assumptions in the current proof context. |

Query

| Tool | Description | |------|-------------| | print_definition | Print the completion definition of a predicate — IND(P) fixed-point axioms (success, failure, termination) generated from program clauses. | | list_facts | List all known theorems, lemmas, and corollaries about a predicate. |

Compilation & Analysis

| Tool | Description | |------|-------------| | compile_gr | Compile a .pl Prolog program into its LPTP ground representation (.gr). Accepts LPTP alias paths (e.g. $(examples)/gen_99p/list_ops) or filesystem paths; omit the .pl extension. | | count_pr_directives | Count LPTP directives in a .pr proof file using read_term/3 with strict functor+arity matching. Returns counts for all 13 directive types (preamble + body declarations) as a JSON object and a markdown table. All keys are always present (zero-filled). |

Directives counted by `count_pr_directives`

| Category | Directive | JSON key | |----------|-----------|----------| | Preamble | needs_gr/1 | needs_gr | | Preamble | needs_thm/1 | needs_thm | | Preamble | compile_gr/1 | compile_gr | | Preamble | thm_file/1 | thm_file | | Preamble | tex_file/1 | tex_file | | Preamble | initialize/0 | initialize | | Preamble | bye/1 | bye | | Body | definition_pred/3 | def_pred | | Body | definition_fun/5 | def_fun | | Body | axiom/2 | axiom | | Body | lemma/3 | lemma | | Body | corollary/3 | corollary | | Body | theorem/3 | theorem |

FOF Compilation & ATP Solvers

| Tool | Description | |------|-------------| | build_fof | Compiles a _reg_atp.pr proof file to TPTP First-Order Form (.fof) axioms and conjectures via src/indp.sh (IND(P) translation). Generates one .fof file per lemma/theorem/corollary. Requires the -all logic program variant (make build). Output is suitable for Vampire and E-Prover. | | apply_eprover | Runs E Theorem Prover (--auto-schedule=8, --definitional-cnf, --delete-bad-limit) on a single .fof conjecture file. Returns PROVED / NOT PROVED and full prover output. Requires eprover on PATH. | | apply_vampire | Runs Vampire (CASC mode: --mode casc, -m 16384, --cores 7) on a single .fof conjecture file. Returns PROVED / NOT PROVED and full prover output. Requires vampire on PATH. |

Proof Synthesis

| Tool | Description | |------|-------------| | format_proof | Validates a .pr proof file via LPTP's check/1, then pretty-prints each proof term via prt__write/1 at 76-char width. Fails fast with a diagnostic if the proof is invalid — never formats an invalid proof. Returns canonical LPTP-formatted proof text. | | generate_groundness_proof | 3-phase groundness proof pipeline: (1) infer groundness invariants via abstract interpretation, (2) construct the LPTP proof, (3) certify via check/1. Returns HAL-04859406 benchmark metrics: Prog · Prop · Vars · Inf.ms · Deriv.ms · Cert.ms. | | generate_regular_approximation_proof | Regular type approximation proof synthesis via cGA_LPTP / ABS_INT_MODE=prove_regular_types. Certifies the proof via check/1. Returns Prog · Prop · Types · Inf.(μs) · Checks(ms) · Deriv.(ms) · Cert.(ms) · LOC. At most 2 calls run concurrently. | | generate_regular_approximation_proof_with_goal | Call-answer (magic-set) transformation from a top goal followed by regular type proof synthesis at increased precision (L1/L2 vs L0). Applies ABS_INT_MODE=prove_regular_types_with_goal via cGA_LPTP. Generates <stem>_ca.pl (magic-set-transformed program), <stem>_reg_ca.pl (bottom-up fixpoint), then synthesises and certifies <stem>_reg_ca_formatted.pr via check/1. Returns Prog · Goal · Prop · Types · Inf.(μs) · Checks(ms) · Deriv.(ms) · Cert.(ms) · LOC. At most 2 synthesis calls run concurrently. |

Build

| Tool | Description | |------|-------------| | build_lptp | Rebuild src/lptp.pl from source files (including bi_extended.pl) by running make test-library-with-swi-prolog. Use after editing src/bi_extended.pl to regenerate the concatenated LPTP source and verify all library proofs still pass. |

Prolog Backend Selection

The server auto-detects backends in this order:

SWI-Prolog (swipl) — full resource limits (time, inference, depth)
Scryer Prolog (scryer-prolog) — inference limits only, Node.js timeout for time
GNU Prolog (bin/lptp-gnu-prolog) — Node.js timeout only

Override with LPTP_PROLOG_BACKEND=swipl (or scryer, gprolog).

Testing

Tests are written with Jest + ts-jest and follow the RED → GREEN → REFACTOR cycle enforced by the /tdd-right-away skill.

Run all tests

cd etc/atp-lptp-mcp-server
npm test

Set LPTP_ROOT_DIR to your LPTP checkout root for tests that exercise real proof files:

LPTP_ROOT_DIR=/path/to/lptp/checkout npm test

Test suites

`grammarSections.test.ts` — Grammar unit tests (no dependencies)

Validates the GRAMMAR_SECTIONS map and helper functions (getGrammarSection, getFullGrammar). All tests run without LPTP_ROOT_DIR.

| Test group | What it covers | |---|---| | GRAMMAR_SECTIONS | 10 sections exported, all keys present, each a non-empty string | | getGrammarSection | Returns named section; undefined for unknown key | | getFullGrammar | All 10 section headers present in concatenated output | | Section content — terms | Variables (?name), compound terms, list syntax, operators | | Section content — goals | true, fail, equations, negation, conjunction, disjunction | | Section content — formulas | succeeds/fails/terminates, quantifiers, <=>, gr(term) | | Section content — internal_repr | $(name), tag expressions, @(tag,…), S/F/T functions | | Section content — derivation_steps | All 8 step types: assume, cases, exist, induction, contra, indirect | | Section content — by_tags | All 13 tags: theorem, lemma, corollary, axiom, completion, sld, … | | Section content — inference_rules | Identity, modus ponens (44.8 %), unification, inconsistency, totality | | Section content — tactics | All tactic names: auto(n), case, comp, elim, ind, indqf, tot, unfold, debug | | Section content — precedences | Full operator table (100–980, xfy, yfx, …) | | Section content — definitions | definition_pred, definition_fun, axiom, existence, uniqueness | | Integration | getFullGrammar backward-compatible with old terms section content |

`mcp_lptp_server.test.ts` — Executor integration tests

Requires LPTP_ROOT_DIR for proof-file tests; skips gracefully when files are absent.

| Test group | Key assertions | |---|---| | verifyLemma | Valid lemma → success: true, output contains 'parsed and verified successfully' | | checkProof | Returns defined string output for an existing .pr file | | applyTactic — auto(N) | success: true, derivation defined | | applyTactic — ind | derivation contains 'induction(' | | applyTactic — case | derivation contains 'by gap' | | applyTactic — elim / fact / tot / unfold / ex / comp / indqf | success: true for each | | countPrDirectives — 13 keys zero-filled | All 13 keys present; bye: 1 for :- bye(done). | | countPrDirectives — strict arity | lemma/3 counted; lemma/2 ignored | | countPrDirectives — markdown table | All 13 column headers present (ngr, lem, thm_b, …) | | countPrDirectives — missing file | Rejects with "File not found" before invoking Prolog | | AtpSemaphore | At most 2 concurrent ATP calls; 5 parallel applyEProver calls all return strings | | buildFof | Returns string output for a valid stem path | | applyEProver — no binary | Returns string output with install hint | | applyEProver — real .fof | Output contains 'PROVED' or 'NOT PROVED' | | applyVampire — no binary | Returns string output | | applyVampire — real .fof | Output contains 'PROVED' or 'NOT PROVED' | | formatProof — valid file | Non-empty string output | | formatProof — invalid proof | success: false, output contains 'proof validation failed' | | formatProof — missing file | success: false, output contains 'not found' | | generateGroundnessProof — fields | All 9 result fields present for any input | | generateGroundnessProof — apprev.pl | success: true, prop_count > 0, all timing fields ≥ 0, proof contains lemma(, valid: true | | generateGroundnessProof — missing file | success: false, output contains 'not found' | | generateRegularApproxProof — fields | All 13 RegularApproxProofResult keys present | | generateRegularApproxProof — nat.pl | success: true, all metric fields populated, proof contains lemma(, valid: true | | generateRegularApproxProof — missing file | success: false, output contains 'not found' |

`proof_repairs.test.ts` — TDD proof repair suite

Validates that all repaired .pr files pass check/1 with zero errors and zero warnings. Requires LPTP_ROOT_DIR. Run a single group to target one repair category:

# Category B — duplicate directives (6 files, 60 s timeout each)
npx jest proof_repairs -t "Category B"

# Category B+C — duplicates + predicate mismatches (3 files)
npx jest proof_repairs -t "Category B\+C"

# Category C — untransposed source predicates (8 files)
npx jest proof_repairs -t "Category C"

# Category A — definition_fun iterative repair (10 files, 120 s timeout each)
npx jest proof_repairs -t "Category A"

| Category | Files | Root cause | |---|---|---| | A — definition_fun repair | ancestry_bidirectional_lineage.pr, dna_codon_offset.pr, dna_terminal.pr, heaps_heap_priority.pr, heaps_min_path.pr, heaps_mirror_heap.pr, org_chart_reporting_level.pr, polynomial_arithmetic_poly_coefficient.pr, organizational_hierarchies_hierarchical_level.pr, supply_chain_material_priority.pr | definition_fun existence/uniqueness axiom mismatch | | B — duplicate directives | heaps.pr, heaps_reheapify.pr, dna_complement_flip.pr, org_chart.pr, org_chart_org_re.pr, org_chart_subordinate_chain.pr | Duplicate definition_pred/definition_fun entries | | B+C — duplicates + mismatches | heaps_heap_collection.pr, heaps_heapify.pr, org_chart_reorganization.pr | Both duplicate directives and untransposed predicates | | C — predicate mismatch | dna_re.pr, dna_sequence.pr, dna_recombine.pr, org_chart_sub.pr, organizational_hierarchies_rank_by_seniority.pr, polynomial_arithmetic_polynomial_term.pr, supply_chain_optimize_flow.pr, tournament_brackets_arrange.pr | Source-domain predicates (integer, =<) not transposed |

TDD workflow with `/tdd-right-away`

Each repair cycle follows RED → GREEN → REFACTOR:

RED — Run the failing test to confirm the error:

cd etc/atp-lptp-mcp-server
LPTP_ROOT_DIR=../.. npx jest proof_repairs -t "<file stem>" --no-coverage

GREEN — Edit the .pr file to remove duplicates or replace source predicates until check/1 returns o.k.
REFACTOR — Re-run the full suite to confirm no regressions:
```
LPTP_ROOT_DIR=../.. npm test
```

License

MIT — see LICENSE. The LPTP Prolog source (src/lptp.pl) has its own license (R.F. Staerk, 1996).

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

atp-lptp-mcp-server

Prerequisites

Installation

Configuration

Claude Code

Environment Variables

Available Tools

Meta

Proof Verification

Tactics

Query

Compilation & Analysis

Directives counted by count_pr_directives

FOF Compilation & ATP Solvers

Proof Synthesis

Build

Prolog Backend Selection

Testing

Run all tests

Test suites

grammarSections.test.ts — Grammar unit tests (no dependencies)

mcp_lptp_server.test.ts — Executor integration tests

proof_repairs.test.ts — TDD proof repair suite

TDD workflow with /tdd-right-away

License

Directives counted by `count_pr_directives`

`grammarSections.test.ts` — Grammar unit tests (no dependencies)

`mcp_lptp_server.test.ts` — Executor integration tests

`proof_repairs.test.ts` — TDD proof repair suite

TDD workflow with `/tdd-right-away`