pi-autocontext-erdos
v0.1.8
Published
Pi package for AutoContext-assisted Erdős-style formal math problem loops with Lean verification.
Maintainers
Readme
pi-autocontext-erdos
Pi package for AutoContext-assisted Erdős-style formal math problem loops with Lean verification.
Latest published package: [email protected]. Release evidence is recorded in docs/RELEASE_0.1.7.md. Main is preparing 0.1.8 with a bounded Erdős–Straus certificate-search benchmark.
This package is an orchestration layer. It does not replace Lean and does not claim to prove open problems. It helps structure a math exploration loop: problem intake, formalization, lemma planning, proof attempts, Lean-backed verification, and reporting.
Relationship to pi-autocontext-lean-verify
Use this package for the problem loop and research workflow:
pi-autocontext-erdos
= problem intake + formalization + lemma ladder + attempt logs + reportsUse pi-autocontext-lean-verify as the verifier/proof-repair backend:
pi-autocontext-lean-verify
= Lean oracle + fixed-template proof-body repair + guardrail checksInstall both when you want proof-repair automation:
pi install npm:[email protected]
pi install git:github.com/greyhaven-ai/pi-autocontext-erdosFrom a local checkout:
pi install ./pi-autocontext-erdosFor local development, from this checkout:
npm install
npm run validate
npm run pack:dry-run
pi install ./pi-autocontext-erdos --localProvided resources
- Extension tool:
autocontext_erdos - Skill:
erdos-math-loop - Prompt template:
erdos-math-loop.md - Domain notes:
docs/DOMAIN.md - Validation notes:
docs/VALIDATION.md - Lake/Mathlib verifier notes:
docs/LAKE_MATHLIB_SUPPORT.md - Mathlib benchmark evidence:
docs/MATHLIB_BENCHMARK.md - Final theorem audit guardrails:
docs/THEOREM_AUDIT.md - Lemma graph orchestration:
docs/LEMMA_GRAPH.md - Erdős–Straus parametric family benchmark:
docs/ERDOS_STRAUS_PARAMETRIC_FAMILIES.md - Erdős–Straus bounded certificate-search benchmark:
docs/ERDOS_STRAUS_BOUNDED_SEARCH.md - Example runs:
examples/erdos-ginzburg-ziv-n2examples/erdos-ginzburg-ziv-n3examples/schur-two-color-n5examples/schur-two-color-n5-lemma-graphexamples/mathlib-divisibility-primeexamples/erdos-straus-parametric-familiesexamples/erdos-straus-bounded-search
Example validation
The checked examples include the n = 2 and n = 3 finite instances of the Erdős–Ginzburg–Ziv theorem over natural numbers, plus a full-pipeline finite Schur two-color n = 5 benchmark. The n = 2 EGZ example proves:
theorem egz_two_nat (a b c : Nat) :
(a + b) % 2 = 0 ∨ (a + c) % 2 = 0 ∨ (b + c) % 2 = 0The harder EGZ n = 3 example proves that among any five natural numbers, one of the ten triples has sum divisible by 3. The AutoContext repair smoke in docs/EGZ_N3_AUTOCONTEXT_REPAIR.md deliberately breaks that proof, repairs it back to a Lean-verified candidate with autocontext_erdos, and the compact repaired proof is now the canonical example.
The Schur n = 5 example proves the finite two-color statement that every coloring of 1..5 contains a monochromatic x + y = z triple. It exercises direct Lean verification, audit_theorems, fixed-template export, and pi-autocontext-lean-verify handoff; see docs/SCHUR_N5_FULL_PIPELINE.md.
The Mathlib divisibility-prime benchmark imports Mathlib in a Lake project and proves a scoped Euclid-lemma API theorem using Nat.Prime.dvd_mul; see docs/MATHLIB_BENCHMARK.md.
The Erdős–Straus parametric-family example verifies only the elementary even-denominator family as a rational identity over ℚ; it is explicitly not a proof of the full conjecture. It exercises Lake/Mathlib verification, lemma graph frontier tracking, audit_theorems, fixed-template export, and graph result recording; see docs/ERDOS_STRAUS_PARAMETRIC_FAMILIES.md.
The Erdős–Straus bounded-search example generates exact certificates for 2 ≤ n ≤ 100, then verifies the emitted finite theorem in Lean/Mathlib. The JavaScript search is only a candidate generator; Lean remains the correctness oracle. See docs/ERDOS_STRAUS_BOUNDED_SEARCH.md.
Validate the package and examples with:
npm run validate
npm run typecheck
npm run validate:extension
npm run validate:example
npm run validate:lake-exampleAutoContext-backed tool
The package registers one consolidated Pi tool:
autocontext_erdosActions:
preflight: check Lean anduvx autoctx improveavailability.init_problem: create the run artifact layout.verify_file: run Lean on a formalization and write attempt artifacts.autocontext_improve: invokeautoctx improvewith a Lean verifier command and save prompt/rubric/logs/best candidate/verdict.export_fixed_template: freeze a named theorem into aTheorem.template.leanfixture with a{{PROOF}}hole forpi-autocontext-lean-verifyproof-body repair.audit_theorems: final-artifact audit that verifies the Lean file, runs#print axioms, checks forbidden constructs, compares imports to an optional baseline, and records environment/toolchain hashes.lemma_graph: initialize, validate, summarize, advance, and record results for dependency graphs underlemmas/lemma_graph.json.summarize_run: summarize files and attempt verdicts under a run root.
The AutoContext action uses direct Lean verification by default:
uvx --python 3.12 --from autocontext==0.5.1 autoctx improve ... --verify-cmd "<lean> {file}"For Mathlib or local Lake-project imports, pass verificationMode="lake" and projectRoot; verification and AutoContext rechecks then use lake env lean <file>. init_problem can scaffold a Mathlib-aware Lake project with withMathlib=true. See docs/LAKE_MATHLIB_SUPPORT.md.
For whole-file repair, pass preserveTheorems to require exact normalized theorem-statement preservation before a Lean-verified candidate is accepted:
{
"action": "autocontext_improve",
"preserveTheorems": ["mod_three_eq_zero_or_one_or_two", "egz_three_nat"]
}If Lean accepts a candidate but a preserved theorem statement changes, the verdict is theorem_statement_changed with Accepted: false.
For the compact EGZ n = 3 repair loop, current validation recommends timeoutSeconds=240; timeoutSeconds=120 is useful but budget-limited for this task.
When a theorem statement is ready to freeze, use export_fixed_template to hand off proof-body-only repair to pi-autocontext-lean-verify; see docs/LEAN_VERIFY_HANDOFF.md.
Before treating a final theorem as a serious result, run audit_theorems; see docs/THEOREM_AUDIT.md.
MVP workflow
A typical run creates a durable folder such as:
runs/<problem-slug>/
problem.md
source_notes.md
formalization/
Formalization.lean
formalization_notes.md
lemmas/
lemma_plan.md
attempts/
0001/
prompt.md
candidate.lean
lean.stderr.txt
verdict.md
reports/
final_report.mdEvery mathematical claim in the final report must be labeled as one of:
- Lean verified;
- formalization draft;
- conjecture;
- heuristic/proof sketch;
- failed attempt.
Guardrail posture
Never count a proof unless Lean verifies the exact theorem or unchanged template. Do not use sorry, admit, new axioms, unsafe, theorem weakening, proof-by-importing-answer, or hidden assumptions as success criteria.
