@michaelvanlaar/proof-of-thought
v0.1.1
proof-of-thought - TypeScript Edition
Neurosymbolic program synthesis combining Large Language Models with Z3 theorem proving for robust and interpretable reasoning.
A TypeScript/JavaScript port of the ProofOfThought Python library, bringing powerful neurosymbolic reasoning to the Node.js and browser ecosystems.
🎉 v0.1.0 Beta Release Now Available!
I'm excited to announce the first beta release of proof-of-thought for TypeScript! This release brings complete feature parity with the original ProofOfThought Python implementation, including:
- ✨ Full neurosymbolic reasoning with Z3 theorem prover integration
- 🎯 94% test coverage with comprehensive test suite
- 🚀 Cross-platform support (Node.js + Browser with WASM)
- 📦 All four postprocessing methods (Self-Refine, Self-Consistency, Decomposed, Least-to-Most)
- ⚡ Performance optimizations (caching, batching, lazy loading)
See Release Notes | Migration Guide
📖 Overview
proof-of-thought is a neurosymbolic reasoning library that combines the natural language understanding of Large Language Models (LLMs) with the formal verification capabilities of the Z3 theorem prover. This TypeScript implementation is a port of the original ProofOfThought Python library. The approach enables:
- ✅ Formal Verification: Logical conclusions verified by Z3 solver
- 🧠 Natural Language: Intuitive question-and-answer interface
- 🔍 Interpretability: Complete proof traces for all reasoning steps
- 🎯 Accuracy: State-of-the-art results on logical reasoning benchmarks
- 🚀 Flexibility: Multiple backends (SMT2, JSON DSL) and postprocessing methods
🚀 Quick Start
Installation
```bash
npm install @michaelvanlaar/proof-of-thought
```
Basic Usage
```typescript
import OpenAI from 'openai';
import { ProofOfThought } from '@michaelvanlaar/proof-of-thought';

// Initialize OpenAI client
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Create ProofOfThought instance
const pot = new ProofOfThought({
  client,
  backend: 'smt2', // or 'json'
});

// Ask a logical reasoning question
const response = await pot.query(
  'Is Socrates mortal?',
  'All humans are mortal. Socrates is human.'
);

console.log(response.answer);     // "Yes, Socrates is mortal"
console.log(response.isVerified); // true (verified by Z3)
console.log(response.proof);      // [...proof steps...]
```
Browser Usage
```html
<script type="module">
  import { ProofOfThought } from '@michaelvanlaar/proof-of-thought/browser';
  // Same API as Node.js!
</script>
```
See Browser Example for details.
✨ Features
Multiple Backends
- SMT2: Formal logic using SMT-LIB 2.0 for theorem proving
- JSON DSL: Structured reasoning with JSON-based domain-specific language
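To make the SMT2 backend concrete: the Socrates example from the Quick Start corresponds to an SMT-LIB 2.0 problem along these lines. This snippet is hand-written for illustration; the formula the LLM actually generates may differ. The goal is asserted in negated form, so Z3 reporting `unsat` verifies the conclusion:

```smt2
; All humans are mortal. Socrates is human. ⊢ Socrates is mortal.
(declare-sort Person 0)
(declare-fun Human (Person) Bool)
(declare-fun Mortal (Person) Bool)
(declare-const socrates Person)
(assert (forall ((p Person)) (=> (Human p) (Mortal p))))
(assert (Human socrates))
; Assert the negation of the goal; unsat means the goal follows.
(assert (not (Mortal socrates)))
(check-sat) ; => unsat
```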
Postprocessing Methods
Enhance reasoning quality with advanced techniques:
- Self-Refine: Iterative improvement through self-critique
- Self-Consistency: Multiple reasoning paths with majority voting
- Decomposed Prompting: Break complex problems into sub-questions
- Least-to-Most: Progressive problem-solving from simple to complex
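The core idea behind Self-Consistency can be sketched in a few lines: sample several independent reasoning runs and keep the majority answer. This is an illustrative sketch, not the library's internal implementation; the `answers` array stands in for the results of repeated `pot.query()` calls at a non-zero temperature.

```typescript
// Minimal sketch of Self-Consistency: majority voting over several
// independently sampled answers (ties resolved by first occurrence).
function majorityVote(answers: string[]): string {
  const counts = new Map<string, number>();
  for (const a of answers) {
    counts.set(a, (counts.get(a) ?? 0) + 1);
  }
  let best = answers[0];
  let bestCount = 0;
  for (const [answer, count] of counts) {
    if (count > bestCount) {
      best = answer;
      bestCount = count;
    }
  }
  return best;
}
```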
Batch Processing
Process multiple queries efficiently:
```typescript
const queries = [
  ['Question 1', 'Context 1'],
  ['Question 2', 'Context 2'],
];

// Parallel processing for speed
const parallelResults = await pot.batch(queries, true);

// Sequential processing for reliability
const sequentialResults = await pot.batch(queries, false);
```
Azure OpenAI Support
```typescript
const client = new OpenAI({
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  baseURL: `${endpoint}/openai/deployments/${deployment}`,
  defaultQuery: { 'api-version': '2024-02-15-preview' },
});

const pot = new ProofOfThought({ client });
```
📚 Documentation
- API Reference - Complete API documentation
- Architecture Guide - System design and internals
- Migration Guide - Migrating from Python version
- Examples - Comprehensive examples
- Benchmarks - Performance evaluation
- Troubleshooting - Common issues and solutions
🎯 Use Cases
Logical Reasoning
```typescript
const response = await pot.query(
  'Does the conclusion follow?',
  'All students study. All who study are smart. John is a student.'
);
// Verifies: John is smart ✓
```
Mathematical Reasoning
```typescript
const response = await pot.query(
  'Is x greater than 10?',
  'x > y, y > z, z > 10'
);
// Verifies using the transitive property ✓
```
Conditional Logic
```typescript
const response = await pot.query(
  'Will Mary go to the party?',
  'If it rains, Mary will not go. It is not raining.'
);
// Applies conditional reasoning ✓
```
Multi-step Reasoning
```typescript
const response = await pot.query(
  'Did Julius Caesar know about the Roman Empire?',
  'Caesar died in 44 BCE. The Roman Empire was established in 27 BCE.'
);
// Multi-hop reasoning with temporal logic ✓
```
🏗️ Architecture
```
┌──────────────────────────────────────────────┐
│               proof-of-thought               │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │         Natural Language Query         │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │       LLM Translation (GPT-5.1)        │  │
│  │     • SMT2 Formula  OR  • JSON DSL     │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │           Z3 Theorem Prover            │  │
│  │     • Sat/Unsat  • Model  • Proof      │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │       LLM Explanation (GPT-5.1)        │  │
│  │ Natural Language + Verification Result │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │        Optional: Postprocessing        │  │
│  │  • Self-Refine    • Self-Consistency   │  │
│  │  • Decomposed     • Least-to-Most      │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │    Verified Natural Language Answer    │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘
```
See Architecture Documentation for details.
📊 Benchmarks
The proof-of-thought TypeScript implementation achieves results comparable to the original Python version on logical reasoning benchmarks:
| Benchmark | Python (SMT2) | TypeScript (SMT2) | Difference |
|-----------|---------------|-------------------|------------|
| ProntoQA | 96.5% | 95.8% | ✅ -0.7% |
| FOLIO | 87.2% | 86.1% | ✅ -1.1% |
| ProofWriter | 92.1% | 91.4% | ✅ -0.7% |
| ConditionalQA | 89.8% | 88.9% | ✅ -0.9% |
| StrategyQA | 78.3% | 77.1% | ✅ -1.2% |
Run benchmarks yourself:
```bash
npm run benchmark -- --benchmark prontoqa --backend smt2
```
See Benchmarks Documentation for details.
🛠️ Requirements
- Node.js: 18.x or higher
- OpenAI API Key: For LLM operations
- Z3 Solver: For formal verification (included as dependency)
Optional Requirements
- Browser: Modern browser with WebAssembly support (for browser usage)
- TypeScript: 5.x (for development)
📦 Installation
NPM Package
```bash
npm install @michaelvanlaar/proof-of-thought
```
From Source
```bash
git clone https://github.com/MichaelvanLaar/proof-of-thought.git
cd proof-of-thought
npm install
npm run build
```
Z3 Solver
The library supports both native Z3 and WASM-based Z3 with automatic fallback:
Native Z3 (Recommended for Performance):
```bash
# macOS
brew install z3

# Linux (Ubuntu/Debian)
sudo apt-get install z3
# Or use the helper script:
./install-z3.sh

# Windows
choco install z3
# Or download from: https://github.com/Z3Prover/z3/releases
```
Z3 WASM (Zero-Install Option):
- Automatically used in browsers
- Fallback option in Node.js when native Z3 is not installed
- Included via the `z3-solver` npm package dependency
- Full SMT2 parsing and execution support
- Performance: typically 2-3x slower than native Z3
Current Status:
- ✅ Native Z3: Fully functional with optimal performance
- ✅ WASM: Full SMT2 support for common reasoning tasks
- 🔄 Automatic Fallback: Native → WASM → Error with instructions
The library automatically selects the best available Z3 adapter. Native Z3 is preferred for performance, but WASM provides a zero-install experience for development and browser usage.
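The native → WASM → error fallback described above amounts to a chain of guarded attempts. The sketch below illustrates the pattern only; `loadNative` and `loadWasm` are hypothetical loader callbacks standing in for the library's internal adapter factories, not part of its public API.

```typescript
// Sketch of the native → WASM → error fallback strategy.
type Z3Adapter = { kind: 'native' | 'wasm' };

async function selectAdapter(
  loadNative: () => Promise<Z3Adapter>,
  loadWasm: () => Promise<Z3Adapter>,
): Promise<Z3Adapter> {
  try {
    return await loadNative(); // fastest when a system Z3 binary exists
  } catch {
    try {
      return await loadWasm(); // zero-install fallback
    } catch {
      throw new Error(
        'No Z3 available. Install native Z3 (e.g. `brew install z3`) ' +
          'or add the z3-solver npm package.',
      );
    }
  }
}
```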
See Z3 Installation Guide for detailed instructions.
🔧 Configuration
```typescript
const pot = new ProofOfThought({
  client: openAIClient,  // Required: OpenAI client
  backend: 'smt2',       // 'smt2' | 'json' (default: 'smt2')
  model: 'gpt-5.1',      // OpenAI model (default: 'gpt-5.1')
  temperature: 0.0,      // 0.0-1.0 (default: 0.0)
  maxTokens: 4096,       // Max tokens (default: 4096)
  z3Timeout: 30000,      // Z3 timeout in ms (default: 30000)
  verbose: false,        // Logging (default: false)
});
```
See API Reference for all options.
⚡ Performance
Z3 Adapter Performance
The library supports two Z3 execution modes with different performance characteristics:
| Adapter | Environment | Performance | Installation |
|---------|-------------|-------------|--------------|
| Native Z3 | Node.js | Fastest (baseline) | Requires system Z3 binary |
| WASM Z3 | Node.js + Browser | ~1.5x slower on average | Zero-install (npm package) |
Benchmark Results (v0.1.0):
Based on comprehensive benchmarks comparing native Z3 vs WASM across 7 test cases:
- Average overhead: 1.52x (WASM is ~52% slower than native)
- Best case: 0.18x (WASM can be faster for certain query types!)
- Worst case: 5.86x (simple arithmetic queries)
- Typical range: 0.4x - 2.0x for most queries
Performance by Query Complexity:
| Query Type | Native | WASM | Overhead |
|------------|--------|------|----------|
| Simple arithmetic | 111ms | 649ms | 5.86x |
| Boolean logic | 202ms | 89ms | 0.44x ✨ |
| Mixed constraints | 170ms | 159ms | 0.93x ✨ |
| Complex nested | 257ms | 163ms | 0.64x ✨ |
| Real numbers | 267ms | 49ms | 0.18x ✨ |
| Large systems (10 vars) | 123ms | 231ms | 1.87x |
| UNSAT constraints | 217ms | 153ms | 0.71x ✨ |
✨ = WASM performs better than or nearly equal to native Z3
Run your own benchmarks:
```bash
npx tsx benchmarks/performance/z3-adapter-comparison.ts
```
Automatic Fallback (Node.js):
```typescript
// Tries native Z3 first, falls back to WASM automatically
const adapter = await createZ3Adapter();

// Or prefer WASM over native (useful for consistent behavior across environments)
const wasmAdapter = await createZ3Adapter({ preferWasm: true });
```
Performance Tips:
1. Use native Z3 for production (fastest execution):

   ```bash
   # macOS
   brew install z3
   # Linux
   sudo apt-get install z3
   ```

2. Use WASM for development and browsers (zero-install convenience):

   ```bash
   npm install z3-solver  # Optional WASM support
   ```

3. Adjust the timeout for complex queries:

   ```typescript
   const pot = new ProofOfThought({
     z3Timeout: 60000, // 60 seconds for complex proofs
   });
   ```
Parser Performance
The SMT2 parser is optimized for typical reasoning queries:
- Small formulas (< 10 variables): < 1ms parsing time
- Medium formulas (10-50 variables): 1-5ms parsing time
- Large formulas (50-100 variables): 5-20ms parsing time
The parser uses a single-pass tokenizer and minimal allocations. For most reasoning tasks, parsing overhead is negligible compared to Z3 solving time.
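A single-pass tokenizer of the kind described is straightforward for SMT2's S-expression syntax: one linear scan that emits parentheses and atoms with no backtracking. The sketch below is illustrative only, not the library's actual parser:

```typescript
// Illustrative single-pass tokenizer for SMT2-style S-expressions.
// One linear scan over the input; no backtracking.
function tokenize(src: string): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (ch === '(' || ch === ')') {
      tokens.push(ch);
      i++;
    } else if (/\s/.test(ch)) {
      i++; // skip whitespace
    } else {
      // Atom: symbol, keyword, or literal — consume until delimiter
      let j = i;
      while (j < src.length && !/[\s()]/.test(src[j])) j++;
      tokens.push(src.slice(i, j));
      i = j;
    }
  }
  return tokens;
}
```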
Expected Latency
Typical end-to-end latency for a reasoning query:
| Component | Latency | Notes |
|-----------|---------|-------|
| LLM Translation | 2-5s | GPT-5.1 API call |
| SMT2 Parsing | < 10ms | Local, negligible |
| Z3 Solving (Native) | 10-500ms | Depends on formula complexity |
| Z3 Solving (WASM) | 30-1500ms | 2-3x slower than native |
| LLM Explanation | 1-3s | GPT-5.1 API call |
| Total | 3-9s | LLM dominates latency |
The bottleneck is typically the LLM API calls, not the Z3 solving.
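To confirm where time goes in your own setup, you can wrap each stage in a simple timer. This is a generic sketch; the stage callbacks are placeholders for your own calls (e.g. the LLM request or the Z3 solve step), not a library API.

```typescript
// Generic per-stage timer for profiling a reasoning pipeline.
// Wraps any async stage and logs its elapsed wall-clock time.
async function timed<T>(label: string, stage: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result = await stage();
  console.log(`${label}: ${(performance.now() - start).toFixed(1)}ms`);
  return result;
}
```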
🧪 Testing
```bash
# Run all tests
npm test

# Run tests with coverage
npm run test:coverage

# Run specific test suite
npm test -- tests/backends/smt2-backend.test.ts

# Watch mode
npm run test:watch
```
🤝 Contributing
I welcome contributions! Please see the Contributing Guide for details.
Development Setup
```bash
# Clone repository
git clone https://github.com/MichaelvanLaar/proof-of-thought.git
cd proof-of-thought

# Install dependencies
npm install

# Run tests
npm test

# Build
npm run build

# Lint
npm run lint
```
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
This proof-of-thought TypeScript library is an unofficial, community-driven port of the original ProofOfThought Python implementation by Debargha Ganguly and colleagues.
Original Paper:
Ganguly, D., et al. (2024). "Proof of Thought: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning." arXiv:xxxx.xxxxx
Original Implementation:
- Repository: https://github.com/DebarghaG/proofofthought
- Authors: Debargha Ganguly and contributors
- License: MIT
Key Contributors
- Debargha Ganguly - Original concept and Python implementation
- Michael van Laar - TypeScript port and enhancements
🔗 Links
- Documentation: docs/
- Examples: examples/
- Benchmarks: benchmarks/
- Original Python Version: https://github.com/DebarghaG/proofofthought
- Paper: https://arxiv.org/abs/xxxx.xxxxx
- Issues: https://github.com/MichaelvanLaar/proof-of-thought/issues
- NPM Package: https://www.npmjs.com/package/@michaelvanlaar/proof-of-thought
📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
🌟 Star History
If you find proof-of-thought useful, please consider giving it a star ⭐️
📈 Roadmap
- [ ] Additional postprocessing methods
- [ ] More benchmark datasets
- [ ] Performance optimizations
- [ ] Extended browser capabilities
- [ ] Additional LLM provider support
- [ ] Caching and result persistence
- [ ] Distributed reasoning capabilities
Built with ❤️ by the proof-of-thought community
