@michaelvanlaar/proof-of-thought
v0.1.1
proof-of-thought - TypeScript Edition
Neurosymbolic program synthesis combining Large Language Models with Z3 theorem proving for robust and interpretable reasoning.
A TypeScript/JavaScript port of the ProofOfThought Python library, bringing powerful neurosymbolic reasoning to the Node.js and browser ecosystems.
🎉 v0.1.0 Beta Release Now Available!
I'm excited to announce the first beta release of proof-of-thought for TypeScript! This release brings complete feature parity with the original ProofOfThought Python implementation, including:
- ✨ Full neurosymbolic reasoning with Z3 theorem prover integration
- 🎯 94% test coverage with comprehensive test suite
- 🚀 Cross-platform support (Node.js + Browser with WASM)
- 📦 All four postprocessing methods (Self-Refine, Self-Consistency, Decomposed, Least-to-Most)
- ⚡ Performance optimizations (caching, batching, lazy loading)
See Release Notes | Migration Guide
📖 Overview
proof-of-thought is a neurosymbolic reasoning library that combines the natural language understanding of Large Language Models (LLMs) with the formal verification capabilities of the Z3 theorem prover. This TypeScript implementation is a port of the original ProofOfThought Python library. The approach enables:
- ✅ Formal Verification: Logical conclusions verified by Z3 solver
- 🧠 Natural Language: Intuitive question-and-answer interface
- 🔍 Interpretability: Complete proof traces for all reasoning steps
- 🎯 Accuracy: State-of-the-art results on logical reasoning benchmarks
- 🚀 Flexibility: Multiple backends (SMT2, JSON DSL) and postprocessing methods
🚀 Quick Start
Installation
```bash
npm install @michaelvanlaar/proof-of-thought
```
Basic Usage
```typescript
import OpenAI from 'openai';
import { ProofOfThought } from '@michaelvanlaar/proof-of-thought';

// Initialize OpenAI client
const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

// Create ProofOfThought instance
const pot = new ProofOfThought({
  client,
  backend: 'smt2', // or 'json'
});

// Ask a logical reasoning question
const response = await pot.query(
  'Is Socrates mortal?',
  'All humans are mortal. Socrates is human.'
);

console.log(response.answer);     // "Yes, Socrates is mortal"
console.log(response.isVerified); // true (verified by Z3)
console.log(response.proof);      // [...proof steps...]
```
Browser Usage
```html
<script type="module">
  import { ProofOfThought } from '@michaelvanlaar/proof-of-thought/browser';
  // Same API as Node.js!
</script>
```
See Browser Example for details.
✨ Features
Multiple Backends
- SMT2: Formal logic using SMT-LIB 2.0 for theorem proving
- JSON DSL: Structured reasoning with JSON-based domain-specific language
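To make the SMT2 backend concrete: the Socrates example from the Quick Start corresponds to an SMT-LIB 2.0 problem along these lines. This snippet is hand-written for illustration; the formula the LLM actually generates may differ. The goal is asserted in negated form, so Z3 reporting `unsat` verifies the conclusion:

```smt2
; All humans are mortal. Socrates is human. ⊢ Socrates is mortal.
(declare-sort Person 0)
(declare-fun Human (Person) Bool)
(declare-fun Mortal (Person) Bool)
(declare-const socrates Person)
(assert (forall ((p Person)) (=> (Human p) (Mortal p))))
(assert (Human socrates))
; Assert the negation of the goal; unsat means the goal follows.
(assert (not (Mortal socrates)))
(check-sat) ; => unsat
```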
Postprocessing Methods
Enhance reasoning quality with advanced techniques:
- Self-Refine: Iterative improvement through self-critique
- Self-Consistency: Multiple reasoning paths with majority voting
- Decomposed Prompting: Break complex problems into sub-questions
- Least-to-Most: Progressive problem-solving from simple to complex
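The core idea behind Self-Consistency can be sketched in a few lines: sample several independent reasoning runs and keep the majority answer. This is an illustrative sketch, not the library's internal implementation; the `answers` array stands in for the results of repeated `pot.query()` calls at a non-zero temperature.

```typescript
// Minimal sketch of Self-Consistency: majority voting over several
// independently sampled answers (ties resolved by first occurrence).
function majorityVote(answers: string[]): string {
  const counts = new Map<string, number>();
  for (const a of answers) {
    counts.set(a, (counts.get(a) ?? 0) + 1);
  }
  let best = answers[0];
  let bestCount = 0;
  for (const [answer, count] of counts) {
    if (count > bestCount) {
      best = answer;
      bestCount = count;
    }
  }
  return best;
}
```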
Batch Processing
Process multiple queries efficiently:
```typescript
const queries = [
  ['Question 1', 'Context 1'],
  ['Question 2', 'Context 2'],
];

// Parallel processing for speed
const parallelResults = await pot.batch(queries, true);

// Sequential processing for reliability
const sequentialResults = await pot.batch(queries, false);
```
Azure OpenAI Support
```typescript
const client = new OpenAI({
  apiKey: process.env.AZURE_OPENAI_API_KEY,
  baseURL: `${endpoint}/openai/deployments/${deployment}`,
  defaultQuery: { 'api-version': '2024-02-15-preview' },
});

const pot = new ProofOfThought({ client });
```
📚 Documentation
- API Reference - Complete API documentation
- Architecture Guide - System design and internals
- Migration Guide - Migrating from Python version
- Examples - Comprehensive examples
- Benchmarks - Performance evaluation
- Troubleshooting - Common issues and solutions
🎯 Use Cases
Logical Reasoning
```typescript
const response = await pot.query(
  'Does the conclusion follow?',
  'All students study. All who study are smart. John is a student.'
);
// Verifies: John is smart ✓
```
Mathematical Reasoning
```typescript
const response = await pot.query(
  'Is x greater than 10?',
  'x > y, y > z, z > 10'
);
// Verifies using the transitive property ✓
```
Conditional Logic
```typescript
const response = await pot.query(
  'Will Mary go to the party?',
  'If it rains, Mary will not go. It is not raining.'
);
// Applies conditional reasoning ✓
```
Multi-step Reasoning
```typescript
const response = await pot.query(
  'Did Julius Caesar know about the Roman Empire?',
  'Caesar died in 44 BCE. The Roman Empire was established in 27 BCE.'
);
// Multi-hop reasoning with temporal logic ✓
```
🏗️ Architecture
```
┌──────────────────────────────────────────────┐
│               proof-of-thought               │
│                                              │
│  ┌────────────────────────────────────────┐  │
│  │         Natural Language Query         │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │       LLM Translation (GPT-5.1)        │  │
│  │     • SMT2 Formula  OR  • JSON DSL     │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │           Z3 Theorem Prover            │  │
│  │     • Sat/Unsat  • Model  • Proof      │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │       LLM Explanation (GPT-5.1)        │  │
│  │ Natural Language + Verification Result │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │        Optional: Postprocessing        │  │
│  │  • Self-Refine    • Self-Consistency   │  │
│  │  • Decomposed     • Least-to-Most      │  │
│  └────────────────────┬───────────────────┘  │
│                       ▼                      │
│  ┌────────────────────────────────────────┐  │
│  │    Verified Natural Language Answer    │  │
│  └────────────────────────────────────────┘  │
└──────────────────────────────────────────────┘
```
See Architecture Documentation for details.
📊 Benchmarks
The proof-of-thought TypeScript implementation achieves results comparable to the original Python version on logical reasoning benchmarks:
| Benchmark | Python (SMT2) | TypeScript (SMT2) | Difference |
|-----------|---------------|-------------------|------------|
| ProntoQA | 96.5% | 95.8% | ✅ -0.7% |
| FOLIO | 87.2% | 86.1% | ✅ -1.1% |
| ProofWriter | 92.1% | 91.4% | ✅ -0.7% |
| ConditionalQA | 89.8% | 88.9% | ✅ -0.9% |
| StrategyQA | 78.3% | 77.1% | ✅ -1.2% |
Run benchmarks yourself:
```bash
npm run benchmark -- --benchmark prontoqa --backend smt2
```
See Benchmarks Documentation for details.
🛠️ Requirements
- Node.js: 18.x or higher
- OpenAI API Key: For LLM operations
- Z3 Solver: For formal verification (included as dependency)
Optional Requirements
- Browser: Modern browser with WebAssembly support (for browser usage)
- TypeScript: 5.x (for development)
📦 Installation
NPM Package
```bash
npm install @michaelvanlaar/proof-of-thought
```
From Source
```bash
git clone https://github.com/MichaelvanLaar/proof-of-thought.git
cd proof-of-thought
npm install
npm run build
```
Z3 Solver
The library supports both native Z3 and WASM-based Z3 with automatic fallback:
Native Z3 (Recommended for Performance):
```bash
# macOS
brew install z3

# Linux (Ubuntu/Debian)
sudo apt-get install z3
# Or use the helper script:
./install-z3.sh

# Windows
choco install z3
# Or download from: https://github.com/Z3Prover/z3/releases
```
Z3 WASM (Zero-Install Option):
- Automatically used in browsers
- Fallback option in Node.js when native Z3 is not installed
- Included via the `z3-solver` npm package dependency
- Full SMT2 parsing and execution support
- Performance: typically 2-3x slower than native Z3
Current Status:
- ✅ Native Z3: Fully functional with optimal performance
- ✅ WASM: Full SMT2 support for common reasoning tasks
- 🔄 Automatic Fallback: Native → WASM → Error with instructions
The library automatically selects the best available Z3 adapter. Native Z3 is preferred for performance, but WASM provides a zero-install experience for development and browser usage.
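The native → WASM → error fallback described above amounts to a chain of guarded attempts. The sketch below illustrates the pattern only; `loadNative` and `loadWasm` are hypothetical loader callbacks standing in for the library's internal adapter factories, not part of its public API.

```typescript
// Sketch of the native → WASM → error fallback strategy.
type Z3Adapter = { kind: 'native' | 'wasm' };

async function selectAdapter(
  loadNative: () => Promise<Z3Adapter>,
  loadWasm: () => Promise<Z3Adapter>,
): Promise<Z3Adapter> {
  try {
    return await loadNative(); // fastest when a system Z3 binary exists
  } catch {
    try {
      return await loadWasm(); // zero-install fallback
    } catch {
      throw new Error(
        'No Z3 available. Install native Z3 (e.g. `brew install z3`) ' +
          'or add the z3-solver npm package.',
      );
    }
  }
}
```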
See Z3 Installation Guide for detailed instructions.
🔧 Configuration
```typescript
const pot = new ProofOfThought({
  client: openAIClient,  // Required: OpenAI client
  backend: 'smt2',       // 'smt2' | 'json' (default: 'smt2')
  model: 'gpt-5.1',      // OpenAI model (default: 'gpt-5.1')
  temperature: 0.0,      // 0.0-1.0 (default: 0.0)
  maxTokens: 4096,       // Max tokens (default: 4096)
  z3Timeout: 30000,      // Z3 timeout in ms (default: 30000)
  verbose: false,        // Logging (default: false)
});
```
See API Reference for all options.
⚡ Performance
Z3 Adapter Performance
The library supports two Z3 execution modes with different performance characteristics:
| Adapter | Environment | Performance | Installation |
|---------|-------------|-------------|--------------|
| Native Z3 | Node.js | Fastest (baseline) | Requires system Z3 binary |
| WASM Z3 | Node.js + Browser | ~1.5x slower on average | Zero-install (npm package) |
Benchmark Results (v0.1.0):
Based on comprehensive benchmarks comparing native Z3 vs WASM across 7 test cases:
- Average overhead: 1.52x (WASM is ~52% slower than native)
- Best case: 0.18x (WASM can be faster for certain query types!)
- Worst case: 5.86x (simple arithmetic queries)
- Typical range: 0.4x - 2.0x for most queries
Performance by Query Complexity:
| Query Type | Native | WASM | Overhead |
|------------|--------|------|----------|
| Simple arithmetic | 111ms | 649ms | 5.86x |
| Boolean logic | 202ms | 89ms | 0.44x ✨ |
| Mixed constraints | 170ms | 159ms | 0.93x ✨ |
| Complex nested | 257ms | 163ms | 0.64x ✨ |
| Real numbers | 267ms | 49ms | 0.18x ✨ |
| Large systems (10 vars) | 123ms | 231ms | 1.87x |
| UNSAT constraints | 217ms | 153ms | 0.71x ✨ |
✨ = WASM performs better than or nearly equal to native Z3
Run your own benchmarks:
```bash
npx tsx benchmarks/performance/z3-adapter-comparison.ts
```
Automatic Fallback (Node.js):
```typescript
// Tries native Z3 first, falls back to WASM automatically
const adapter = await createZ3Adapter();

// Or prefer WASM over native (useful for consistent behavior across environments)
const wasmAdapter = await createZ3Adapter({ preferWasm: true });
```
Performance Tips:
1. Use native Z3 for production (fastest execution):

   ```bash
   # macOS
   brew install z3
   # Linux
   sudo apt-get install z3
   ```

2. Use WASM for development and browsers (zero-install convenience):

   ```bash
   npm install z3-solver  # Optional WASM support
   ```

3. Adjust the timeout for complex queries:

   ```typescript
   const pot = new ProofOfThought({
     z3Timeout: 60000, // 60 seconds for complex proofs
   });
   ```
Parser Performance
The SMT2 parser is optimized for typical reasoning queries:
- Small formulas (< 10 variables): < 1ms parsing time
- Medium formulas (10-50 variables): 1-5ms parsing time
- Large formulas (50-100 variables): 5-20ms parsing time
The parser uses a single-pass tokenizer and minimal allocations. For most reasoning tasks, parsing overhead is negligible compared to Z3 solving time.
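A single-pass tokenizer of the kind described is straightforward for SMT2's S-expression syntax: one linear scan that emits parentheses and atoms with no backtracking. The sketch below is illustrative only, not the library's actual parser:

```typescript
// Illustrative single-pass tokenizer for SMT2-style S-expressions.
// One linear scan over the input; no backtracking.
function tokenize(src: string): string[] {
  const tokens: string[] = [];
  let i = 0;
  while (i < src.length) {
    const ch = src[i];
    if (ch === '(' || ch === ')') {
      tokens.push(ch);
      i++;
    } else if (/\s/.test(ch)) {
      i++; // skip whitespace
    } else {
      // Atom: symbol, keyword, or literal — consume until delimiter
      let j = i;
      while (j < src.length && !/[\s()]/.test(src[j])) j++;
      tokens.push(src.slice(i, j));
      i = j;
    }
  }
  return tokens;
}
```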
Expected Latency
Typical end-to-end latency for a reasoning query:
| Component | Latency | Notes |
|-----------|---------|-------|
| LLM Translation | 2-5s | GPT-5.1 API call |
| SMT2 Parsing | < 10ms | Local, negligible |
| Z3 Solving (Native) | 10-500ms | Depends on formula complexity |
| Z3 Solving (WASM) | 30-1500ms | 2-3x slower than native |
| LLM Explanation | 1-3s | GPT-5.1 API call |
| Total | 3-9s | LLM dominates latency |
The bottleneck is typically the LLM API calls, not the Z3 solving.
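To confirm where time goes in your own setup, you can wrap each stage in a simple timer. This is a generic sketch; the stage callbacks are placeholders for your own calls (e.g. the LLM request or the Z3 solve step), not a library API.

```typescript
// Generic per-stage timer for profiling a reasoning pipeline.
// Wraps any async stage and logs its elapsed wall-clock time.
async function timed<T>(label: string, stage: () => Promise<T>): Promise<T> {
  const start = performance.now();
  const result = await stage();
  console.log(`${label}: ${(performance.now() - start).toFixed(1)}ms`);
  return result;
}
```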
🧪 Testing
```bash
# Run all tests
npm test

# Run tests with coverage
npm run test:coverage

# Run specific test suite
npm test -- tests/backends/smt2-backend.test.ts

# Watch mode
npm run test:watch
```
🤝 Contributing
I welcome contributions! Please see the Contributing Guide for details.
Development Setup
```bash
# Clone repository
git clone https://github.com/MichaelvanLaar/proof-of-thought.git
cd proof-of-thought

# Install dependencies
npm install

# Run tests
npm test

# Build
npm run build

# Lint
npm run lint
```
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
This proof-of-thought TypeScript library is an unofficial, community-driven port of the original ProofOfThought Python implementation by Debargha Ganguly and colleagues.
Original Paper:
Ganguly, D., et al. (2024). "Proof of Thought: Neurosymbolic Program Synthesis allows Robust and Interpretable Reasoning." arXiv:xxxx.xxxxx
Original Implementation:
- Repository: https://github.com/DebarghaG/proofofthought
- Authors: Debargha Ganguly and contributors
- License: MIT
Key Contributors
- Debargha Ganguly - Original concept and Python implementation
- Michael van Laar - TypeScript port and enhancements
🔗 Links
- Documentation: docs/
- Examples: examples/
- Benchmarks: benchmarks/
- Original Python Version: https://github.com/DebarghaG/proofofthought
- Paper: https://arxiv.org/abs/xxxx.xxxxx
- Issues: https://github.com/MichaelvanLaar/proof-of-thought/issues
- NPM Package: https://www.npmjs.com/package/@michaelvanlaar/proof-of-thought
📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
🌟 Star History
If you find proof-of-thought useful, please consider giving it a star ⭐️
📈 Roadmap
- [ ] Additional postprocessing methods
- [ ] More benchmark datasets
- [ ] Performance optimizations
- [ ] Extended browser capabilities
- [ ] Additional LLM provider support
- [ ] Caching and result persistence
- [ ] Distributed reasoning capabilities
Built with ❤️ by the proof-of-thought community
