deepthink-js
v1.0.4
SOTA NPM module for agentic processes using local or cloud LLMs.
Deepthink
An Ollama-powered AI reasoning engine with multi-step thinking, sandboxed code execution, deep web research, autonomous browser control, and self-verifying answer loops.
Table of Contents
- Overview
- Installation
- Quick Start
- Core API: Deepthink Class
- Research Agent
- Internet Utilities
- Electron Free Explorer
- Advanced Options Reference
- Internal Architecture
- Project Structure
- Requirements & Dependencies
- License
Overview
Deepthink wraps any Ollama-compatible model with a stack of reasoning infrastructure:
- Multi-depth thinking: up to 3 staged internal reasoning passes (analysis → planning → sanity check) before the final response
- Typed output parsing: returns string, integer, double, or boolean directly from free-form model output
- Self-verification loops: adversarial and numerical checker agents review responses and drive iterative repair
- Sandboxed code execution: generates and runs JavaScript (Node.js) and Python code in isolated subprocesses to verify numeric answers
- MCTS consensus: runs multiple algorithmic approaches in parallel and votes on the most consistent result
- 9-step deep research pipeline: query planning → crawling → credibility scoring → MMR diversity → fact verification → report writing → critique loops
- Universal URL-to-HTML extractor: fetches and converts HTML, PDF, DOCX, XLSX, PPTX, EPUB, CSV, RTF, ODT, JSON, XML, Markdown, images, and SVG
- Chrome TLS fingerprint spoofing: impit-backed axios adapter that bypasses bot-detection
- Autonomous Electron browser: a free-roaming AI agent that browses the web with human-like mouse events, multi-tab management, and reflective session summaries
- AI code project generator: cognitive planning → code generation → AST validation → sandbox execution → test oracle → iterative repair
Installation
npm install ollama axios cheerio jsdom @mozilla/readability mammoth xlsx jszip fast-xml-parser @iarna/rtf-to-html papaparse marked pdf-parse impit
For the Electron explorer example only:
npm install electron pdf-parse
For Python sandbox support (optional but recommended):
pip install sympy
Node.js ≥ 18 is required. The package uses ES Modules, so set "type": "module" in your package.json.
Quick Start
import Deepthink from './thinking/deepthink.js';
const dt = new Deepthink('cogito-2.1:671b-cloud', [], {}, Infinity);
// Simple string generation with one thinking pass
const answer = await dt.generate('What is the capital of France?');
console.log(answer); // "Paris"
// Typed integer output with 2-stage thinking and 2 verification checks
const count = await dt.generate(
'How many prime numbers are less than 50?',
'integer',
2,
2
);
console.log(count); // 15
// Streaming
await dt.generate(
'Explain the Riemann Hypothesis in plain language.',
'string',
1,
0,
(chunk, meta) => process.stdout.write(chunk)
);
Core API: Deepthink Class
Constructor
new Deepthink(model, apiKeys, clientOptions, concurrency, auditModel)

| Parameter | Type | Default | Description |
|-----------------|------------|-------------------------|-------------|
| model | string | process.env.OLLAMA_MODEL \|\| 'llama3.1' | Primary Ollama model string |
| apiKeys | string[] | [] | Bearer tokens for authenticated Ollama endpoints. Rotated automatically on failure and quarantined for 60 s after 2 consecutive errors |
| clientOptions | object | {} | Passed to the Ollama client. Supports host and headers |
| concurrency | number | Infinity | Maximum simultaneous Ollama requests |
| auditModel | string | same as model | Model used for verification checker agents |
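
The apiKeys row describes rotation on failure and a 60 s quarantine after 2 consecutive errors. The following is a minimal sketch of how that policy could be modelled, not Deepthink's internal code: the KeyRotator class, its method names, and the injectable `now` clock are all hypothetical.

```javascript
// Hypothetical sketch of key rotation with quarantine; not Deepthink's internal code.
class KeyRotator {
  constructor(keys, { maxErrors = 2, quarantineMs = 60_000, now = Date.now } = {}) {
    this.entries = keys.map((key) => ({ key, errors: 0, blockedUntil: 0 }));
    this.maxErrors = maxErrors;
    this.quarantineMs = quarantineMs;
    this.now = now;
    this.cursor = 0;
  }

  // Return the next key that is not quarantined, or null if none is usable.
  next() {
    for (let i = 0; i < this.entries.length; i++) {
      const entry = this.entries[(this.cursor + i) % this.entries.length];
      if (entry.blockedUntil <= this.now()) {
        this.cursor = (this.cursor + i + 1) % this.entries.length;
        return entry.key;
      }
    }
    return null;
  }

  // Reset the error streak after a successful request.
  reportSuccess(key) {
    const entry = this.entries.find((e) => e.key === key);
    if (entry) entry.errors = 0;
  }

  // Count a failure; quarantine the key once the streak reaches maxErrors.
  reportFailure(key) {
    const entry = this.entries.find((e) => e.key === key);
    if (!entry) return;
    entry.errors += 1;
    if (entry.errors >= this.maxErrors) {
      entry.blockedUntil = this.now() + this.quarantineMs;
      entry.errors = 0;
    }
  }
}
```

A caller would ask for `next()` before each request and report the outcome afterwards, so a failing key is skipped until its quarantine expires.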
Environment variables:
| Variable | Description |
|--------------------|-------------|
| OLLAMA_HOST | Ollama server base URL (default: http://localhost:11434) |
| OLLAMA_API_KEY | Fallback API key when none are supplied in the constructor |
| OLLAMA_MODEL | Fallback model name |
callChat()
Low-level chat method with automatic retry, streaming fallback, and API key rotation.
const result = await dt.callChat(messages, stream, onChunk, opts);
// result: { content: string, thinking: string }

| Parameter | Type | Description |
|------------|------------|-------------|
| messages | Message[] | OpenAI-style [{ role, content }] array |
| stream | boolean | Enable streaming (auto-falls-back to non-streaming on error) |
| onChunk | function | (chunk, { kind: 'content' \| 'thinking' }) => void |
| opts | object | See options reference |
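
Since onChunk receives a meta object whose kind is either 'content' or 'thinking', a caller may want to keep the two streams apart. The helper below is a small sketch of one way to do that; createChunkCollector is a hypothetical name, and treating chunks with no recognised kind as content is an assumption.

```javascript
// Hypothetical helper: builds an onChunk callback that keeps reasoning
// tokens and answer tokens in separate buffers.
function createChunkCollector() {
  const buffers = { content: '', thinking: '' };
  const onChunk = (chunk, meta = {}) => {
    // Assumption: chunks without a recognised kind are answer content.
    const kind = meta.kind === 'thinking' ? 'thinking' : 'content';
    buffers[kind] += chunk;
  };
  return { onChunk, buffers };
}

// Usage with the documented signature (needs a reachable Ollama server):
// const { onChunk, buffers } = createChunkCollector();
// await dt.callChat([{ role: 'user', content: 'Hi' }], true, onChunk, {});
// console.log(buffers.content);
```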
opts keys for callChat:
| Key | Type | Description |
|--------------|-----------|-------------|
| model | string | Override the model for this call |
| think | boolean | Enable the model's internal <think> token if supported |
| format | object | JSON schema for structured output |
| options | object | Ollama generation parameters (temperature, top_p, etc.) |
| keep_alive | string | Ollama keep-alive duration |
Retries up to 3 times with exponential backoff (500 ms → 1 s → 2 s).
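
The backoff schedule above can be sketched as a generic wrapper. This is an illustration under assumptions rather than the library's code: withRetry and its options are hypothetical, and it interprets "3 retries" as one initial attempt followed by up to three retries, which matches the three listed delays.

```javascript
// Hypothetical retry helper with exponential backoff: 500 ms, 1 s, 2 s.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function withRetry(task, { retries = 3, baseDelayMs = 500, wait = sleep } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await task(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === retries) break; // no delay after the final failure
      await wait(baseDelayMs * 2 ** attempt); // 500, 1000, 2000
    }
  }
  throw lastError;
}
```

The `wait` option exists only so the delay can be replaced in tests; production code would use the default timer.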
generate()
High-level generation with multi-stage thinking, code sandboxing, and self-verification.
const result = await dt.generate(input, type, depth, checks, onChunk, options);

| Parameter | Type | Default | Description |
|------------|--------------------------------------------|-------------|-------------|
| input | string \| Message[] \| object | — | The prompt. Accepts a plain string, a messages array, or any object |
| type | 'string' \| 'integer' \| 'double' \| 'boolean' | 'string' | Return type. The raw model text is parsed to this type automatically |
| depth | 0 \| 1 \| 2 \| 3 | 1 | Number of internal thinking stages before the final answer |
| checks | number | 0 | Number of verification checker agents (Standard + Adversarial + Numerical). Up to 3 active at once |
| onChunk | function \| null | null | Streaming callback (chunk, meta) => void |
| options | object | {} | See advanced options |
Depth levels:
| Depth | Stages |
|-------|--------|
| 0 | Direct answer — no pre-thinking |
| 1 | Analysis pass (break down the problem) |
| 2 | Analysis → Planning (formulate a strategy) |
| 3 | Analysis → Planning → Sanity Check (find flaws, edge cases) |
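
Read as data, the depth table is a prefix of one ordered stage list. The snippet below is a sketch of that reading; THINKING_STAGES and stagesForDepth are hypothetical names, not exports of the package.

```javascript
// Hypothetical mapping from the depth argument to the stage names in the table.
const THINKING_STAGES = ['Analysis', 'Planning', 'Sanity Check'];

function stagesForDepth(depth) {
  if (!Number.isInteger(depth) || depth < 0 || depth > THINKING_STAGES.length) {
    throw new RangeError(`depth must be an integer from 0 to ${THINKING_STAGES.length}`);
  }
  // Depth n runs the first n stages before the final answer.
  return THINKING_STAGES.slice(0, depth);
}
```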
Type parsing rules:
| Type | Parsing behaviour |
|-----------|-------------------|
| string | <think> blocks stripped; full text returned |
| integer | First -?\d+ match extracted and parsed |
| double | First -?\d+(\.\d+)? match extracted |
| boolean | Detects true/yes/1 vs false/no/0; handles negation (e.g. "is not true") |
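
To make the parsing rules concrete, here is a rough sketch of a parser that follows the table. It is not the code in dataTypes.js: parseTyped is a hypothetical name, returning null on failure is an assumption, stripping think blocks for every type is an assumption, and the negation handling is deliberately simplified.

```javascript
// Hypothetical parser following the table's rules; not the library's implementation.
function parseTyped(raw, type = 'string') {
  // Assumption: <think>...</think> blocks are dropped before any parsing.
  const text = raw.replace(/<think>[\s\S]*?<\/think>/g, '').trim();
  switch (type) {
    case 'integer': {
      const m = text.match(/-?\d+/);
      return m ? parseInt(m[0], 10) : null;
    }
    case 'double': {
      const m = text.match(/-?\d+(\.\d+)?/);
      return m ? parseFloat(m[0]) : null;
    }
    case 'boolean': {
      const lower = text.toLowerCase();
      const positive = /\b(true|yes|1)\b/.test(lower);
      const negative = /\b(false|no|0)\b/.test(lower);
      // Simplified negation: "not true" or "not yes" flips a positive match.
      const negated = /\bnot\s+(true|yes)\b/.test(lower);
      if (positive && !negative) return !negated;
      if (negative && !positive) return false;
      return null; // ambiguous or no signal
    }
    default:
      return text;
  }
}
```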
Code sandboxing (automatic):
When depth > 0 and opts.enableCode !== false, Deepthink detects whether the prompt requires computation. If so, it:
- Generates a mathematical specification (Mathematician agent)
- Implements the spec in both JavaScript and Python (Engineer agent)
- Runs both in isolated subprocesses with blocked dangerous APIs
- Cross-reconciles results; injects the verified answer as ground truth for the LLM
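
The cross-reconciliation step can be pictured as comparing the two sandbox outputs within a tolerance. The sketch below assumes each sandbox prints a single number; reconcileNumeric, its tolerance, and the shape of its return value are hypothetical and are not taken from codeGenerator.js.

```javascript
// Hypothetical reconciliation of JS and Python sandbox outputs.
function toNumber(output) {
  const s = String(output).trim();
  return s === '' ? NaN : Number(s); // an empty output must not count as 0
}

function reconcileNumeric(jsOutput, pyOutput, tolerance = 1e-9) {
  const js = toNumber(jsOutput);
  const py = toNumber(pyOutput);
  if (Number.isNaN(js) || Number.isNaN(py)) {
    return { agreed: false, value: null, reason: 'non-numeric output' };
  }
  // Relative comparison so large and small magnitudes are treated alike.
  const scale = Math.max(1, Math.abs(js), Math.abs(py));
  const agreed = Math.abs(js - py) <= tolerance * scale;
  return agreed
    ? { agreed: true, value: js, reason: 'outputs match' }
    : { agreed: false, value: null, reason: `mismatch: ${js} vs ${py}` };
}
```

Only an agreed value would be worth injecting as ground truth; a mismatch is a signal to fall back to the model's own reasoning or to retry.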
MCTS consensus (automatic when Python is available):
For computational tasks, runs 4 independent algorithmic approaches from different mathematical domains (dynamic programming, combinatorics, group theory, etc.) in parallel Python sandboxes and votes on the most consistent output before falling back to the single-path approach.
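
The voting step can be illustrated with a plain majority count over the approaches' outputs. This sketch is an assumption about the general idea only: voteOnResults is a hypothetical name, and the 'LOW' and 'NONE' labels are invented for the example (the options reference only mentions a threshold for HIGH confidence).

```javascript
// Hypothetical majority vote over the outputs of several approaches.
function voteOnResults(outputs, consensusThreshold = 3) {
  const counts = new Map();
  for (const out of outputs) {
    if (out === null || out === undefined) continue; // a failed approach casts no vote
    const key = String(out).trim();
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  let winner = null;
  let votes = 0;
  for (const [key, count] of counts) {
    if (count > votes) {
      winner = key;
      votes = count;
    }
  }
  if (winner === null) return { value: null, votes: 0, confidence: 'NONE' };
  return { value: winner, votes, confidence: votes >= consensusThreshold ? 'HIGH' : 'LOW' };
}
```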
Verification loop:
When checks > 0, up to 3 checker agents review the response against the original prompt and, if available, the sandboxed ground truth. Failed checks generate structured feedback. Deepthink iterates repairs up to opts.maxCheckIterations (default: 10) times. A MetacognitiveMonitor watches for response loops and frozen feedback cycles — it breaks out early and returns the best response seen so far when detected.
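
One way to picture the loop detection is a sliding window of recent responses plus a record of the best response so far. The class below is a sketch under assumptions: LoopMonitor is a hypothetical stand-in for MetacognitiveMonitor, and the numeric score (for example, the number of checks passed) is invented for illustration.

```javascript
// Hypothetical loop detector: flags a repeated response inside a sliding window.
class LoopMonitor {
  constructor(windowSize = 5) {
    this.windowSize = windowSize;
    this.history = [];
    this.best = null; // { response, score }
  }

  // Record a response and its score; return true if it repeats a recent one.
  observe(response, score) {
    const normalized = response.trim().toLowerCase().replace(/\s+/g, ' ');
    const looping = this.history.includes(normalized);
    this.history.push(normalized);
    if (this.history.length > this.windowSize) this.history.shift();
    if (this.best === null || score > this.best.score) {
      this.best = { response, score };
    }
    return looping;
  }
}
```

A repair loop would call `observe()` after each iteration and, when it returns true, stop early and return `best.response`.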
generateAndRunProject() (via codeGenerator)
Generates a complete, runnable multi-file project from a task description, validates it, tests it, and returns the source files.
import { generateAndRunProject } from './thinking/codeGenerator.js';
const result = await generateAndRunProject(dt.callChat.bind(dt), task, opts);
if (result.success) {
console.log(result.files); // { 'index.js': '...', 'utils.js': '...' }
console.log(result.buildCommands); // 'npm install'
console.log(result.runCommands); // 'node index.js'
}
Six-step pipeline:
- Cognitive Planning: AI produces a structured JSON plan covering sub-problems, architecture, edge cases, dependency risks, required files, and entry point
- Code Generation: all project files are generated in ### FILE: format in a single pass, anchored to the plan
- AST Syntax Validation + Sandbox Execution: each file is checked with node --check (or python -m py_compile); runtime errors trigger the fix loop (up to maxProjectLoops iterations, default 6)
- Test Oracle: an AI-generated test suite (using only Node.js built-ins) is run against the project
- Oracle Fix Loop: test failures drive targeted ### PATCH: or full ### FILE: repairs (up to maxOracleLoops iterations, default 3)
- Artifact Extraction: the ephemeral sandbox is wiped and clean source files are returned
| Option | Default | Description |
|---------------------|---------|-------------|
| maxProjectLoops | 6 | Max syntax/runtime fix iterations |
| maxOracleLoops | 3 | Max test oracle repair iterations |
| thinkingDepth | 2 | Cognitive planner depth (0–3) |
Research Agent
A 9-step research pipeline that plans search queries, crawls sources, scores credibility, verifies facts, and writes a structured report with APA citations.
import Deepthink from './thinking/deepthink.js';
import runDeepResearch from './thinking/researchAgent.js';
const dt = new Deepthink('cogito-2.1:671b-cloud', []);
const callChat = dt.callChat.bind(dt);
const result = await runDeepResearch(callChat, 'What are the causes of the 2008 financial crisis?', {
maxQueries: 12,
maxConcurrency: 10,
credibilityThreshold: 45,
maxSummaries: 20,
useOllamaSearch: true,
academicFilter: false,
});
console.log(result.report); // Full markdown research report
console.log(result.references); // Array of APA-formatted citations
console.log(result.claimCount); // Number of verified facts used
console.log(result.success); // boolean
Pipeline steps:
| Step | Name | Description |
|------|------|-------------|
| 0 | Answer Format Detection | Classifies what a correct answer looks like (list, comparison, analysis, etc.) and generates specific search hints |
| 1 | Query Planning | Generates layered search queries at up to 3 recursion depths |
| 2 | Parallel Web Crawling | Fetches all URLs concurrently via extractArticleText |
| 3 | Credibility Scoring | Scores each source (0–100) using TLD signals, low-credibility patterns, URL analysis, and optional academic whitelists/blacklists |
| 4 | MMR Diversity Filter | Maximal Marginal Relevance selects a diverse, high-credibility subset to avoid redundancy |
| 5 | Fact Verification Loop | Each extracted claim is verified against its source content; unverifiable or incorrectly dated claims are corrected or discarded |
| 6 | Report Writing | Synthesises verified claims into a structured markdown report with inline citations |
| 7–9 | Critique & Repair Loop | Domain expert, adversarial, source fidelity, and math/logic critic agents score the report and drive targeted repairs |
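
Step 4 relies on Maximal Marginal Relevance, which repeatedly picks the candidate maximising lambda × relevance − (1 − lambda) × (highest similarity to anything already picked). The function below is a generic sketch of that greedy selection, not the applyMMR implementation: selectByMMR and the caller-supplied relevance and similarity functions are hypothetical.

```javascript
// Hypothetical greedy MMR selection; relevance and similarity come from the caller.
function selectByMMR(items, relevance, similarity, k, lambda = 0.6) {
  const selected = [];
  const remaining = [...items];
  while (selected.length < k && remaining.length > 0) {
    let bestIndex = 0;
    let bestScore = -Infinity;
    remaining.forEach((candidate, index) => {
      // Penalise candidates that resemble something already chosen.
      const maxSim = selected.length
        ? Math.max(...selected.map((s) => similarity(candidate, s)))
        : 0;
      const score = lambda * relevance(candidate) - (1 - lambda) * maxSim;
      if (score > bestScore) {
        bestScore = score;
        bestIndex = index;
      }
    });
    selected.push(remaining.splice(bestIndex, 1)[0]);
  }
  return selected;
}
```

With lambda at 1.0 the penalty term vanishes and the result is a plain relevance ranking; lower values trade relevance for variety.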
Source filtering:
The built-in academic whitelist includes Reuters, AP, BBC, NYT, Nature, Science, arXiv, PubMed, Lancet, NEJM, OECD, World Bank, WHO, and more. The blacklist excludes Wikipedia, Reddit, Quora, and similar low-reliability sources by default. Both lists are fully configurable.
Internet Utilities
Universal Content Extractor
Fetches any URL and returns its content as an HTML string, regardless of format.
import { extractArticleText } from './internet/extractArticleText.js';
const html = await extractArticleText('https://example.com/paper.pdf');
Supported formats:
| Format | Library |
|--------|---------|
| HTML / XHTML | cheerio + @mozilla/readability + jsdom |
| PDF | pdf-parse (page-by-page <section> elements) |
| DOCX / DOC | mammoth |
| XLSX / XLS / ODS | SheetJS (xlsx) — all sheets as HTML tables |
| PPTX / PPT | jszip + fast-xml-parser — slide text extraction |
| ODT | jszip + fast-xml-parser |
| EPUB | OPF manifest spine traversal via jszip |
| CSV / TSV | papaparse — rendered as an HTML table |
| JSON | Pretty-printed in <pre> |
| XML | Raw content in <pre> |
| Markdown | marked — rendered to HTML |
| Plain text | Paragraph-split <p> elements |
| RTF | @iarna/rtf-to-html |
| SVG | Inlined directly |
| Images | Base64 data-URI embedded in <img> |
| Unknown/binary | Informational fallback with MIME type and byte count |
MIME type detection falls back to file extension sniffing when servers return application/octet-stream.
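
The extension fallback can be sketched as follows. The map is a small illustrative subset, and resolveMimeType is a hypothetical name rather than a function exported by extractArticleText.js; treating a missing Content-Type header like application/octet-stream is also an assumption.

```javascript
// Hypothetical extension map; the real extractor supports more formats.
const EXTENSION_TO_MIME = {
  pdf: 'application/pdf',
  docx: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
  xlsx: 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet',
  csv: 'text/csv',
  json: 'application/json',
  md: 'text/markdown',
  svg: 'image/svg+xml',
};

function resolveMimeType(contentTypeHeader, url) {
  // Strip parameters such as "; charset=utf-8".
  const declared = (contentTypeHeader || '').split(';')[0].trim().toLowerCase();
  if (declared && declared !== 'application/octet-stream') return declared;
  // Fall back to the file extension in the URL path (query string excluded).
  const pathname = new URL(url).pathname;
  const match = pathname.match(/\.([a-z0-9]+)$/i);
  const ext = match ? match[1].toLowerCase() : '';
  return EXTENSION_TO_MIME[ext] || 'application/octet-stream';
}
```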
Chrome-Fingerprinted Axios Adapter
Drop-in axios replacement that uses impit to mimic Chrome's TLS fingerprint, which helps requests pass most bot-detection middleware.
import axios from './internet/axios.js';
const response = await axios.get('https://example.com', {
responseType: 'arraybuffer',
timeout: 15000,
});
Supports all standard axios options: method, headers, body, timeout, redirect control, and all response types (json, text, arraybuffer, stream).
Ollama Web Search
Wraps Ollama's web search capability with a two-tier fallback and a concurrency queue.
import { getOllamaSearchResults } from './internet/ollamaSearch.js';
const results = await getOllamaSearchResults('Riemann hypothesis latest research', 5);
// results: [{ title, link, snippet, cite }, ...]
Tier 1: REST API with OLLAMA_API_KEY (set via environment variable).
Tier 2: Ollama JS client, no key required, queued to a maximum of 3 simultaneous requests.
The wrapper automatically detects whether the installed ollama package accepts the { query } object form or the plain string form, and caches the result for the session lifetime.
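
A queue capped at 3 simultaneous requests can be built from a counter and a waiting list. The limiter below is a generic sketch of that pattern, not the code in ollamaSearch.js; createLimiter is a hypothetical name.

```javascript
// Hypothetical limiter: at most `limit` tasks run at once, the rest wait in a queue.
function createLimiter(limit = 3) {
  let active = 0;
  const waiting = [];

  const runNext = () => {
    if (active >= limit || waiting.length === 0) return;
    active += 1;
    const { task, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(task)
      .then(resolve, reject)
      .finally(() => {
        active -= 1;
        runNext(); // start the next queued task, if any
      });
  };

  // Wrap a task (a function returning a value or promise) in the queue.
  return (task) =>
    new Promise((resolve, reject) => {
      waiting.push({ task, resolve, reject });
      runNext();
    });
}
```

Each search call would be wrapped as `limit(() => search(query))`, so a fourth request waits until one of the first three settles.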
Electron Free Explorer
An autonomous AI browsing agent that explores the web with human-like behaviour.
npm install pdf-parse && npx electron examples/electron_explorer.js
Press Ctrl+C to stop. A reflective journal-style session summary is then generated and saved to disk.
Features:
- Up to 3 concurrent browser tabs managed by a WindowManager; least-productive tabs pruned automatically
- Real human-like mouse movement (cubic-eased smooth paths), clicking, triple-click-and-type, scroll, and key press simulation
- Browser audio permanently muted
- Global URL deduplication and URL blacklist (auto-quarantine after 2 failures)
- PDF detection — automatically fetches and injects text as readable HTML instead of loading the binary
- Goal compression every 20 loops — the AI builds and evolves a first-person "current focus" narrative
- Topic drift detection — nudges the AI toward new territory after 8+ loops on the same hostname
- Human emotional state (curious / excited / restless / frustrated / delighted) injected into prompts based on recent failure streaks
- Blank-page detection with immediate escape
- Full action log written to log.txt; session summaries saved as summary_<timestamp>.txt
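
The cubic-eased mouse paths mentioned in the feature list can be approximated with a standard ease-in-out curve. This is a sketch only: easeInOutCubic and mousePath are hypothetical names, and the explorer's actual path generation may differ (for example by adding jitter or curved trajectories).

```javascript
// Hypothetical path generator using cubic ease-in-out between two points.
function easeInOutCubic(t) {
  return t < 0.5 ? 4 * t * t * t : 1 - Math.pow(-2 * t + 2, 3) / 2;
}

function mousePath(from, to, steps = 20) {
  const points = [];
  for (let i = 0; i <= steps; i++) {
    // Slow start, fast middle, slow finish.
    const eased = easeInOutCubic(i / steps);
    points.push({
      x: Math.round(from.x + (to.x - from.x) * eased),
      y: Math.round(from.y + (to.y - from.y) * eased),
    });
  }
  return points;
}
```

In an Electron context, each point could then be sent as a mouseMove event through webContents.sendInputEvent, with a short delay between points.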
Configuration (top of file):
const WINDOW_MODE = 'hidden'; // 'hidden' | 'inactive' | 'normal'
const GOAL = `...`; // Free-form instruction to the AI
const STRATEGY_MODEL = 'cogito-2.1:671b-cloud';
const VISION_MODEL = 'qwen3-vl:235b-instruct-cloud';
const START_URL = 'https://duckduckgo.com';
const MAX_TABS = 3;
const PDF_TIMEOUT = 12000;
Advanced Options Reference
generate() Options
These are passed as the options object (6th argument to generate, or opts in callChat).
| Option | Type | Default | Description |
|-------------------------|------------|---------|-------------|
| model | string | constructor model | Override model per-call |
| systemPrompt | string | auto-generated | Custom system prompt |
| autoSystemPrompt | boolean | true | Inject a default system prompt |
| think | boolean | false | Enable model's native <think> token |
| enableCode | boolean | true | Auto-detect and run sandboxed code |
| mcts | boolean | true | Enable MCTS multi-approach consensus |
| mctsNumApproaches | number | 4 | Algorithmic approaches to run in MCTS |
| mctsConsensusThreshold| number | 3 | Minimum agreement count for HIGH confidence |
| analytical | boolean | false | Enable multi-agent analytical decomposition mode |
| humanBrain | boolean | false | Attach a BrainMemory (working + semantic) to the generation context |
| maxCheckIterations | number | 10 | Max self-verification repair iterations |
| monitorWindowSize | number | 5 | MetacognitiveMonitor response history window |
| images | string[] | [] | Base64 image strings attached to the last user message (multimodal) |
| options | object | {} | Raw Ollama generation params (temperature, top_p, num_ctx, etc.) |
| _globalBudget | object | none | { maxLLMCalls: number } — hard cap on total LLM calls across the chain |
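
The _globalBudget option describes a hard cap on LLM calls. The same idea can be expressed outside the library as a wrapper around any callChat-style function; withCallBudget below is a hypothetical helper for illustration and says nothing about how Deepthink enforces the budget internally.

```javascript
// Hypothetical wrapper that enforces a { maxLLMCalls } style budget.
function withCallBudget(callChat, budget) {
  let used = 0;
  return async (...args) => {
    if (used >= budget.maxLLMCalls) {
      throw new Error(`LLM call budget exhausted (${budget.maxLLMCalls} calls)`);
    }
    used += 1; // count the call before it is made, so failures also consume budget
    return callChat(...args);
  };
}

// Usage sketch: cap a research run at 50 model calls.
// const callChat = withCallBudget(dt.callChat.bind(dt), { maxLLMCalls: 50 });
```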
researchAgent Options
Passed as the third argument to runDeepResearch(callChat, topic, opts).
| Option | Type | Default | Description |
|--------------------------|------------|---------|-------------|
| maxQueries | number | 12 | Total search queries to plan |
| maxConcurrency | number | 10 | Parallel URL fetch workers |
| credibilityThreshold | number | 45 | Minimum score (0–100) to include a source |
| maxSummaries | number | 20 | Max sources after MMR diversity filter |
| diversityLambda | number | 0.6 | MMR trade-off: 1.0 = pure relevance, 0.0 = pure diversity |
| chunkSize | number | 20 | Claims per report-writing chunk |
| useOllamaSearch | boolean | false | Use Ollama web search (requires key or JS client). Falls back to SearXNG via Mullvad when false |
| academicFilter | boolean | false | Restrict sources to trusted academic/news domains |
| academicWhitelist | string[] | built-in| Additional trusted domains to add (or replace) |
| academicBlacklist | string[] | built-in| Additional domains to block (or replace) |
| academicWhitelistMode | 'extend' \| 'replace' | 'extend' | Whether to extend or replace the default whitelist |
| academicBlacklistMode | 'extend' \| 'replace' | 'extend' | Whether to extend or replace the default blacklist |
| credNegativePatterns | RegExp[] | built-in| URL patterns that reduce credibility score |
| enableCritique | boolean | true | Run critique-and-repair loop (steps 7–9) |
| recursionDepth | number | 2 | Query tree depth (more = broader coverage, more API calls) |
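
The 'extend' and 'replace' modes for the whitelist and blacklist amount to a choice between merging with the defaults and discarding them. A minimal sketch of that behaviour follows; resolveDomainList is a hypothetical helper, and the normalisation (lower-casing, stripping a leading www.) is an assumption.

```javascript
// Hypothetical merge of a user-supplied domain list with built-in defaults.
function resolveDomainList(defaults, custom = [], mode = 'extend') {
  if (mode !== 'extend' && mode !== 'replace') {
    throw new TypeError(`unknown mode: ${mode}`);
  }
  const normalize = (d) => d.trim().toLowerCase().replace(/^www\./, '');
  // 'replace' ignores the defaults; 'extend' keeps them and appends the custom entries.
  const base = mode === 'replace' ? [] : defaults;
  return [...new Set([...base, ...custom].map(normalize))];
}
```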
Internal Architecture
Deepthink.generate()
│
├── runThink() ← Multi-stage pre-thinking (think.js)
│ ├── depth ≥ 1 → Analysis pass
│ ├── depth ≥ 2 → Planning pass
│ └── depth ≥ 3 → Sanity-check pass
│
├── detectComputeNeeds() ← Decide: none / single / parallel computation
│
├── generateAndRunCode() ← codeGenerator.js
│ ├── runMCTSApproaches() ← 4 parallel Python sandboxes, consensus vote
│ ├── mathematicianAgent() ← Formal spec (no code)
│ ├── engineerAgent() × 2 ← JS + Python implementations
│ ├── runJSSandbox()
│ ├── runPythonSandbox()
│ └── reconcileResults()
│
├── callChat() ← Final answer with ground truth injected
│
└── runChecks() × N ← Self-verification loop
    ├── Standard checker
    ├── Adversarial checker
    ├── Numerical checker
    └── MetacognitiveMonitor ← Break loops, return best response

runDeepResearch()
│
├── Step 0 detectAnswerFormat()
├── Step 1 plannerAgent() ← Hierarchical query tree
├── Step 2 crawlerAgent() ← Parallel fetch via extractArticleText
├── Step 3 verificationAgent() ← Credibility scoring (scoreCredibility)
├── Step 4 extractWithFallback() ← Summarisation + MMR diversity (applyMMR)
├── Step 5 factVerificationLoop() ← Claim-level fact checking
├── Step 6 reportWriterAgent() ← Chunked report generation + APA citations
└── Steps 7–9 critiqueAndRepairLoop()
    ├── sourceFidelityVerifier
    ├── mathLogicVerifier
    ├── domainExpertCritic
    ├── adversarialCritic
    └── constrainedRepairAgent

Project Structure
thinking/
├── deepthink.js Main Deepthink class (generate, callChat, verification)
├── researchAgent.js 9-step deep research pipeline
├── codeGenerator.js Project generation, MCTS, sandboxes, test oracle
├── analytical.js Multi-agent analytical decomposition mode
├── think.js Multi-stage pre-thinking passes
└── dataTypes.js Type parsing, message normalisation utilities
internet/
├── extractArticleText.js Universal URL → HTML extractor (20+ formats)
├── axios.js Chrome TLS-spoofing axios adapter (impit)
├── ollamaSearch.js Ollama web search (REST + JS client fallback)
├── interactWithInternet.js Search + fetch orchestration layer
└── extractCitation.js APA citation generator
examples/
├── electron_explorer.js Autonomous AI browser agent (Electron)
└── research.js Standalone deep research usage example

Requirements & Dependencies
| Dependency | Purpose |
|---|---|
| ollama | Ollama JS client |
| axios | HTTP client (wrapped by the impit adapter) |
| impit | Chrome TLS fingerprint for bot bypass |
| cheerio | HTML parsing and cleaning |
| jsdom | DOM for Readability |
| @mozilla/readability | Article extraction from HTML |
| mammoth | DOCX → HTML conversion |
| xlsx | Spreadsheet (XLSX/XLS/ODS) parsing |
| jszip | PPTX, ODT, EPUB unpacking |
| fast-xml-parser | XML/PPTX slide text extraction |
| @iarna/rtf-to-html | RTF → HTML |
| papaparse | CSV/TSV parsing |
| marked | Markdown → HTML |
| pdf-parse | PDF text extraction |
| electron | (examples/electron_explorer.js only) |
| python3 + sympy | Python sandbox and symbolic math verification (optional) |
Node.js ≥ 18 is required for native fetch, ReadableStream, and top-level await.
License
MIT
