@black-knight.dev/emet
v1.1.0
Zero-setup grounded web research for AI coding agents.
emet

The Zero-Setup Research Engine for Autonomous AI Agents.
emet is an advanced grounding tool designed specifically for AI coding agents. It prevents agents from hallucinating API endpoints, guessing library versions, or inventing CVE details by injecting real-time, highly authoritative, and conflict-resolved web research directly into their context window.


💡 Why emet?
The world does not need just another "AI Search Engine"—there are plenty of massive, standalone research tools out there.
Instead, emet was built to solve a specific problem in agentic workflows: when an autonomous agent is deep in a coding loop, hitting compile errors, or debugging, it needs hard facts instantly without losing focus. Calling out to heavy external search services or executing brittle Playwright scripts breaks the agent's flow, wastes context-window tokens, and leads to hallucinations.
emet solves this by embedding a lightweight, internal research loop directly in the agent harness:
- Agent-Centric Routing: It knows exactly where developers look (GitHub, NPM, NIST, arXiv).
- Authority First: It prioritizes official documentation over random SEO-optimized tutorials.
- Self-Awareness: It extracts structured features to know when it lacks information, safely triggering follow-up questions before returning an answer to the agent.
Best of all? Zero setup. No external search API keys to configure, no heavy local LLMs to run, and no flaky browser automation scripts to maintain. It's built to run silently and reliably alongside your agent.
✨ Features
- 🚀 Lightning Fast: Powered by a Hybrid Tiny-Router Architecture (Model2Vec + SVC), routing queries in < 0.6 milliseconds.
- 🛡️ Anti-Hallucination: Built-in Veto-Power for high-risk queries. If a security question only finds blog posts, the system forces a follow-up to find authoritative NIST/CVE data.
- 🕸️ Resilient Fetching: Pre-emptively escalates blocked, JS-heavy, or thin pages through an integrated, robust Python Scrapling daemon (via IPC JSON-RPC 2.0).
- 🧩 Domain Packs: Built-in heuristics for github, security, papers, package-registry, and more.
- 📊 Structured Outputs: Returns citations, code blocks, missing aspects, confidence scores, and conflict summaries (e.g., "Source A contradicts Source B").
- 📂 Local Context: Ingests local files (options.files) to ground web research in your current repository context.
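The Veto-Power rule in the feature list can be pictured as a small deterministic check. The sketch below is illustrative only: the `Source` type, the `AUTHORITATIVE_SECURITY_HOSTS` allowlist, and the function names are hypothetical and are not taken from emet's code.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

# Hypothetical allowlist for illustration; emet's real domain packs are not shown in this README.
AUTHORITATIVE_SECURITY_HOSTS = {"nvd.nist.gov", "cve.org", "www.cve.org"}


@dataclass
class Source:
    url: str


def is_authoritative(source: Source) -> bool:
    """Return True when the source's host is on the illustrative allowlist."""
    host = urlparse(source.url).hostname or ""
    return host in AUTHORITATIVE_SECURITY_HOSTS


def needs_authoritative_follow_up(is_security_query: bool, sources: list[Source]) -> bool:
    """Veto an answer to a high-risk query when no authoritative source backs it."""
    if not is_security_query:
        return False
    return not any(is_authoritative(s) for s in sources)
```

With this shape, a security query backed only by blog posts returns `True` (force a follow-up), while the same query with at least one allowlisted source returns `False`.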
📦 Installation
Pi Coding Agent (Extension)
If you are using the Pi Agent harness, the Pi install path stays the same:
pi install npm:@black-knight.dev/emet
Pi compatibility note: the public Pi extension contract is unchanged. It still exports extensions/emet.ts, registers the emet tool, and keeps the same modes (fast, deep, code, academic). The new host-profile layer only affects MCP server surfaces.
How to install the MCP server
For MCP hosts, prefer a real binary over npx. emet ships bin aliases (emet, emet-mcp), and explicit installation is the most reliable cross-host setup.
Recommended global install
npm install -g @black-knight.dev/emet
emet
Project-local install
npm install @black-knight.dev/emet
node ./node_modules/@black-knight.dev/emet/emet.js
Local dev / repo checkout
node ./mcp/server.js
The MCP server identifies itself as emet-mcp and exposes the tool emet.
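For orientation, MCP hosts reach the emet tool with JSON-RPC 2.0 `tools/call` requests over stdio. The sketch below shows roughly what such a request looks like on the wire. The argument names `query` and `mode` come from the usage section of this README; the helper name `build_tools_call` is hypothetical, and a real host performs the MCP `initialize` handshake before any tool call.

```python
import json


def build_tools_call(request_id: int, query: str, mode: str) -> str:
    """Serialize an MCP tools/call request for the emet tool as one JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "emet",  # tool name exposed by the emet-mcp server
            "arguments": {"query": query, "mode": mode},
        },
    }
    # MCP's stdio transport frames each message as a single line of JSON.
    return json.dumps(message) + "\n"
```

In normal use the host builds this message for you; the sketch is only meant to show which names end up in the request.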
Host install matrix
| Host | Install command | Config example |
| --- | --- | --- |
| Claude Code | npm install -g @black-knight.dev/emet or plugin marketplace install | configs/claude-code/mcp.json |
| Cursor | npm install -g @black-knight.dev/emet | configs/cursor/mcp.json |
| VS Code / Copilot | npm install -g @black-knight.dev/emet | configs/vscode-copilot/mcp.json |
| Codex | npm install -g @black-knight.dev/emet or plugin marketplace install | configs/codex/config.toml |
| Gemini CLI | npm install -g @black-knight.dev/emet | configs/gemini/settings.json |
Verified in real CLIs: Claude Code, Codex, and Gemini CLI were tested with temporary installs. Claude and Codex connected successfully; Gemini registered successfully and was then gated by workspace trust, which is expected behavior.
Host-specific setup
Ready-to-copy examples live in configs/.
Claude Code
MCP install
npm install -g @black-knight.dev/emet
claude mcp add emet -- emet
Project config
- Copy configs/claude-code/mcp.json to .mcp.json
Plugin / marketplace files
Marketplace install
claude plugin marketplace add https://github.com/endgegnerbert-tech/emet
claude plugin install emet@emet
Local marketplace install (repo checkout)
claude plugin marketplace add .
claude plugin install emet@emet
The Claude marketplace plugin runs the repo-bundled MCP entrypoint directly, so it does not require a separate global emet install.
Verify
claude mcp list
claude mcp get emet
For direct claude mcp add ..., expect emet with Status: ✓ Connected.
For marketplace installs, Claude prefixes the server name, so claude mcp list should show plugin:emet:emet as connected.
Local marketplace validation
claude plugin validate ./.claude-plugin/plugin.json
Repo checkout alternative
claude mcp add emet -- node ./mcp/server.js
Cursor
Install
npm install -g @black-knight.dev/emet
Copy configs/cursor/mcp.json to .cursor/mcp.json:
{
  "mcpServers": {
    "emet": {
      "command": "emet"
    }
  }
}
Verify
Restart Cursor and confirm emet appears in MCP settings/tools and exposes the emet tool.
VS Code / Copilot
Install
npm install -g @black-knight.dev/emet
Copy configs/vscode-copilot/mcp.json to .vscode/mcp.json:
{
  "servers": {
    "emet": {
      "type": "stdio",
      "command": "emet"
    }
  }
}
Verify
Restart VS Code and confirm the MCP server is available from Copilot Chat and exposes the emet tool.
Codex
MCP install
npm install -g @black-knight.dev/emet
Merge configs/codex/config.toml into ~/.codex/config.toml or your project-level Codex config:
[mcp_servers.emet]
command = "emet"
Plugin / marketplace files
- ./.codex-plugin/plugin.json
- ./.codex-plugin/mcp.json
- ./.agents/plugins/marketplace.json
- ./plugins/emet (local marketplace source path)
Marketplace install
codex plugin marketplace add https://github.com/endgegnerbert-tech/emet
codex plugin add emet@emet
Local marketplace install (repo checkout)
codex plugin marketplace add .
codex plugin add emet@emet
The Codex marketplace plugin bootstraps @black-knight.dev/emet on first run from the plugin bundle in plugins/emet, so it does not require a separate global emet install.
Verify
codex mcp list
codex mcp get emet
Expected: enabled: true and transport: stdio.
Marketplace usage
Codex reads local plugin marketplaces from .agents/plugins/marketplace.json. This repo now ships that file plus a dedicated plugins/emet bundle for local marketplace workflows and future marketplace packaging.
Gemini CLI
Install
npm install -g @black-knight.dev/emet
Merge configs/gemini/settings.json into ~/.gemini/settings.json or .gemini/settings.json:
{
  "mcpServers": {
    "emet": {
      "command": "emet"
    }
  }
}
Verify
gemini mcp list
Expected: emet should appear in the configured MCP server list. In untrusted folders Gemini may show the server as configured but disabled until the workspace is trusted.
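If a settings file already exists, merging by hand risks overwriting other entries. The following is a minimal sketch of a non-destructive merge, assuming the file is plain JSON without comments; the helper name `merge_mcp_server` is hypothetical and not part of emet.

```python
import json
from pathlib import Path


def merge_mcp_server(settings_path: Path, name: str, server: dict) -> dict:
    """Add or update one entry under "mcpServers" without touching other settings."""
    settings = {}
    if settings_path.exists():
        settings = json.loads(settings_path.read_text(encoding="utf-8"))
    # Keep any servers that are already configured; only set the named entry.
    settings.setdefault("mcpServers", {})[name] = server
    settings_path.parent.mkdir(parents=True, exist_ok=True)
    settings_path.write_text(json.dumps(settings, indent=2) + "\n", encoding="utf-8")
    return settings


# Example call for a project-level Gemini settings file:
# merge_mcp_server(Path(".gemini/settings.json"), "emet", {"command": "emet"})
```

The same approach should work for the Cursor config, which uses the same `mcpServers` key. The VS Code example uses `servers` instead, so the key would need to change there.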
🚀 Quick Start / Usage
Once installed, your agent has access to the emet tool. It accepts a query, a mode, and various options.
Modes
| Mode | Best for |
| --- | --- |
| fast | Quick factual lookups (e.g., "What is the latest LTS version of Node.js?"). Stops fetching early if authoritative sources are found. |
| deep | Broader retrieval with automatic follow-up rounds. Perfect for comparisons, conflicts, or unclear architecture questions. |
| code | Docs, repositories, README-driven answers, and retrieving actual code snippets. |
| academic | Scholarly sources, DOI links, and paper-heavy topics. |
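Before an agent issues a call, the mode can be checked against the four values in the table above. This is a rough sketch rather than emet's own validation: the helper name `build_emet_arguments` and the choice of fast as the default mode are assumptions made for illustration.

```python
from typing import Optional

# Mode names as listed in the table above.
VALID_MODES = ("fast", "deep", "code", "academic")


def build_emet_arguments(query: str, mode: str = "fast", options: Optional[dict] = None) -> dict:
    """Assemble arguments for the emet tool, rejecting unknown modes early."""
    if not query.strip():
        raise ValueError("query must not be empty")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode {mode!r}; expected one of {', '.join(VALID_MODES)}")
    arguments = {"query": query, "mode": mode}
    if options:
        # Copy so later changes to the caller's dict do not leak into the request.
        arguments["options"] = dict(options)
    return arguments
```

Failing fast on a misspelled mode keeps the agent from spending a tool call on a request that may be rejected or handled in an unintended way.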
Example Tool Calls (For Agents)
Factual Lookup:
{
  "query": "React 19 RC release notes",
  "mode": "fast",
  "options": { "requireAuthoritative": true }
}
Architecture Research:
{
  "query": "Compare PostgreSQL and MySQL for multi-tenant SaaS",
  "mode": "deep",
  "options": { "preferRecent": true, "maxTurns": 2 }
}
🧠 Under the Hood: The Agentic Router Update (v1.4.0)
With 1.4.0, emet shifted from heavy, generative JSON-planners to a Hybrid Tiny-Router Architecture.
- Model2Vec & SVC: Queries are classified via locally embedded features. Security and paper queries have a 0% downgrade rate.
- Structured ML: Instead of asking a heavy LLM "Is this enough data?", the system extracts deterministic features (has_authority, conflict_state) and uses an ultra-fast Logistic Regression model to evaluate sufficiency and follow-up actions with 100% evaluated accuracy.
- Node.js-to-Python IPC: Operates entirely locally using a highly optimized, line-delimited JSON-RPC daemon to manage Python dependencies (Scrapling, Model2Vec) without memory leaks.
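To make the Structured ML step concrete, here is a minimal sketch of logistic-regression inference over the two named features. It is not emet's model: the weights, bias, and threshold are invented for illustration, both features are assumed to be binary (0 or 1), and the function names are hypothetical.

```python
import math

# Hypothetical coefficients for illustration only; emet's trained values are not given in this README.
WEIGHTS = {"has_authority": 2.5, "conflict_state": -3.0}
BIAS = -1.0


def sufficiency_probability(features: dict) -> float:
    """Logistic regression inference: sigmoid of the weighted feature sum."""
    z = BIAS + sum(WEIGHTS[name] * float(features.get(name, 0)) for name in WEIGHTS)
    return 1.0 / (1.0 + math.exp(-z))


def next_action(features: dict, threshold: float = 0.5) -> str:
    """Return "answer" when the evidence looks sufficient, else "follow_up"."""
    return "answer" if sufficiency_probability(features) >= threshold else "follow_up"
```

Under these made-up weights, an authoritative source without conflicts yields "answer", while a detected conflict or a missing authoritative source yields "follow_up". Inference is a handful of multiplications and one exponential, which is why this kind of check is far cheaper than an LLM call.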
🛣️ Future Roadmap
We are actively working on scaling the reasoning capabilities:
- LLM Data Augmentation (Weak Supervision): Generating synthetic training data for underconfident domains to boost zero-shot accuracy to >95% without manual labeling.
- Active Learning Telemetry Loop: Clustering low-confidence predictions from cache logs into a weakly-supervised retraining pipeline to let the system "self-heal."
- Cross-Encoder for Conflict Detection: Transitioning to a fine-tuned Cross-Encoder (e.g., MiniLM + Natural Language Inference) to detect deep semantic contradiction across differing texts (e.g., recognizing that "Node 20 is stable" contradicts "Node 20 is broken").
📝 License & Notices
- License: MIT
- Third-party notices: See THIRD_PARTY_NOTICES.md
- GitHub: https://github.com/endgegnerbert-tech/emet
