dar-nlp
v0.1.0
Published
Zero-dependency TF-IDF contract retrieval engine. Finds reusable code from @contract blocks.
Maintainers
Readme
dar-nlp
Zero-dependency TF-IDF contract retrieval engine. Find reusable code from @contract blocks before writing anything new.
dar-nlp find "play text to speech"[nlp-reuse] score=0.642 TTSAdapter
does: Wraps Web Speech API; play() resolves only when speech ENDS.
reuse-when: Any module needs awaitable text-to-speech.
example: await tts.play()Install
npm install -g dar-nlpOr locally via npm link for development:
git clone https://github.com/techiediaries/dar-nlp
cd dar-nlp && npm linkWorkflow
1. Init — tell dar-nlp where your source and output live:
dar-nlp init --src src --contracts contracts.jsonCreates .darnlp.json in the current directory. Run once per project.
2. Extract — scan your JS files for @contract blocks:
dar-nlp extractReads .darnlp.json for the source path. Re-run after adding or editing contracts.
3. Find — search before writing any new code:
dar-nlp find "validate a chess move"
dar-nlp find "render a board position" --top 3Routing
Results come back color-coded with a routing decision:
| Color | Routing | Meaning |
|-------|---------|---------|
| 🟢 Green | nlp-reuse | Strong match — reuse as-is |
| 🟡 Yellow | nlp-verify | Possible match — check before using |
| 🔴 Red | llm-generate | No good match — write new code, then add a @contract |
@contract format
dar-nlp reads standard JSDoc blocks tagged with @contract. The fields it indexes:
/**
* @contract
* @role adapter
* @domain audio
* @does Wraps Web Speech API; play() resolves when speech ENDS, not when it starts.
* @tags tts, speech, audio, promise, await, voice
* @reuse-when Any module needs awaitable text-to-speech without coupling to the DOM.
* @example const tts = new TTSAdapter(); await tts.play('Hello');
* @complexity simple
* @module src/audio/TTSAdapter
*/Key fields for retrieval:
| Field | Weight | Purpose |
|-------|--------|---------|
| @reuse-when | 3× | Plain-English trigger — most important |
| @does | 2× | Semantic description |
| @tags | 2× | Keyword index — comma-separated |
| @role | 1× | Coarse classification |
| @domain | 1× | Namespace filter |
@complexity drives routing: simple → reuse, moderate → verify, complex → always LLM.
API
Use programmatically if you prefer:
const { createSearcher } = require('dar-nlp');
const search = createSearcher('/abs/path/to/contracts.json');
console.log(search.contractCount); // number of indexed contracts
const results = search('render board position', 5);
// [{ contract, score, routing }, ...]contracts.json shape: { contracts: [...] } — produced by dar-nlp extract.
Enforcing the rule in AI coding tools
dar-nlp only works if find runs before code is written — not after. The tool exists; the habit is the hard part. Every major AI coding tool has a project-level instruction file that loads automatically at the start of every session. Put the rule there.
The rule (copy this into any config file)
Before writing any new function, class, or method:
1. Run: dar-nlp find "<what you need>"
2. If result is nlp-reuse (green) — reuse it. Do not write new code.
3. If no match — write the code, then immediately add a @contract block and run dar-nlp extract.
Skipping this defeats the retrieval system.Claude Code — CLAUDE.md
Claude Code loads CLAUDE.md from the project root at the start of every session.
## dar-nlp — run BEFORE writing any new code
Before writing any new function, class, or method in `src/`:
1. Run `dar-nlp find "<what you need>"` first
2. `nlp-reuse` (green) → reuse it, write nothing new
3. No match → write code, add `@contract` block, run `dar-nlp extract`
This is not optional. Skipping it defeats the retrieval system.Global rule (applies to all projects): add to ~/.claude/CLAUDE.md.
Gemini CLI / Antigravity — GEMINI.md
Gemini CLI loads GEMINI.md from the project root.
## dar-nlp contract search
Before writing any new function or module:
- Run: dar-nlp find "<description of what you need>"
- Green (nlp-reuse): reuse the matched code, skip writing
- No match: write code → add @contract block → dar-nlp extractGlobal rule: ~/.gemini/GEMINI.md
Cursor — .cursorrules
Cursor reads .cursorrules from the project root and injects it into every chat and inline edit.
Before writing any new function or class, run:
dar-nlp find "<what you need>"
If the result is nlp-reuse — use it. Do not write a duplicate.
If no match — write the code, add a @contract JSDoc block, run dar-nlp extract.
Never skip this step.GitHub Copilot — .github/copilot-instructions.md
Copilot reads this file and includes it in every suggestion context.
## Code reuse via dar-nlp
Before suggesting any new function or class, the agent should recommend running:
dar-nlp find "<description>"
If nlp-reuse is returned, suggest the existing code instead of generating new.
If no match, generate new code and include a @contract JSDoc block.Windsurf — .windsurfrules
Before writing any new function: run dar-nlp find "<what you need>".
nlp-reuse = use existing code. No match = write + add @contract + dar-nlp extract.Cline / Roo — .clinerules
# dar-nlp rule
Always run `dar-nlp find "<need>"` before implementing anything new.
Reuse nlp-reuse matches. For misses: implement + @contract + dar-nlp extract.Aider — CONVENTIONS.md
Aider reads CONVENTIONS.md (or pass --read CONVENTIONS.md).
## Contract-first rule
Before writing any function: dar-nlp find "<description>".
Reuse if nlp-reuse. Otherwise write + @contract + dar-nlp extract.OpenCode — AGENTS.md
OpenCode loads AGENTS.md from the project root.
## dar-nlp
Run dar-nlp find before implementing. Reuse nlp-reuse results.
Add @contract to any new code, then dar-nlp extract.Universal fallback — any tool that reads a system prompt file
Most AI tools support some form of project-level context. The pattern is always the same — find the file, paste the rule. If your tool isn't listed, check its docs for:
- A file loaded automatically from the project root
- A global config or "custom instructions" field
- A
--system-promptor--contextflag
Paste the rule from the top of this section into whichever mechanism your tool provides.
Why enforcement matters
dar-nlp reduces cold-start token cost only if the retrieval step actually happens. Without enforcement, the agent writes new code by default — the corpus grows, but the retrieval savings never materialise.
The config file is the enforcement mechanism. An agent that has read the rule before it starts coding will reach for dar-nlp find the same way it reaches for a linter or formatter — because the instructions say to, every time, without being reminded.
Why
Every AI-assisted codebase has a cold-start cost: the agent re-reads files it already read last session to re-learn what exists. @contract blocks + dar-nlp find replace that file-reading with a 20-token retrieval query.
Three types of token debt this eliminates:
- Orientation (
dar-nlp find+ a file map) — where do I look? - Interface (contracts.json) — what does each module take and return?
- Reuse-discovery (
@reuse-when+dar-nlp find) — does this already exist?
See Building in the Age of Agents — Article 019 for the full breakdown.
License
MIT
