js-doc-store-headless
v1.2.0
Headless CMS API powered by js-doc-store. Zero-dependency document database with MongoDB-style queries, JWT auth, dynamic schema generation, and portable deploy packages.
Features
- Zero-dependency core: The database engine itself has zero npm dependencies
- Dynamic schemas: Define collections, fields, types, indexes, and relationships on the fly via MCP or REST
- REST API: Full CRUD endpoints generated automatically from schemas
- JWT Authentication: Built-in auth with register, login, logout, role-based access
- MCP Server: A Model Context Protocol (MCP) server for Codex/Claude integration
- Portable deploy: Export schemas + data to a JSON file, deploy anywhere
- Auto-import: Production server auto-loads exported schemas on startup
- MongoDB-style queries: $eq, $gt, $gte, $lt, $lte, $in, $regex, $and, $or, etc.
- Relationships: Reference fields (ref) between collections with automatic linking
- Validation: Schema-enforced required fields and ref integrity (REST API). Unique constraints via indexes.
Workflow: Local Development -> Production Deploy
1. Local Development (with MCP)
Use the MCP Schema Designer to define your CMS architecture:
# Start MCP server
npx js-doc-store-headless mcp
# Then in Codex/Claude, the LLM can:
# 1. schema_define -> design your blog_cms, task_manager, ecom_store...
# 2. schema_instantiate -> create real collections
# 3. schema_insert / schema_seed -> add content
# 4. schema_query -> test queries
2. Export Your Schema
npx js-doc-store-headless export --schema blog_cms --output ./exports
This creates exports/blog_cms.export.json containing:
- Full schema definition (collections, fields, indexes, relationships)
- All documents (posts, authors, categories, etc.)
- Metadata (version, export timestamp)
3. Deploy to Production
npx js-doc-store-headless deploy --schema blog_cms --output ./deploy
This generates a portable deployment package:
deploy/
  server.js               # Production API server (auto-imports exports/)
  js-doc-store.js         # Database engine
  schema-portable.js      # Import logic
  exports/
    blog_cms.export.json  # Schema + data
  package.json
  .env.example
  README.md
4. Run Production API
cd deploy
node server.js
The server automatically imports the schema and data from exports/ on startup. The API is now live at http://localhost:3000.
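The auto-import step can be pictured as a directory scan. The sketch below is not the code from the generated server.js; it assumes only the `<schema>.export.json` file naming and the `name` and `collections` fields shown in this README, and `loadExports` is a hypothetical helper.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: read every *.export.json file in a directory and
// return a Map keyed by schema name. Illustration only, not the shipped
// import logic from schema-portable.js.
function loadExports(exportDir) {
  const schemas = new Map();
  if (!fs.existsSync(exportDir)) return schemas; // nothing to import

  for (const file of fs.readdirSync(exportDir).sort()) {
    if (!file.endsWith('.export.json')) continue; // skip unrelated files
    const parsed = JSON.parse(fs.readFileSync(path.join(exportDir, file), 'utf8'));
    schemas.set(parsed.name, parsed);
  }
  return schemas;
}

// EXPORT_DIR is one of the environment variables listed under Authentication.
const loaded = loadExports(process.env.EXPORT_DIR || './exports');
for (const [name, schema] of loaded) {
  console.log(`${name}: ${schema.collections.length} collection(s)`);
}
```

Returning an empty Map for a missing directory lets a server start without any exports, which is one reasonable choice; failing fast would be another.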
5. Connect Frontend
// List all posts
fetch('http://localhost:3000/api/blog_cms/posts')
  .then(r => r.json())
  .then(data => console.log(data.docs));
// Filter posts
fetch('http://localhost:3000/api/blog_cms/posts?published=true&__sort={"createdAt":-1}')
// Create post (a protected endpoint: send a Bearer token when the API runs with JWT auth)
fetch('http://localhost:3000/api/blog_cms/posts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ title: 'Hello', slug: 'hello', content: 'World!' })
});
LLM Wiki (Karpathy Pattern)
Persistent, compounding knowledge base maintained by an LLM curator. Instead of rediscovering knowledge on every question via RAG, the wiki accumulates structured understanding across sessions.
Concepts
| Karpathy Pattern | js-doc-store Implementation |
|---|---|
| Raw documents (ground truth) | sources collection with immutable content |
| Markdown pages (middle tier) | pages collection with summary + content |
| index.md (catalog) | wiki.generateIndex() via parent_id hierarchy |
| log.md (event log) | events collection with timestamps |
| LLM as curator | Batch schema_insert / schema_update + db.flush() |
| Ingestion | Parse headings → tree → summaries → insert/merge |
| Querying | TF-IDF or BM25 scoring via persistent inverted index + parent_id traversal |
| Linting | Detect orphans, stale pages, empty roots |
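The ingestion row ("Parse headings → tree") can be illustrated with a small parser. This is a minimal sketch, not the WikiEngine implementation: `slugify` and `parseHeadings` are hypothetical names, slugs are assumed to be lowercase and hyphen-separated, and slug collisions and `#` lines inside code fences are not handled.

```javascript
// Hypothetical sketch of the "parse headings -> tree" step. Each heading
// becomes a page whose parent_id points at the nearest shallower heading.
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

function parseHeadings(markdown) {
  const pages = [];
  const stack = []; // currently open ancestors, shallowest first
  let current = null;

  for (const line of markdown.split('\n')) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line);
    if (!match) {
      // Body text belongs to the most recent heading, if any.
      if (current) current.content += line + '\n';
      continue;
    }
    const level = match[1].length;
    // Drop ancestors at the same or a deeper level than the new heading.
    while (stack.length && stack[stack.length - 1].level >= level) stack.pop();
    const parent = stack[stack.length - 1] || null;
    current = {
      slug: slugify(match[2]),
      title: match[2].trim(),
      level,
      parent_id: parent ? parent.slug : null,
      content: ''
    };
    pages.push(current);
    stack.push(current);
  }
  return pages;
}

const pages = parseHeadings('# Auth\nIntro\n## JWT\nTokens\n### Refresh\nRotate\n## Sessions\n');
console.log(pages.map((p) => `${p.slug} <- ${p.parent_id}`));
```

Walking `parent_id` upward from any page yields a breadcrumb similar to the `path` field in the search results.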
CLI Usage
# Ingest a markdown file into the wiki
npx js-doc-store-headless wiki:ingest ./docs/auth-guide.md
# Search the wiki with TF-IDF scoring
npx js-doc-store-headless wiki:search "jwt rotation"
# Lint: orphans, stale pages, empty categories
npx js-doc-store-headless wiki:lint
# Generate index (like Karpathy's index.md)
npx js-doc-store-headless wiki:index
Programmatic Usage
const { DocStore, FileStorageAdapter } = require('js-doc-store');
const { WikiEngine } = require('js-doc-store-headless/llm-wiki.js');
const db = new DocStore(new FileStorageAdapter('./wiki-data'));
const wiki = new WikiEngine(db);
// Ingest raw markdown (creates or merges pages)
const result = wiki.ingest(rawMarkdown, 'Auth Guide v1.0');
// { status: 'ingested', sourceSlug: 'auth-guide-v1-0', pagesCreated: 7, pagesMerged: 2 }
// Search with TF-IDF + hierarchical context
const results = wiki.search('token refresh', 5);
// [{ slug, title, score, path: "[L0] Auth → [L1] JWT → [L2] Refresh", contentPreview, tags }]
// Lint the wiki
const report = wiki.lint();
// { orphans: [...], stale: [...], emptyRoots: [...], unlinkedSources: 0, totalPages: 42 }
// Generate index
const index = wiki.generateIndex();
// { "auth": { summary: "...", pages: [{ slug, title, summary, childCount }] } }
BM25 Scoring
Switch from TF-IDF to BM25 for better ranking on short documents (summaries):
const wiki = new WikiEngine(db, { scoring: 'bm25', k1: 1.2, b: 0.75 });
const results = wiki.search('jwt rotation', 5);
// Same shape, but scores computed with BM25
Custom Summaries (LLM-Generated)
Instead of auto-truncated summaries, provide LLM-written semantic compressions per heading:
wiki.ingest(rawMarkdown, 'Auth Guide v1.0', {
  summaries: {
    'JWT Basics': 'JWTs are signed tokens that carry claims between parties.',
    'Token Refresh': 'Refresh tokens rotate access credentials without re-login.'
  }
});
Persistent Inverted Index
The wiki maintains an inverted_index collection on disk. This means:
- After a process restart, the first search rebuilds the in-memory cache from disk (no full page scan).
- Incremental updates during ingest() keep the index current without full rebuilds.
- No vector dependencies required.
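To make the idea concrete, here is a minimal in-memory inverted index with BM25 scoring. It is a sketch under stated assumptions, not the WikiEngine code: the `InvertedIndex` class and its methods are hypothetical, persistence to the inverted_index collection is omitted, and the IDF uses the common Lucene-style variant. The `k1` and `b` defaults mirror the options shown under BM25 Scoring.

```javascript
// Illustrative in-memory inverted index with BM25 scoring (hypothetical API).
class InvertedIndex {
  constructor({ k1 = 1.2, b = 0.75 } = {}) {
    this.k1 = k1;
    this.b = b;
    this.postings = new Map();   // term -> Map(docId -> term frequency)
    this.docLengths = new Map(); // docId -> token count
  }

  static tokenize(text) {
    return text.toLowerCase().match(/[a-z0-9]+/g) || [];
  }

  // Incremental update: re-index one document without rebuilding the rest.
  add(docId, text) {
    this.remove(docId);
    const tokens = InvertedIndex.tokenize(text);
    this.docLengths.set(docId, tokens.length);
    for (const term of tokens) {
      if (!this.postings.has(term)) this.postings.set(term, new Map());
      const docs = this.postings.get(term);
      docs.set(docId, (docs.get(docId) || 0) + 1);
    }
  }

  remove(docId) {
    if (!this.docLengths.has(docId)) return;
    this.docLengths.delete(docId);
    for (const [term, docs] of this.postings) {
      docs.delete(docId);
      if (docs.size === 0) this.postings.delete(term);
    }
  }

  search(query, limit = 5) {
    const total = this.docLengths.size;
    if (total === 0) return [];
    let lengthSum = 0;
    for (const len of this.docLengths.values()) lengthSum += len;
    const avgLength = lengthSum / total;

    const scores = new Map();
    for (const term of new Set(InvertedIndex.tokenize(query))) {
      const docs = this.postings.get(term);
      if (!docs) continue;
      // Lucene-style IDF, which stays non-negative for common terms.
      const idf = Math.log(1 + (total - docs.size + 0.5) / (docs.size + 0.5));
      for (const [docId, tf] of docs) {
        const norm = 1 - this.b + this.b * (this.docLengths.get(docId) / avgLength);
        const gain = idf * (tf * (this.k1 + 1)) / (tf + this.k1 * norm);
        scores.set(docId, (scores.get(docId) || 0) + gain);
      }
    }
    return [...scores.entries()]
      .map(([slug, score]) => ({ slug, score }))
      .sort((x, y) => y.score - x.score)
      .slice(0, limit);
  }
}

const index = new InvertedIndex();
index.add('jwt-rotation', 'JWT rotation rotates signing keys for JWT tokens');
index.add('sessions', 'Server sessions store state in cookies');
index.add('refresh', 'Refresh tokens rotate access credentials');
console.log(index.search('jwt rotation'));
```

Because there is no stemming, "rotation" and "rotate" are distinct terms here; only the first document matches the sample query.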
Design Notes
- Summaries are human embeddings: Each page has a summary field that acts as a semantic compression. The LLM generates these during ingestion.
- Merge on re-ingest: If a page with the same slug exists, new content is appended with a --- separator and updatedAt is bumped.
- No vector dependencies: Search uses TF-IDF over title + summary + tags. For larger corpora (>100K pages), combine with js-vector-store BM25.
- Git versioned: Use GitStorageAdapter as the storage backend to version every wiki update.
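The merge-on-re-ingest rule can be sketched with a plain Map standing in for the pages collection. `upsertPage` is a hypothetical function, and the `createdAt` field and ISO timestamp format are assumptions; only the `---` separator and the `updatedAt` bump come from the notes above.

```javascript
// Hypothetical sketch of merge-on-re-ingest using a Map as the page store.
function upsertPage(store, page, now = new Date()) {
  const timestamp = now.toISOString();
  const existing = store.get(page.slug);
  if (!existing) {
    store.set(page.slug, { ...page, createdAt: timestamp, updatedAt: timestamp });
    return 'created';
  }
  // Same slug: append the new content after a separator and bump updatedAt.
  existing.content = `${existing.content}\n---\n${page.content}`;
  existing.updatedAt = timestamp;
  return 'merged';
}

const store = new Map();
console.log(upsertPage(store, { slug: 'jwt-basics', content: 'v1 text' }));
console.log(upsertPage(store, { slug: 'jwt-basics', content: 'v2 text' }));
console.log(store.get('jwt-basics').content);
```

Appending rather than overwriting keeps earlier content available to the curator, at the cost of pages that grow with every re-ingest.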
CLI Commands
# Development
npx js-doc-store-headless mcp # Start MCP Schema Designer
npx js-doc-store-headless api # Start API with JWT auth
npx js-doc-store-headless api:basic # Start API without auth
npx js-doc-store-headless api:prod # Start production API (auto-import)
# Data management
npx js-doc-store-headless list # List all schemas
npx js-doc-store-headless export --schema blog_cms --output ./exports
npx js-doc-store-headless import --schema blog_cms --input ./exports
# Deployment
npx js-doc-store-headless deploy --schema blog_cms --output ./deploy
# Help
npx js-doc-store-headless help
REST API Endpoints
Public (Auth API)
- POST /auth/register - Create account
- POST /auth/login - Get JWT token
- GET /auth/me - Current user profile
- POST /auth/logout - Invalidate token
Public (Data)
- GET /api - List all schemas
- GET /api/:schema - Schema architecture detail
- GET /api/:schema/:collection - List documents (with filters)
- GET /api/:schema/:collection/:id - Get single document
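List filters travel in the query string, so values such as a JSON sort object are safer when percent-encoded. The helper below is a hypothetical sketch: it relies only on the equality filter and the `__sort` parameter that appear in the frontend example of this README, and assumes nothing about other reserved parameters.

```javascript
// Hypothetical helper: build a list URL with equality filters and a sort.
// URL and URLSearchParams are standard globals in Node.js and browsers.
function buildListUrl(baseUrl, schema, collection, { filters = {}, sort } = {}) {
  const url = new URL(
    `/api/${encodeURIComponent(schema)}/${encodeURIComponent(collection)}`,
    baseUrl
  );
  for (const [field, value] of Object.entries(filters)) {
    url.searchParams.set(field, String(value));
  }
  // The sort object is sent as JSON; searchParams percent-encodes it.
  if (sort) url.searchParams.set('__sort', JSON.stringify(sort));
  return url.toString();
}

console.log(buildListUrl('http://localhost:3000', 'blog_cms', 'posts', {
  filters: { published: true },
  sort: { createdAt: -1 }
}));
```

The resulting string can be passed directly to fetch().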
Protected (requires Bearer token)
- POST /api/:schema/:collection - Create document
- PUT /api/:schema/:collection/:id - Update document (full or partial)
- PATCH /api/:schema/:collection/:id - Update document (partial, same as PUT)
- DELETE /api/:schema/:collection/:id - Delete document
Admin
GET /admin/users- List all users (admin role required)
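A client for the protected endpoints might look like the following. This is a sketch under stated assumptions: `createClient` is hypothetical, the login response is assumed to be JSON with a `token` field, and the credential field names are left to the caller because this README does not specify them. The fetch implementation is injectable so the client can be exercised without a running server.

```javascript
// Hypothetical client sketch. Assumption: POST /auth/login answers with a
// JSON body containing a `token` field; check your server's actual response.
function createClient(baseUrl, fetchImpl = fetch) {
  let token = null;

  async function request(method, path, body) {
    const headers = { 'Content-Type': 'application/json' };
    // Protected routes expect "Authorization: Bearer <token>".
    if (token) headers.Authorization = `Bearer ${token}`;
    const response = await fetchImpl(`${baseUrl}${path}`, {
      method,
      headers,
      body: body === undefined ? undefined : JSON.stringify(body)
    });
    if (!response.ok) throw new Error(`${method} ${path} failed with ${response.status}`);
    return response.json();
  }

  return {
    async login(credentials) {
      const result = await request('POST', '/auth/login', credentials);
      token = result.token; // remembered for later protected calls
      return result;
    },
    createDocument(schema, collection, doc) {
      return request('POST', `/api/${schema}/${collection}`, doc);
    }
  };
}
```

Typical use would be to call `login(...)` once and then `createDocument('blog_cms', 'posts', { title: 'Hello' })`; keeping the token in a closure avoids passing it to every call.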
MCP Schema Designer Tools
| Tool | Purpose |
|------|---------|
| schema_define | Design a database architecture |
| schema_exists | Check if schema exists (with skill context) |
| schema_instantiate | Create real collections from schema |
| schema_list | List all schemas with architecture details |
| schema_insert | Insert with validation |
| schema_query | Query with filters, sort, pagination |
| schema_update | Update documents |
| schema_delete | Delete documents |
| schema_seed | Generate sample data |
| schema_aggregate | Run aggregation pipelines |
| schema_export | Export schema + data to portable JSON |
| schema_usage_guide | Get full usage guide |
| auth_register | Register a user (requires AUTH_SECRET) |
| auth_login | Login and get JWT token |
| auth_assign_role | Assign RBAC role (admin only) |
| audit_query | Query audit logs |
| field_encrypt | Encrypt a field value |
| field_decrypt | Decrypt a field value |
| rotate_encryption_key | Rotate encryption key |
| secure_delete | Permanent delete with git purge |
| wiki_ingest | Ingest markdown into the LLM Wiki (Karpathy pattern) |
| wiki_search | Search wiki with TF-IDF + hierarchical context |
| wiki_lint | Lint wiki: orphans, stale pages, empty roots |
| wiki_index | Generate wiki index grouped by root category |
Schema Definition Format
{
  "name": "task_manager",
  "description": "Task management system",
  "collections": [
    {
      "name": "users",
      "fields": [
        { "name": "name", "type": "string", "required": true },
        { "name": "email", "type": "string", "required": true, "unique": true },
        { "name": "role", "type": "string", "default": "user" }
      ],
      "indexes": [{ "field": "email", "unique": true }]
    },
    {
      "name": "tasks",
      "fields": [
        { "name": "title", "type": "string", "required": true },
        { "name": "assigneeId", "type": "ref", "refCollection": "users" },
        { "name": "priority", "type": "number", "default": 1 },
        { "name": "status", "type": "string", "default": "todo" }
      ]
    }
  ]
}
Field Types
| Type | Description |
|------|-------------|
| string | Text values |
| number | Numeric values |
| boolean | true/false |
| date | ISO 8601 dates |
| array | Arrays of any values |
| object | Nested objects |
| ref | Reference to another collection's _id |
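How field definitions might drive validation can be sketched as follows. This is an illustration, not the library's validator: `typeChecks` and `validateDocument` are hypothetical, ref values are assumed to be string ids, and ref integrity and unique indexes are left out because they need access to stored data.

```javascript
// Hypothetical validator for the field definitions shown above. It applies
// defaults, checks required fields, and checks basic types.
const typeChecks = {
  string: (v) => typeof v === 'string',
  number: (v) => typeof v === 'number' && Number.isFinite(v),
  boolean: (v) => typeof v === 'boolean',
  date: (v) => typeof v === 'string' && !Number.isNaN(Date.parse(v)),
  array: (v) => Array.isArray(v),
  object: (v) => v !== null && typeof v === 'object' && !Array.isArray(v),
  ref: (v) => typeof v === 'string' // assumption: _id values are strings
};

function validateDocument(fields, input) {
  const doc = { ...input };
  const errors = [];
  for (const field of fields) {
    // Fill in the declared default when the caller omitted the field.
    if (doc[field.name] === undefined && 'default' in field) doc[field.name] = field.default;
    const value = doc[field.name];
    if (value === undefined || value === null) {
      if (field.required) errors.push(`${field.name} is required`);
      continue;
    }
    const check = typeChecks[field.type];
    if (check && !check(value)) errors.push(`${field.name} must be of type ${field.type}`);
  }
  return { valid: errors.length === 0, errors, doc };
}

const taskFields = [
  { name: 'title', type: 'string', required: true },
  { name: 'assigneeId', type: 'ref', refCollection: 'users' },
  { name: 'priority', type: 'number', default: 1 },
  { name: 'status', type: 'string', default: 'todo' }
];

console.log(validateDocument(taskFields, { title: 'Write docs' }));
console.log(validateDocument(taskFields, { priority: 'high' }));
```

The first call passes and picks up both defaults; the second reports a missing title and a wrongly typed priority.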
Query Operators
- $eq, $ne, $gt, $gte, $lt, $lte
- $in, $nin, $exists, $regex
- $and, $or, $not
- $contains (arrays), $size (arrays)
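The semantics of these operators can be shown with a small matcher. This sketch follows the usual MongoDB-style conventions and is not the js-doc-store engine: `operators` and `matches` are hypothetical names, and treating `$not` as a wrapper around a nested query is an assumption.

```javascript
// Illustrative matcher for the operators listed above (not the real engine).
const operators = {
  $eq: (value, arg) => value === arg,
  $ne: (value, arg) => value !== arg,
  $gt: (value, arg) => value > arg,
  $gte: (value, arg) => value >= arg,
  $lt: (value, arg) => value < arg,
  $lte: (value, arg) => value <= arg,
  $in: (value, arg) => arg.includes(value),
  $nin: (value, arg) => !arg.includes(value),
  $exists: (value, arg) => (value !== undefined) === arg,
  $regex: (value, arg) => typeof value === 'string' && new RegExp(arg).test(value),
  $contains: (value, arg) => Array.isArray(value) && value.includes(arg),
  $size: (value, arg) => Array.isArray(value) && value.length === arg
};

function matches(doc, query) {
  return Object.entries(query).every(([key, condition]) => {
    // Logical operators combine nested queries.
    if (key === '$and') return condition.every((sub) => matches(doc, sub));
    if (key === '$or') return condition.some((sub) => matches(doc, sub));
    if (key === '$not') return !matches(doc, condition);

    const value = doc[key];
    const isOperatorObject = condition !== null && typeof condition === 'object' &&
      !Array.isArray(condition) && Object.keys(condition).some((k) => k.startsWith('$'));
    if (!isOperatorObject) return value === condition; // shorthand for $eq

    return Object.entries(condition).every(([op, arg]) => {
      if (!Object.prototype.hasOwnProperty.call(operators, op)) {
        throw new Error(`Unsupported operator: ${op}`);
      }
      return operators[op](value, arg);
    });
  });
}

const posts = [
  { title: 'Hello', published: true, views: 120, tags: ['intro'] },
  { title: 'Draft', published: false, views: 3, tags: [] },
  { title: 'JWT rotation', published: true, views: 45, tags: ['auth', 'jwt'] }
];
const query = { published: true, $or: [{ views: { $gte: 100 } }, { tags: { $contains: 'jwt' } }] };
console.log(posts.filter((p) => matches(p, query)).map((p) => p.title));
// → [ 'Hello', 'JWT rotation' ]
```

Top-level keys are combined with an implicit AND, which is why `published: true` and the `$or` clause must both hold.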
Portable Export Format
Exported schemas are pure JSON files containing everything needed to recreate the database:
{
  "name": "blog_cms",
  "version": "1.0.0",
  "exportedAt": "2026-05-15T05:20:01.656Z",
  "collections": [
    {
      "name": "posts",
      "definition": { "fields": [...], "indexes": [...] },
      "documents": [{ "title": "Hello", "_id": "..." }],
      "documentCount": 42
    }
  ]
}
These files can be:
- Committed to Git
- Shared between teams
- Deployed to multiple environments
- Versioned alongside your frontend code
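Before committing or sharing an export, a quick consistency check can catch truncated files. The function below is a hypothetical sketch that relies only on the `collections`, `documents`, and `documentCount` fields shown in the format above.

```javascript
// Hypothetical consistency check for an export object: documentCount should
// equal the number of documents actually present in each collection.
function summarizeExport(exported) {
  return exported.collections.map((collection) => ({
    name: collection.name,
    declared: collection.documentCount,
    actual: collection.documents.length,
    consistent: collection.documentCount === collection.documents.length
  }));
}

const sample = {
  name: 'blog_cms',
  version: '1.0.0',
  collections: [
    { name: 'posts', documents: [{ _id: 'p1', title: 'Hello' }], documentCount: 1 },
    { name: 'authors', documents: [], documentCount: 2 }
  ]
};
console.log(summarizeExport(sample));
```

In this sample the authors collection is flagged, since it declares two documents but contains none.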
Authentication
Uses PBKDF2-SHA256 password hashing and JWT tokens. Configure via environment variables:
JWT_SECRET=your-super-secret-key
PORT=3000
DATA_DIR=./data # Auth and prod servers respect this
EXPORT_DIR=./exports
Data Storage
All data persists to the filesystem in JSON files:
- schema-designer-data/__schemas.docs.json - Schema definitions
- schema-designer-data/<collection>.docs.json - Collection data
- schema-designer-data/<collection>.meta.json - Collection metadata
Integration with Codex/Claude
Add the MCP server to your Codex configuration:
codex mcp add js-doc-store-schema-designer -- node /path/to/schema-designer-server.js
Then your LLM can:
- Design database architectures on demand
- Validate data against schemas
- Query with MongoDB-style filters
- Export and deploy to production
License
MIT
