js-doc-store-headless

v1.2.0

Published

2 months ago

Headless CMS API powered by js-doc-store. Zero-dependency document database with MongoDB-style queries, JWT auth, dynamic schema generation, and portable deploy packages.

rckflr

headless-cms document-database nosql mongodb rest-api jwt mcp schema-designer vanilla-js deploy portable

Features

Zero-dependency core: The database engine itself has zero npm dependencies
Dynamic schemas: Define collections, fields, types, indexes, and relationships on the fly via MCP or REST
REST API: Full CRUD endpoints generated automatically from schemas
JWT Authentication: Built-in auth with register, login, logout, role-based access
MCP Server: A single Model Context Protocol server for Codex/Claude integration
Portable deploy: Export schemas + data to a JSON file, deploy anywhere
Auto-import: Production server auto-loads exported schemas on startup
MongoDB-style queries: $eq, $gt, $gte, $lt, $lte, $in, $regex, $and, $or, etc.
Relationships: Reference fields (ref) between collections with automatic linking
Validation: Schema-enforced required fields and ref integrity (REST API). Unique constraints via indexes.

Workflow: Local Development -> Production Deploy

1. Local Development (with MCP)

Use the MCP Schema Designer to define your CMS architecture:

# Start MCP server
npx js-doc-store-headless mcp

# Then in Codex/Claude, the LLM can:
# 1. schema_define -> design your blog_cms, task_manager, ecom_store...
# 2. schema_instantiate -> create real collections
# 3. schema_insert / schema_seed -> add content
# 4. schema_query -> test queries

2. Export Your Schema

npx js-doc-store-headless export --schema blog_cms --output ./exports

This creates exports/blog_cms.export.json containing:

Full schema definition (collections, fields, indexes, relationships)
All documents (posts, authors, categories, etc.)
Metadata (version, export timestamp)

3. Deploy to Production

npx js-doc-store-headless deploy --schema blog_cms --output ./deploy

This generates a portable deployment package:

deploy/
  server.js          # Production API server (auto-imports exports/)
  js-doc-store.js    # Database engine
  schema-portable.js # Import logic
  exports/
    blog_cms.export.json  # Schema + data
  package.json
  .env.example
  README.md

4. Run Production API

cd deploy
node server.js

The server automatically imports the schema and data from exports/ on startup. The API is now live at http://localhost:3000.

5. Connect Frontend

// List all posts
fetch('http://localhost:3000/api/blog_cms/posts')
  .then(r => r.json())
  .then(data => console.log(data.docs));

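// Filter posts (the JSON value of __sort must be URL-encoded)
const sort = encodeURIComponent(JSON.stringify({ createdAt: -1 }));
fetch(`http://localhost:3000/api/blog_cms/posts?published=true&__sort=${sort}`)
  .then(r => r.json())
  .then(data => console.log(data.docs));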

// Create post
fetch('http://localhost:3000/api/blog_cms/posts', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ title: 'Hello', slug: 'hello', content: 'World!' })
});
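When filters and sort options come from user input, building the query string by hand is error-prone. The sketch below shows one way to assemble a list URL with the standard `URL` API. `buildListUrl` is a hypothetical helper, not part of js-doc-store-headless; it only assumes the `/api/:schema/:collection` route and the `__sort` parameter shown above.

```javascript
// Hypothetical helper (not part of the package): builds a list URL with
// plain equality filters and a JSON-encoded __sort parameter.
function buildListUrl(baseUrl, schema, collection, filters = {}, sort = null) {
  const url = new URL(`/api/${schema}/${collection}`, baseUrl);
  for (const [field, value] of Object.entries(filters)) {
    url.searchParams.set(field, String(value));
  }
  if (sort) {
    // URLSearchParams percent-encodes the braces, quotes, and colon.
    url.searchParams.set('__sort', JSON.stringify(sort));
  }
  return url.toString();
}

const listUrl = buildListUrl(
  'http://localhost:3000',
  'blog_cms',
  'posts',
  { published: true },
  { createdAt: -1 }
);
console.log(listUrl);
```

The resulting string can be passed straight to `fetch`.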

LLM Wiki (Karpathy Pattern)

Persistent, compounding knowledge base maintained by an LLM curator. Instead of rediscovering knowledge on every question via RAG, the wiki accumulates structured understanding across sessions.

Concepts

| Karpathy Pattern | js-doc-store Implementation |
|---|---|
| Raw documents (ground truth) | sources collection with immutable content |
| Markdown pages (middle tier) | pages collection with summary + content |
| index.md (catalog) | wiki.generateIndex() via parent_id hierarchy |
| log.md (event log) | events collection with timestamps |
| LLM as curator | Batch schema_insert / schema_update + db.flush() |
| Ingestion | Parse headings → tree → summaries → insert/merge |
| Querying | TF-IDF or BM25 scoring via persistent inverted index + parent_id traversal |
| Linting | Detect orphans, stale pages, empty roots |
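The "Parse headings → tree" step of ingestion can be pictured with a small sketch. This is an illustration under assumptions, not the WikiEngine code: `slugify` and `parseHeadings` are hypothetical names, and every field other than `parent_id` is made up for the example. It also ignores headings that appear inside fenced code.

```javascript
// Sketch only: turns ATX headings ("#", "##", ...) into flat page records
// linked by parent_id. Names and fields are illustrative assumptions.
function slugify(text) {
  return text.toLowerCase().replace(/[^a-z0-9]+/g, '-').replace(/^-+|-+$/g, '');
}

function parseHeadings(markdown) {
  const pages = [];
  const stack = []; // chain of currently open headings, shallowest first
  let current = null;
  for (const line of markdown.split('\n')) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line);
    if (!match) {
      // Body text belongs to the most recent heading.
      if (current) current.content += line + '\n';
      continue;
    }
    const level = match[1].length - 1; // "#" becomes level 0
    const title = match[2].trim();
    // Close headings at the same depth or deeper than the new one.
    while (stack.length && stack[stack.length - 1].level >= level) stack.pop();
    const parent = stack.length ? stack[stack.length - 1] : null;
    current = {
      slug: slugify(title),
      title,
      level,
      parent_id: parent ? parent.slug : null,
      content: ''
    };
    pages.push(current);
    stack.push(current);
  }
  return pages;
}

const pages = parseHeadings('# Auth\nintro\n## JWT\n### Refresh\nrotate\n## Sessions\n');
console.log(pages.map((p) => `${p.slug} <- ${p.parent_id}`));
```

Each record could then be inserted or merged as a page, with the parent link preserving the heading hierarchy.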

CLI Usage

# Ingest a markdown file into the wiki
npx js-doc-store-headless wiki:ingest ./docs/auth-guide.md

# Search the wiki with TF-IDF scoring
npx js-doc-store-headless wiki:search "jwt rotation"

# Lint: orphans, stale pages, empty categories
npx js-doc-store-headless wiki:lint

# Generate index (like Karpathy's index.md)
npx js-doc-store-headless wiki:index

Programmatic Usage

const { DocStore, FileStorageAdapter } = require('js-doc-store');
const { WikiEngine } = require('js-doc-store-headless/llm-wiki.js');

const db = new DocStore(new FileStorageAdapter('./wiki-data'));
const wiki = new WikiEngine(db);

// Ingest raw markdown (creates or merges pages)
const result = wiki.ingest(rawMarkdown, 'Auth Guide v1.0');
// { status: 'ingested', sourceSlug: 'auth-guide-v1-0', pagesCreated: 7, pagesMerged: 2 }

// Search with TF-IDF + hierarchical context
const results = wiki.search('token refresh', 5);
// [{ slug, title, score, path: "[L0] Auth → [L1] JWT → [L2] Refresh", contentPreview, tags }]

// Lint the wiki
const report = wiki.lint();
// { orphans: [...], stale: [...], emptyRoots: [...], unlinkedSources: 0, totalPages: 42 }

// Generate index
const index = wiki.generateIndex();
// { "auth": { summary: "...", pages: [{ slug, title, summary, childCount }] } }
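The `path` string in search results reflects a walk up the `parent_id` chain. The following is a minimal sketch of how such a breadcrumb might be assembled; `buildPath` and the in-memory `Map` are assumptions for illustration and say nothing about how WikiEngine does it internally.

```javascript
// Sketch only: builds a "[L0] A → [L1] B" breadcrumb by following parent_id
// links upward. pagesBySlug is a hypothetical Map of slug -> page record.
function buildPath(pagesBySlug, slug) {
  const chain = [];
  const seen = new Set(); // guards against accidental parent_id cycles
  let page = pagesBySlug.get(slug);
  while (page && !seen.has(page.slug)) {
    seen.add(page.slug);
    chain.unshift(page); // ancestors end up first
    page = page.parent_id ? pagesBySlug.get(page.parent_id) : null;
  }
  return chain.map((p, depth) => `[L${depth}] ${p.title}`).join(' → ');
}

const pagesBySlug = new Map([
  ['auth', { slug: 'auth', title: 'Auth', parent_id: null }],
  ['jwt', { slug: 'jwt', title: 'JWT', parent_id: 'auth' }],
  ['refresh', { slug: 'refresh', title: 'Refresh', parent_id: 'jwt' }]
]);
console.log(buildPath(pagesBySlug, 'refresh'));
```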

BM25 Scoring

Switch from TF-IDF to BM25 for better ranking on short documents (summaries):

const wiki = new WikiEngine(db, { scoring: 'bm25', k1: 1.2, b: 0.75 });
const results = wiki.search('jwt rotation', 5);
// Same shape, but scores computed with BM25
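To show what the `k1` and `b` parameters control, here is a self-contained sketch of the standard Okapi BM25 formula over a tiny in-memory corpus. It is not the WikiEngine scorer: `tokenize` and `bm25Scores` are hypothetical names, and the IDF variant used (the non-negative `ln(1 + ...)` form) is an assumption.

```javascript
// Sketch only: textbook BM25 over an array of strings.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

function bm25Scores(docs, query, { k1 = 1.2, b = 0.75 } = {}) {
  const tokenized = docs.map(tokenize);
  const N = tokenized.length;
  const avgdl = tokenized.reduce((sum, t) => sum + t.length, 0) / (N || 1);
  const terms = [...new Set(tokenize(query))];
  return tokenized.map((tokens) => {
    let score = 0;
    for (const term of terms) {
      const tf = tokens.filter((t) => t === term).length;
      if (tf === 0) continue;
      // n = number of documents containing the term
      const n = tokenized.filter((t) => t.includes(term)).length;
      const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));
      // b scales the length penalty; k1 caps the gain from repeated terms
      const lengthNorm = 1 - b + b * (tokens.length / avgdl);
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

const scores = bm25Scores(
  [
    'jwt rotation keeps tokens fresh',
    'jwt basics',
    'session cookies and csrf protection in depth'
  ],
  'jwt rotation'
);
console.log(scores);
```

Setting `b` to 0 disables length normalization, which is one reason BM25 can be tuned for short summaries.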

Custom Summaries (LLM-Generated)

Instead of auto-truncated summaries, provide LLM-written semantic compressions per heading:

wiki.ingest(rawMarkdown, 'Auth Guide v1.0', {
  summaries: {
    'JWT Basics': 'JWTs are signed tokens that carry claims between parties.',
    'Token Refresh': 'Refresh tokens rotate access credentials without re-login.'
  }
});

Persistent Inverted Index

The wiki maintains an inverted_index collection on disk. This means:

After a process restart, the first search rebuilds the in-memory cache from disk (no full page scan).
Incremental updates during ingest() keep the index current without full rebuilds.
No vector dependencies required.
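The incremental behavior described above can be sketched with a plain `Map`. This is an illustration under assumptions: the function names and the row shape are hypothetical, and the real `inverted_index` collection may store its postings differently.

```javascript
// Sketch only. postings: Map<term, Map<slug, termFrequency>>
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

function removePage(postings, slug) {
  for (const [term, perPage] of postings) {
    perPage.delete(slug);
    if (perPage.size === 0) postings.delete(term); // drop empty terms
  }
}

// Incremental update: only the postings of one page are touched.
function indexPage(postings, slug, text) {
  removePage(postings, slug); // re-ingest replaces the old entry
  for (const term of tokenize(text)) {
    if (!postings.has(term)) postings.set(term, new Map());
    const perPage = postings.get(term);
    perPage.set(slug, (perPage.get(slug) || 0) + 1);
  }
}

// Plain-JSON rows that a document collection could persist (assumed shape).
function toRows(postings) {
  return [...postings].map(([term, perPage]) => ({
    term,
    postings: Object.fromEntries(perPage)
  }));
}

// Rebuilds the in-memory cache from stored rows, without rescanning pages.
function fromRows(rows) {
  return new Map(rows.map((row) => [row.term, new Map(Object.entries(row.postings))]));
}
```

Loading rows back with `fromRows` corresponds to the cache rebuild on the first search after a restart.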

Design Notes

Summaries are human embeddings: Each page has a summary field that acts as a semantic compression of its content. Summaries are auto-truncated by default, or written by the LLM and supplied during ingestion.
Merge on re-ingest: If a page with the same slug exists, new content is appended with a --- separator and updatedAt is bumped.
No vector dependencies: Search scores title + summary + tags with TF-IDF (or BM25 when configured). For larger corpora (>100K pages), combine with js-vector-store BM25.
Git versioned: Use GitStorageAdapter as the storage backend to version every wiki update.
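The merge-on-re-ingest rule can be illustrated in a few lines. This is a sketch, not the WikiEngine code: `mergePage` is a hypothetical function, and the exact whitespace around the `---` separator and the `createdAt` field are assumptions.

```javascript
// Sketch only: append new content after a --- separator and bump updatedAt.
function mergePage(existing, incoming, now = new Date()) {
  const timestamp = now.toISOString();
  if (!existing) {
    // First ingest of this slug: create the page.
    return { ...incoming, createdAt: timestamp, updatedAt: timestamp };
  }
  return {
    ...existing,
    content: `${existing.content}\n\n---\n\n${incoming.content}`,
    updatedAt: timestamp
  };
}

const first = mergePage(null, { slug: 'jwt', content: 'v1 notes' }, new Date('2026-01-01T00:00:00Z'));
const merged = mergePage(first, { slug: 'jwt', content: 'v2 notes' }, new Date('2026-02-01T00:00:00Z'));
console.log(merged.content);
```

Because content is appended rather than replaced, earlier text survives every re-ingest.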

CLI Commands

# Development
npx js-doc-store-headless mcp              # Start MCP Schema Designer
npx js-doc-store-headless api              # Start API with JWT auth
npx js-doc-store-headless api:basic        # Start API without auth
npx js-doc-store-headless api:prod         # Start production API (auto-import)

# Data management
npx js-doc-store-headless list             # List all schemas
npx js-doc-store-headless export --schema blog_cms --output ./exports
npx js-doc-store-headless import --schema blog_cms --input ./exports

# Deployment
npx js-doc-store-headless deploy --schema blog_cms --output ./deploy

# Help
npx js-doc-store-headless help

REST API Endpoints

Public (Auth API)

POST /auth/register - Create account
POST /auth/login - Get JWT token
GET /auth/me - Current user profile
POST /auth/logout - Invalidate token

Public (Data)

GET /api - List all schemas
GET /api/:schema - Schema architecture detail
GET /api/:schema/:collection - List documents (with filters)
GET /api/:schema/:collection/:id - Get single document

Protected (requires Bearer token)

POST /api/:schema/:collection - Create document
PUT /api/:schema/:collection/:id - Update document (full or partial)
PATCH /api/:schema/:collection/:id - Update document (partial, same as PUT)
DELETE /api/:schema/:collection/:id - Delete document

Admin

GET /admin/users - List all users (admin role required)
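Putting the auth and protected endpoints together, a client would log in first and then send the token as a Bearer header. The sketch below shows that flow with an injectable `fetch`. `loginAndCreate` is a hypothetical helper, and two details are assumptions to verify against your server: that the login body takes `{ email, password }` and that the login response exposes the JWT as `token`.

```javascript
// Sketch only: log in, then create a document on a protected route.
// Assumed (unverified) shapes: credentials { email, password }, response { token }.
async function loginAndCreate(fetchImpl, baseUrl, credentials, schema, collection, doc) {
  const loginRes = await fetchImpl(`${baseUrl}/auth/login`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(credentials)
  });
  if (!loginRes.ok) throw new Error(`Login failed with status ${loginRes.status}`);
  const { token } = await loginRes.json();

  const createRes = await fetchImpl(`${baseUrl}/api/${schema}/${collection}`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${token}` // required by protected routes
    },
    body: JSON.stringify(doc)
  });
  if (!createRes.ok) throw new Error(`Create failed with status ${createRes.status}`);
  return createRes.json();
}
```

In Node 18+ or a browser, pass the global `fetch` as `fetchImpl`; injecting it keeps the function easy to test with a stub.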

MCP Schema Designer Tools

| Tool | Purpose |
|------|---------|
| schema_define | Design a database architecture |
| schema_exists | Check if schema exists (with skill context) |
| schema_instantiate | Create real collections from schema |
| schema_list | List all schemas with architecture details |
| schema_insert | Insert with validation |
| schema_query | Query with filters, sort, pagination |
| schema_update | Update documents |
| schema_delete | Delete documents |
| schema_seed | Generate sample data |
| schema_aggregate | Run aggregation pipelines |
| schema_export | Export schema + data to portable JSON |
| schema_usage_guide | Get full usage guide |
| auth_register | Register a user (requires AUTH_SECRET) |
| auth_login | Login and get JWT token |
| auth_assign_role | Assign RBAC role (admin only) |
| audit_query | Query audit logs |
| field_encrypt | Encrypt a field value |
| field_decrypt | Decrypt a field value |
| rotate_encryption_key | Rotate encryption key |
| secure_delete | Permanent delete with git purge |
| wiki_ingest | Ingest markdown into the LLM Wiki (Karpathy pattern) |
| wiki_search | Search wiki with TF-IDF + hierarchical context |
| wiki_lint | Lint wiki: orphans, stale pages, empty roots |
| wiki_index | Generate wiki index grouped by root category |

Schema Definition Format

{
  "name": "task_manager",
  "description": "Task management system",
  "collections": [
    {
      "name": "users",
      "fields": [
        { "name": "name", "type": "string", "required": true },
        { "name": "email", "type": "string", "required": true, "unique": true },
        { "name": "role", "type": "string", "default": "user" }
      ],
      "indexes": [{ "field": "email", "unique": true }]
    },
    {
      "name": "tasks",
      "fields": [
        { "name": "title", "type": "string", "required": true },
        { "name": "assigneeId", "type": "ref", "refCollection": "users" },
        { "name": "priority", "type": "number", "default": 1 },
        { "name": "status", "type": "string", "default": "todo" }
      ]
    }
  ]
}

Field Types

| Type | Description |
|------|-------------|
| string | Text values |
| number | Numeric values |
| boolean | true/false |
| date | ISO 8601 dates |
| array | Arrays of any values |
| object | Nested objects |
| ref | Reference to another collection's _id |
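To make the field types and the `required` / `default` flags concrete, here is a rough sketch of how a document could be checked against a collection definition. It is not the package's validator: `validateDocument` and `typeChecks` are hypothetical, and treating `date` and `ref` values as strings is an assumption.

```javascript
// Sketch only: per-type predicates for the field types listed above.
const typeChecks = {
  string: (v) => typeof v === 'string',
  number: (v) => typeof v === 'number' && !Number.isNaN(v),
  boolean: (v) => typeof v === 'boolean',
  date: (v) => typeof v === 'string' && !Number.isNaN(Date.parse(v)),
  array: (v) => Array.isArray(v),
  object: (v) => v !== null && typeof v === 'object' && !Array.isArray(v),
  ref: (v) => typeof v === 'string' // assumes _id values are strings
};

function validateDocument(collectionDef, doc) {
  const errors = [];
  const result = { ...doc };
  for (const field of collectionDef.fields) {
    // Fill in the default before checking required-ness.
    if (result[field.name] === undefined && 'default' in field) {
      result[field.name] = field.default;
    }
    const value = result[field.name];
    if (value === undefined || value === null) {
      if (field.required) errors.push(`${field.name} is required`);
      continue;
    }
    const check = typeChecks[field.type];
    if (check && !check(value)) {
      errors.push(`${field.name} must be of type ${field.type}`);
    }
  }
  return { valid: errors.length === 0, errors, doc: result };
}

const tasksDef = {
  name: 'tasks',
  fields: [
    { name: 'title', type: 'string', required: true },
    { name: 'assigneeId', type: 'ref', refCollection: 'users' },
    { name: 'priority', type: 'number', default: 1 },
    { name: 'status', type: 'string', default: 'todo' }
  ]
};
console.log(validateDocument(tasksDef, { title: 'Write docs' }));
```

Ref integrity (checking that `assigneeId` points at an existing user) and unique indexes need access to stored data, so they are out of scope for this sketch.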

Query Operators

$eq, $ne, $gt, $gte, $lt, $lte
$in, $nin, $exists, $regex
$and, $or, $not
$contains (arrays), $size (arrays)
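For readers unfamiliar with MongoDB-style filters, the sketch below shows the general semantics of these operators with a small in-memory matcher. It is an illustration, not the js-doc-store query engine: `matches` and `operators` are hypothetical, `$not` is only handled at the top level of a filter, and edge cases may differ from the real library.

```javascript
// Sketch only: field-level operators, each taking (fieldValue, operatorArgument).
const operators = {
  $eq: (v, arg) => v === arg,
  $ne: (v, arg) => v !== arg,
  $gt: (v, arg) => v > arg,
  $gte: (v, arg) => v >= arg,
  $lt: (v, arg) => v < arg,
  $lte: (v, arg) => v <= arg,
  $in: (v, arg) => arg.includes(v),
  $nin: (v, arg) => !arg.includes(v),
  $exists: (v, arg) => (v !== undefined) === arg,
  $regex: (v, arg) => typeof v === 'string' && new RegExp(arg).test(v),
  $contains: (v, arg) => Array.isArray(v) && v.includes(arg),
  $size: (v, arg) => Array.isArray(v) && v.length === arg
};

function matches(doc, filter) {
  return Object.entries(filter).every(([key, cond]) => {
    // Logical operators combine whole sub-filters.
    if (key === '$and') return cond.every((f) => matches(doc, f));
    if (key === '$or') return cond.some((f) => matches(doc, f));
    if (key === '$not') return !matches(doc, cond);
    const value = doc[key];
    if (cond !== null && typeof cond === 'object' && !Array.isArray(cond)) {
      return Object.entries(cond).every(([op, arg]) => {
        if (!operators[op]) throw new Error(`Unsupported operator: ${op}`);
        return operators[op](value, arg);
      });
    }
    return value === cond; // a bare value is shorthand for $eq
  });
}

const tasks = [
  { title: 'Ship v1', priority: 3, status: 'todo', tags: ['release'] },
  { title: 'Write docs', priority: 1, status: 'done', tags: [] }
];
const urgent = tasks.filter((t) =>
  matches(t, { $or: [{ priority: { $gte: 3 } }, { tags: { $contains: 'urgent' } }] })
);
console.log(urgent.map((t) => t.title));
```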

Portable Export Format

Exported schemas are pure JSON files containing everything needed to recreate the database:

{
  "name": "blog_cms",
  "version": "1.0.0",
  "exportedAt": "2026-05-15T05:20:01.656Z",
  "collections": [
    {
      "name": "posts",
      "definition": { "fields": [...], "indexes": [...] },
      "documents": [{ "title": "Hello", "_id": "..." }],
      "documentCount": 42
    }
  ]
}

These files can be:

Committed to Git
Shared between teams
Deployed to multiple environments
Versioned alongside your frontend code
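Since export files travel between machines and repositories, a quick sanity check before deploying can catch hand-edited or truncated files. The sketch below is one possible check based only on the format shown above; `checkExport` is a hypothetical helper and not something the package provides.

```javascript
// Sketch only: reports structural problems in a parsed export object.
function checkExport(exported) {
  const problems = [];
  for (const key of ['name', 'version', 'exportedAt', 'collections']) {
    if (!(key in exported)) problems.push(`missing top-level key: ${key}`);
  }
  for (const col of exported.collections || []) {
    const actual = Array.isArray(col.documents) ? col.documents.length : 0;
    // documentCount should agree with the number of documents present.
    if (col.documentCount !== actual) {
      problems.push(`${col.name}: documentCount is ${col.documentCount} but found ${actual} documents`);
    }
  }
  return problems;
}

const sample = {
  name: 'blog_cms',
  version: '1.0.0',
  exportedAt: '2026-05-15T05:20:01.656Z',
  collections: [
    { name: 'posts', definition: { fields: [], indexes: [] }, documents: [{ title: 'Hello', _id: 'p1' }], documentCount: 1 }
  ]
};
console.log(checkExport(sample));
```

For a real file, parse it first, for example with `JSON.parse(fs.readFileSync('exports/blog_cms.export.json', 'utf8'))`, and pass the result to the function.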

Authentication

Passwords are hashed with PBKDF2-SHA256 and authentication uses JWTs. Configure the server via environment variables:

JWT_SECRET=your-super-secret-key
PORT=3000
DATA_DIR=./data          # Auth and prod servers respect this
EXPORT_DIR=./exports
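For context on what PBKDF2-SHA256 hashing involves, here is a minimal sketch using Node's built-in `crypto` module. It does not reproduce this package's implementation: the iteration count, salt size, key length, and the `iterations:salt:hash` storage format are all assumptions chosen for the example.

```javascript
const crypto = require('node:crypto');

// Assumed parameters for illustration; the package's real values may differ.
const ITERATIONS = 100000;
const KEY_LENGTH = 32; // bytes

function hashPassword(password) {
  const salt = crypto.randomBytes(16); // fresh random salt per password
  const hash = crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_LENGTH, 'sha256');
  return `${ITERATIONS}:${salt.toString('hex')}:${hash.toString('hex')}`;
}

function verifyPassword(password, stored) {
  const [iterations, saltHex, hashHex] = stored.split(':');
  const expected = Buffer.from(hashHex, 'hex');
  const actual = crypto.pbkdf2Sync(
    password,
    Buffer.from(saltHex, 'hex'),
    Number(iterations),
    expected.length,
    'sha256'
  );
  // Constant-time comparison avoids leaking how many bytes matched.
  return crypto.timingSafeEqual(actual, expected);
}

const stored = hashPassword('correct horse');
console.log(verifyPassword('correct horse', stored));
```

Storing the iteration count next to the hash lets the cost be raised later without invalidating existing passwords.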

Data Storage

All data persists to the filesystem in JSON files:

schema-designer-data/__schemas.docs.json - Schema definitions
schema-designer-data/<collection>.docs.json - Collection data
schema-designer-data/<collection>.meta.json - Collection metadata

Integration with Codex/Claude

Add the MCP server to your Codex configuration:

codex mcp add js-doc-store-schema-designer -- node /path/to/schema-designer-server.js

Then your LLM can:

Design database architectures on demand
Validate data against schemas
Query with MongoDB-style filters
Export and deploy to production

License

MIT
