@pixygon/knowledge-server

v0.1.2

Published

8 days ago

Storage + extraction + chunking + semantic search engine for any text knowledge. Used by @pixygon/chatbot-server for RAG; can also back a wiki/codex search layer.

imakestupidgames

@pixygon/knowledge-server

Storage + extraction + chunking + semantic search for any text knowledge.

+--------------------+      +-------------------------+
|  Host Express app  |----->|     engine.router       |
|  (your auth here)  |      +-------------------------+
+--------------------+               |
                                     v
  KnowledgeDocument · KnowledgeChunk · extractors · embedder · search

Used by:

@pixygon/chatbot-server — calls engine.search() from the RAG pipeline
(planned) Codex / wiki — calls engine.upsertExternal() on entry save to keep an embedding index in sync with the rich domain model

What it does

Documents. An operator pastes text, uploads a PDF/DOCX/XLSX/CSV/TXT/MD file, points at a URL (scraped once, then embedded), or points at a URL marked live (re-fetched at query time).
Extraction. pdf-parse, mammoth, xlsx, @mozilla/readability + jsdom cover the common ingest paths.
Chunking. Paragraph-aware splitter producing ~2 KB chunks with 400-char overlap.
Embedding. Whatever AI client the host passes — text in, vector out.
Search. Cosine-similarity top-K over the embedding index, namespace-scoped.
Namespaces. Multiple knowledge silos per tenant (chatbot, codex, wiki, help-center, …). Search covers a single namespace by default; cross-namespace queries must be requested explicitly.
External refs. A knowledge document can be linked back to a host-domain entity (a codex LoreEntity, an LMS Lesson, etc.). engine.upsertExternal() keeps the index in sync as the host model changes.
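The chunking step described above can be illustrated with a minimal sketch. This is not the package's actual splitter: `chunkText`, its parameter names, and the choice to measure size in characters are assumptions made for illustration. It packs paragraphs into chunks of roughly 2,000 characters and carries the last 400 characters of each chunk into the next one as overlap.

```typescript
// Hypothetical paragraph-aware chunker (illustrative only, not the library's code).
// Paragraphs are separated by blank lines; a chunk is closed when adding the
// next paragraph would exceed maxChars. A single paragraph longer than
// maxChars is kept whole in this sketch rather than split mid-paragraph.
function chunkText(text: string, maxChars = 2000, overlap = 400): string[] {
  if (maxChars <= 0 || overlap < 0 || overlap >= maxChars) {
    throw new RangeError("expected 0 <= overlap < maxChars");
  }

  const paragraphs = text
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: string[] = [];
  let current = "";

  for (const para of paragraphs) {
    const candidate = current.length > 0 ? current + "\n\n" + para : para;

    if (current.length > 0 && candidate.length > maxChars) {
      // Close the current chunk and seed the next one with its tail.
      chunks.push(current);
      const tail = overlap > 0 ? current.slice(-overlap) : "";
      current = tail.length > 0 ? tail + "\n\n" + para : para;
    } else {
      current = candidate;
    }
  }

  if (current.length > 0) chunks.push(current);
  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding some text twice.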

Install

npm install @pixygon/knowledge-server

Peer expectations:

express ≥ 5
mongoose ≥ 8
Node ≥ 22

Usage

import mongoose from "mongoose";
import { createKnowledge } from "@pixygon/knowledge-server";

// Any object matching { embed(text), chat?({ messages, system }) } works.
// `@pixygon/chatbot-server`'s `createAiClient` returns this shape.
const ai = {
  async embed(text: string) {
    const e = await myEmbeddingClient.embed(text);
    return { embedding: e.vector, tokens: e.tokens };
  },
  async chat({ messages, system }: any) {
    const r = await myChatClient.chat({ messages, system });
    return { content: r.text };
  },
};

const knowledge = createKnowledge({
  mongoose,
  ai,
  tenantField: "tenantId",
  tenantRefName: "Tenant",
  defaultNamespace: "default",
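  plugins: [
    // tenantScopedPlugin and auditLogPlugin are not defined in this snippet;
    // substitute the host's own Mongoose plugins.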
    (schema, label) => schema.plugin(tenantScopedPlugin, { tenantField: "tenantId", label }),
    (schema, label) => schema.plugin(auditLogPlugin, { entityType: label }),
  ],
});

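// Mount the default HTTP router under whatever path the host owns.
// `app` is the host's Express app; `verifyToken` and `tenantAccess` stand in
// for the host's own auth middleware.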
app.use("/v1/tenants/:tenantId", verifyToken, tenantAccess, knowledge.router);

// Programmatic search — used by RAG pipelines, codex search, etc.
const hits = await knowledge.search({
  tenantId, query: "fall protection rules", namespace: "chatbot", k: 5,
});

// Codex-style external sync. Idempotent — upsert by (namespace, externalRef).
await knowledge.upsertExternal({
  tenantId,
  namespace: "codex",
  externalModelName: "LoreEntity",
  externalId: loreEntity._id,
  title: loreEntity.name,
  content: loreEntity.description,
  source: `codex/${loreEntity.slug}`,
  tags: loreEntity.tags,
});
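For intuition about what a namespace-scoped, cosine-similarity top-K search involves, here is a small in-memory sketch. The `IndexedChunk` and `Hit` shapes and the `topK` function are hypothetical and are not the package's schema or API; the real engine works against its own KnowledgeChunk model, and the shape of what `knowledge.search()` returns is not documented here.

```typescript
// Illustrative shapes only; field names are assumptions.
interface IndexedChunk {
  id: string;
  namespace: string;
  text: string;
  embedding: number[];
}

interface Hit {
  id: string;
  text: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Score every chunk in one namespace against the query vector and keep the
// k highest-scoring chunks, best first.
function topK(
  queryEmbedding: number[],
  chunks: IndexedChunk[],
  namespace: string,
  k: number,
): Hit[] {
  return chunks
    .filter((c) => c.namespace === namespace)
    .map((c) => ({
      id: c.id,
      text: c.text,
      score: cosineSimilarity(queryEmbedding, c.embedding),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, Math.max(0, k));
}
```

A linear scan like this is adequate for small indexes; larger ones would typically need an approximate nearest-neighbour index, which this sketch does not cover.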

HTTP surface

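Default router (engine.router, exposed as knowledge.router in the usage example above). Paths are relative to wherever the host mounts it: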

GET    /knowledge                  ?namespace=&sourceType=
GET    /knowledge/search           ?q=&namespace=&k=
GET    /knowledge/:id
POST   /knowledge                  text       { title, content, source?, namespace?, tags? }
POST   /knowledge/upload           multipart  file, title?, namespace?, tags?
POST   /knowledge/from-url         json       { url, title?, namespace?, extractInstruction?, isLive?, liveDescription?, tags? }
PUT    /knowledge/:id              { title?, content?, source?, tags?, namespace? }
DELETE /knowledge/:id
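As a rough illustration of calling the search route from a client, the sketch below builds the request URL and sends it with `fetch`. It assumes the router is mounted at `/v1/tenants/:tenantId` as in the usage example, and that the host's auth middleware accepts a bearer token; both depend on the host application. `buildSearchUrl` and `searchKnowledge` are hypothetical helper names, and the response body is typed as `unknown` because its shape is not specified here.

```typescript
interface SearchParams {
  q: string;
  namespace?: string;
  k?: number;
}

// Build the URL for GET /knowledge/search under the assumed mount path.
function buildSearchUrl(
  baseUrl: string,
  tenantId: string,
  params: SearchParams,
): string {
  const url = new URL(
    `/v1/tenants/${encodeURIComponent(tenantId)}/knowledge/search`,
    baseUrl,
  );
  url.searchParams.set("q", params.q);
  if (params.namespace !== undefined) {
    url.searchParams.set("namespace", params.namespace);
  }
  if (params.k !== undefined) {
    url.searchParams.set("k", String(params.k));
  }
  return url.toString();
}

// Send the search request. The Authorization scheme is an assumption about
// the host's auth layer, not something the router itself defines.
async function searchKnowledge(
  baseUrl: string,
  tenantId: string,
  token: string,
  params: SearchParams,
): Promise<unknown> {
  const res = await fetch(buildSearchUrl(baseUrl, tenantId, params), {
    headers: { Authorization: `Bearer ${token}` },
  });
  if (!res.ok) {
    throw new Error(`search failed: ${res.status} ${res.statusText}`);
  }
  return res.json();
}
```

Omitting `namespace` leaves the choice to the server, which per the description above falls back to a single default namespace.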

Companion package

@pixygon/knowledge-react ships the operator UI (list, tabbed upload dialog, RTK Query hooks) — see its README for the React wire-up.

License

MIT.
