@dosymbek/qcoreai-client

v1.0.0

Published

a month ago

TypeScript client for AEVION QCoreAI multi-agent pipeline — sync, streaming, refine, tags, eval harness, prompts library, threading, templates, batch runs, scheduled batches, workspaces, custom pipelines, notebook collections, run insights, cost breakdown

dosymbek

aevion qcoreai multi-agent llm anthropic openai gemini deepseek grok sse webhook

@aevion/qcoreai-client

TypeScript client for AEVION QCoreAI — a multi-agent LLM pipeline with sequential / parallel / debate strategies, mid-run human guidance, hard cost caps, run tagging and signed webhooks.

Single-file SDK (~300 LOC). No runtime deps. Works in Node 18+ and modern browsers / Edge.

npm install @aevion/qcoreai-client

Quick start

import { QCoreClient } from "@aevion/qcoreai-client";

const client = new QCoreClient({
  baseUrl: "https://api.aevion.app",
  token: process.env.AEVION_TOKEN, // optional; only needed for owner-scoped endpoints
});

// 1. Sync — collect the whole stream into a final answer.
const result = await client.runSync({
  input: "Compare Postgres vs DynamoDB for an event-sourced ledger.",
  strategy: "sequential", // | "parallel" | "debate"
  maxCostUsd: 0.10,
});

console.log(result.finalContent);
console.log("Cost:", result.totalCostUsd, "Run:", result.runId);

Streaming

for await (const evt of client.runStream({
  input: "Plan a 30-day onboarding for a B2B SaaS",
  strategy: "debate",
})) {
  if (evt.type === "agent_chunk") process.stdout.write(evt.delta);
  if (evt.type === "run_complete") {
    console.log("\n[done]", evt.totalCostUsd, evt.totalDurationMs + "ms");
  }
}

Every event matches the server's OrchestratorEvent union — see src/index.ts for the exhaustive type. Common types include:

session { sessionId, runId } — emitted first; capture for downstream API calls
agent_start { role, stage, instance?, provider, model }
agent_chunk { role, stage, delta } — token deltas
agent_end { role, stage, tokensIn, tokensOut, durationMs, costUsd, content }
verdict { approved, feedback } — sequential strategy critic verdict
guidance_applied { nextRole, nextStage, text } — mid-run user steer landed
cost_cap_hit { spentUsd, capUsd } — hard cap crossed, run finalised early
run_complete { finalContent, status, totalCostUsd, totalDurationMs }
error { message }
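The event shapes listed above lend themselves to a small reducer that folds a finished stream into a summary. The following is a minimal sketch, not part of the SDK: `SketchEvent`, `RunSummary` and `summarize` are hypothetical names, and the local type mirrors only four of the listed events rather than the exported `OrchestratorEvent` union.

```typescript
// Local, simplified mirror of a few event shapes from the list above.
// These are illustrative types, not the SDK's exported ones.
type SketchEvent =
  | { type: "agent_end"; role: string; stage: string; tokensIn: number; tokensOut: number; durationMs: number; costUsd: number; content: string }
  | { type: "cost_cap_hit"; spentUsd: number; capUsd: number }
  | { type: "run_complete"; finalContent: string; status: string; totalCostUsd: number; totalDurationMs: number }
  | { type: "error"; message: string };

interface RunSummary {
  costByRole: Record<string, number>; // summed agent_end.costUsd per role
  capHit: boolean;                    // true if a cost_cap_hit event was seen
  finalContent: string | null;        // from run_complete, if it arrived
  errors: string[];                   // messages from error events
}

function summarize(events: SketchEvent[]): RunSummary {
  const summary: RunSummary = { costByRole: {}, capHit: false, finalContent: null, errors: [] };
  for (const evt of events) {
    switch (evt.type) {
      case "agent_end":
        summary.costByRole[evt.role] = (summary.costByRole[evt.role] ?? 0) + evt.costUsd;
        break;
      case "cost_cap_hit":
        summary.capHit = true;
        break;
      case "run_complete":
        summary.finalContent = evt.finalContent;
        break;
      case "error":
        summary.errors.push(evt.message);
        break;
    }
  }
  return summary;
}
```

In a live run the same `switch` could sit inside the `for await` loop over `client.runStream(...)`, updating the summary as each event arrives.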

WebSocket duplex (mid-run guidance on the same connection)

const session = client.runWS({
  input: "Plan a 30-day onboarding for a B2B SaaS",
  strategy: "debate",
});

// Steer mid-run from a separate event handler:
setTimeout(() => session.interject("Add a TL;DR section at the top"), 3000);

for await (const evt of session.events) {
  if (evt.type === "agent_chunk") process.stdout.write(evt.delta);
  if (evt.type === "guidance_applied") console.log("\n[steered]", evt.text);
}

In Node 22+ WebSocket is a global. For older Node:

import { WebSocket } from "ws";
const session = client.runWS({ input: "...", WebSocketImpl: WebSocket as any });

Server endpoint: /api/qcoreai/ws. Authenticate with ?token=<JWT>. Rate limit: 30 upgrades per minute per IP. Messages are capped at 64 KB, and up to 8 guidance messages of at most 4 KB each can be pending at once.
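Because the server limits pending guidance, it can be useful to check a message before calling interject. The following is a minimal sketch of such a client-side guard. `checkGuidance` and `GuidanceCheck` are hypothetical names, and reading "4 KB" as 4096 UTF-8 bytes is an assumption; the server may count differently.

```typescript
// Limits quoted above; treating 4 KB as 4096 UTF-8 bytes is an assumption.
const MAX_GUIDANCE_BYTES = 4 * 1024;
const MAX_PENDING_GUIDANCE = 8;

type GuidanceCheck = { ok: true } | { ok: false; reason: string };

// Returns ok:false with a reason when the message would likely be rejected.
function checkGuidance(text: string, pendingCount: number): GuidanceCheck {
  if (pendingCount >= MAX_PENDING_GUIDANCE) {
    return { ok: false, reason: `already ${pendingCount} guidance messages pending` };
  }
  // Measure encoded bytes, not string length: non-ASCII characters take 2-4 bytes.
  const bytes = new TextEncoder().encode(text).length;
  if (bytes > MAX_GUIDANCE_BYTES) {
    return { ok: false, reason: `guidance is ${bytes} bytes, limit is ${MAX_GUIDANCE_BYTES}` };
  }
  return { ok: true };
}
```

A caller would track its own pending count (incremented on interject, decremented on each guidance_applied event) and only call `session.interject(text)` when the check passes.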

Refining a run

// Apply a one-pass surgical edit on top of an already-finished run.
await client.refine(runId, "Add a TL;DR section at the top.");

Tags + search

await client.setTags(runId, ["investor-deck", "ledger-research"]);

// Substring search across input/finalContent/session.title/tags.
const hits = await client.search("ledger");
hits.forEach((h) => console.log(h.matched, h.preview));

// Top tags ranked by count — drives the sidebar chip strip.
const top = await client.topTags(15);

Daily timeseries (cost forecasting)

const series = await client.timeseries(30);
// series: [{ date: "2026-04-22", runs: 4, costUsd: 0.123 }, ...]
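One simple way to turn the daily series into a forecast is to project the mean daily cost forward. The following is a rough sketch under that assumption: `DailyPoint` mirrors the shape shown in the comment above, while `forecastCost` is a hypothetical helper, not an SDK method.

```typescript
// Shape of one entry, as shown in the timeseries example above.
interface DailyPoint {
  date: string;
  runs: number;
  costUsd: number;
}

// Naive projection: average daily cost over the window times the horizon.
// Ignores trend and weekday seasonality.
function forecastCost(series: DailyPoint[], horizonDays: number): number {
  if (series.length === 0) return 0;
  const total = series.reduce((sum, point) => sum + point.costUsd, 0);
  return (total / series.length) * horizonDays;
}
```

For example, `forecastCost(await client.timeseries(30), 30)` would estimate next month's spend from the last 30 days.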

Agent marketplace

// 1. Publish a preset.
const { id } = await client.sharePreset({
  name: "Investor pitch lineup",
  description: "Sequential — Sonnet writer + Haiku critic",
  strategy: "sequential",
  overrides: { writer: { provider: "anthropic", model: "claude-sonnet-4-20250514" } },
});

// 2. Browse what others have published.
const top = await client.browsePresets();
const pitchPresets = await client.browsePresets("investor");

// 3. Import to bump the importCount + use locally.
const imported = await client.importPreset(top[0].id);
console.log("Got preset:", imported.name, imported.strategy, imported.overrides);

// 4. (Owner) delete one of your shared presets.
await client.deletePreset(id);

Eval harness

Track quality regressions by running a fixed suite of test cases through your multi-agent pipeline. Each case has an input prompt and a judge (contains / not_contains / equals / regex / min_length / max_length / llm_judge). The runner aggregates the case results into a weighted score between 0 and 1 so you can chart it over time.

// 1. Create a suite.
const suite = await client.createEvalSuite({
  name: "Onboarding writer regression",
  description: "Catch days where the writer drops the TL;DR section",
  strategy: "sequential",
  cases: [
    {
      id: "c1",
      name: "Has TL;DR",
      input: "Plan a 30-day onboarding for a B2B SaaS",
      judge: { type: "contains", needle: "TL;DR", caseSensitive: false },
    },
    {
      id: "c2",
      name: "Min length",
      input: "Plan a 30-day onboarding for a B2B SaaS",
      judge: { type: "min_length", chars: 800 },
    },
    {
      id: "c3",
      name: "No banned phrasing",
      input: "Plan a 30-day onboarding for a B2B SaaS",
      judge: { type: "not_contains", needle: "as a large language model" },
    },
    {
      id: "c4",
      name: "Tone is friendly + actionable",
      input: "Plan a 30-day onboarding for a B2B SaaS",
      judge: {
        type: "llm_judge",
        rubric: "The output must read as a friendly senior PM giving concrete week-by-week actions.",
        passThreshold: 0.7,
      },
    },
  ],
});

// 2. Run it (and wait for completion).
const result = await client.runEvalSuiteAndWait(suite.id, {
  concurrency: 3,
  perCaseMaxCostUsd: 0.05,
  timeoutMs: 5 * 60_000,
});

console.log(`Score: ${(result.score! * 100).toFixed(1)}%`);
for (const r of result.results) {
  console.log(`${r.passed ? "✔" : "✘"} ${r.caseName} — ${r.reason}`);
}

// 3. Track regressions over time.
const history = await client.listSuiteRuns(suite.id, 30);
const trend = history.filter((r) => r.status === "done").map((r) => r.score);
console.log("Last 30 scores:", trend);

Or kick off a run without blocking and poll yourself:

const run = await client.runEvalSuite(suite.id);
while (run.status === "running") {
  await new Promise((r) => setTimeout(r, 1500));
  Object.assign(run, await client.getEvalRun(run.id));
  console.log(`progress: ${run.results.length}/${run.totalCases}`);
}
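To make the judge types and the weighted score more concrete, here is a minimal local sketch of how the deterministic judges used in the suite above could be evaluated. It is not the server's implementation: `judgeOutput`, `weightedScore` and the per-case `weight` field are hypothetical, and treating an omitted `caseSensitive` as case-insensitive is an assumption.

```typescript
// Only the judge shapes that appear in the suite example above.
type Judge =
  | { type: "contains"; needle: string; caseSensitive?: boolean }
  | { type: "not_contains"; needle: string; caseSensitive?: boolean }
  | { type: "min_length"; chars: number };

function judgeOutput(judge: Judge, output: string): boolean {
  switch (judge.type) {
    case "contains":
    case "not_contains": {
      // Assumption: matching is case-insensitive unless caseSensitive is true.
      const hay = judge.caseSensitive ? output : output.toLowerCase();
      const needle = judge.caseSensitive ? judge.needle : judge.needle.toLowerCase();
      const found = hay.indexOf(needle) !== -1;
      return judge.type === "contains" ? found : !found;
    }
    case "min_length":
      return output.length >= judge.chars;
  }
}

// Hypothetical per-case weight; defaults to 1 when omitted.
interface CaseOutcome {
  passed: boolean;
  weight?: number;
}

// Weighted fraction of passing cases, in the range 0..1.
function weightedScore(outcomes: CaseOutcome[]): number {
  let total = 0;
  let earned = 0;
  for (const outcome of outcomes) {
    const weight = outcome.weight ?? 1;
    total += weight;
    if (outcome.passed) earned += weight;
  }
  return total === 0 ? 0 : earned / total;
}
```

A local check like this can help debug a failing case before spending a full suite run on it.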

Per-user webhooks

Configure a personal webhook that receives run.completed events with HMAC signatures.

await client.setUserWebhook(
  "https://your-receiver.example.com/qcore",
  "any-strong-shared-secret"
);

The server POSTs a JSON payload to that URL with two headers:

X-QCore-Signature: <hex HMAC-SHA256 of body using your secret>
X-QCore-Origin: env | user

Verify it on your receiver:

import { verifyWebhookHmac } from "@aevion/qcoreai-client";
import express from "express";

const app = express();

app.post("/qcore-webhook", express.raw({ type: "*/*" }), async (req, res) => {
  const ok = await verifyWebhookHmac(
    req.body,
    req.headers["x-qcore-signature"],
    process.env.QCORE_WEBHOOK_SECRET!
  );
  if (!ok) return res.status(401).end();
  const evt = JSON.parse(req.body.toString("utf8"));
  console.log("run.completed", evt.runId, evt.status, evt.totalCostUsd);
  res.json({ ok: true });
});

verifyWebhookHmac uses Web Crypto SubtleCrypto + constant-time comparison. Works in Node 18+, Cloudflare Workers, Vercel Edge.
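The constant-time comparison matters because a naive `===` on signatures can return early at the first differing character, which leaks timing information. The following is a minimal sketch of the idea for hex digests. `constantTimeEqualHex` is a hypothetical helper and is not taken from the SDK source.

```typescript
// Compares two hex strings without exiting early on the first mismatch.
function constantTimeEqualHex(expected: string, received: string): boolean {
  // Hex digests may arrive in either case; normalise before comparing.
  const a = expected.toLowerCase();
  const b = received.toLowerCase();
  // The length of an HMAC-SHA256 hex digest is fixed and public (64 chars),
  // so rejecting on a length mismatch does not reveal anything about the secret.
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    // Accumulate differences with XOR/OR so every character is always visited.
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}
```

In a receiver that computes its own HMAC, this would replace a direct string comparison between the computed digest and the X-QCore-Signature header value.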

API reference

| Method | HTTP | Notes |
|---|---|---|
| runSync(opts) | POST /api/qcoreai/multi-agent | Buffers stream into RunSyncResult |
| runStream(opts) | POST /api/qcoreai/multi-agent | Async generator of OrchestratorEvent |
| runWS(opts) | WS /api/qcoreai/ws | Duplex: events + interject(text) + stop() |
| sharePreset(opts) | POST /api/qcoreai/presets/share | Auth, returns { id } |
| browsePresets(query?, limit?) | GET /api/qcoreai/presets/public | Public catalog |
| importPreset(id) | POST /api/qcoreai/presets/:id/import | Bumps importCount |
| deletePreset(id) | DELETE /api/qcoreai/presets/:id | Owner-only |
| refine(runId, instruction, opts?) | POST /api/qcoreai/runs/:id/refine | One-pass surgical edit |
| setTags(runId, tags) | PATCH /api/qcoreai/runs/:id/tags | Owner-only, normalized 16x32 |
| search(query, limit?) | GET /api/qcoreai/search?q= | Substring + tag match |
| topTags(limit?) | GET /api/qcoreai/tags?limit= | Ranked by usage |
| timeseries(days?) | GET /api/qcoreai/analytics/timeseries?days= | Daily buckets |
| setUserWebhook(url, secret?) | PUT /api/qcoreai/me/webhook | Auth required |
| deleteUserWebhook() | DELETE /api/qcoreai/me/webhook | Auth required |
| verifyWebhookHmac(body, sig, secret) | — | Receiver-side utility |
| createEvalSuite(opts) | POST /api/qcoreai/eval/suites | Auth |
| listEvalSuites(limit?) | GET /api/qcoreai/eval/suites | Auth |
| getEvalSuite(id) | GET /api/qcoreai/eval/suites/:id | Owner |
| updateEvalSuite(id, patch) | PATCH /api/qcoreai/eval/suites/:id | Owner |
| deleteEvalSuite(id) | DELETE /api/qcoreai/eval/suites/:id | Owner |
| runEvalSuite(id, opts?) | POST /api/qcoreai/eval/suites/:id/run | Async, returns in-flight EvalRun |
| getEvalRun(id) | GET /api/qcoreai/eval/runs/:id | Poll for progress |
| listSuiteRuns(id, limit?) | GET /api/qcoreai/eval/suites/:id/runs | Regression history |
| runEvalSuiteAndWait(id, opts?) | — | Convenience: kick off + poll until done |

Browser usage

The client uses standard fetch and ReadableStream — works in browsers without polyfills. For SSE you can either let the SDK buffer (use runSync) or iterate (runStream) — same code in Node and browsers.

Auth

Owner-scoped endpoints (sessions list, run rename/delete, tags, webhook config, search results scoped to your user) require a JWT in Authorization: Bearer <token>. Pass the token at construction time or rotate via setToken.

runSync, runStream, and public search (anonymous-only results) work without auth, which is useful for embedding QCoreAI in unauthenticated public landing pages.