thoughtgear

v0.1.8

Published

a month ago

Minimal agent loop: send a prompt, the model can call tools, and the loop persists transcript + state until the model is done.

dsilva2401

agent llm openai anthropic gemini tools function-calling prompt ai

ThoughtGear

A small agent loop: send a prompt, the model can call tools, the loop persists transcript + state and iterates until the model is done. It is directly inspired by the OpenClaw source code.

For the full design walkthrough, see PROMPT_HANDLER.md.

Install

npm install thoughtgear

Quick start

import { PromptHandler } from "thoughtgear";

const handler = new PromptHandler({
  context: "You are a friendly assistant. Reply in one short sentence.",
  tools: [],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: { type: "memory" },
  callbacks: {
    onPartialReply: (chunk) => process.stdout.write(chunk),
    onDone: () => console.log("\n[done]"),
  },
});

await handler.handlePrompt({ text: "Hello" });

That's the whole loop: no tools, model streams a reply, onDone fires.

Adding a tool

A tool is a { key, description, content, handler } object. The handler runs when the model calls it; its return string is sent back to the model.

const rollDice = {
  key: "roll_dice",
  description: "Roll a 6-sided die.",
  content: "Returns a random integer between 1 and 6.",
  handler: async () => String(Math.floor(Math.random() * 6) + 1),
};

const handler = new PromptHandler({
  context: "When the user asks for a dice roll, call the roll_dice tool.",
  tools: [rollDice],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: { type: "memory" },
});

await handler.handlePrompt({ text: "Roll a die for me." });

The model calls roll_dice, the handler runs, the result is fed back, and the model replies with the number.

Tool with typed arguments

Add a parameters JSON Schema so the model knows what to pass. The arguments arrive on params.

const addNumbers = {
  key: "add_numbers",
  description: "Add two numbers.",
  content: "Returns a + b.",
  parameters: {
    type: "object",
    properties: { a: { type: "number" }, b: { type: "number" } },
    required: ["a", "b"],
  },
  handler: async ({ params }) => {
    const { a, b } = params as { a: number; b: number };
    return String(a + b);
  },
};
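
The `as` cast above trusts the model to send well-formed arguments. As a minimal sketch, assuming only that params arrives as an untyped value, the cast could be replaced by a runtime check. `parseAddArgs` and `addNumbersChecked` are hypothetical names for illustration, not part of thoughtgear. A thrown error is treated as a failed tool call, as described under "Resilience" below.

```typescript
// Hypothetical helper (not part of thoughtgear): narrows untyped tool
// arguments to { a, b } or throws a descriptive error.
function parseAddArgs(params: unknown): { a: number; b: number } {
  if (typeof params !== "object" || params === null) {
    throw new Error("add_numbers: expected an object of arguments");
  }
  const { a, b } = params as Record<string, unknown>;
  if (typeof a !== "number" || !Number.isFinite(a)) {
    throw new Error("add_numbers: 'a' must be a finite number");
  }
  if (typeof b !== "number" || !Number.isFinite(b)) {
    throw new Error("add_numbers: 'b' must be a finite number");
  }
  return { a, b };
}

// Same tool as above, with the cast replaced by the runtime check.
const addNumbersChecked = {
  key: "add_numbers",
  description: "Add two numbers.",
  content: "Returns a + b.",
  parameters: {
    type: "object",
    properties: { a: { type: "number" }, b: { type: "number" } },
    required: ["a", "b"],
  },
  handler: async ({ params }: { params: unknown }) => {
    const { a, b } = parseAddArgs(params);
    return String(a + b);
  },
};
```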

Chaining tools

The loop handles chaining automatically. Give the model multiple tools and it'll call them in sequence, feeding each result into the next call:

const handler = new PromptHandler({
  context: "Use the tools for every arithmetic step. Never compute in your head.",
  tools: [addNumbers, multiplyNumbers, squareRoot],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: { type: "memory" },
});

await handler.handlePrompt({
  text: "Take 7 + 9, multiply by 12, then square root it. Give me the final answer.",
});
// → add_numbers(7,9)=16 → multiply_numbers(16,12)=192 → square_root(192)=13.856
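
The snippet above references multiplyNumbers and squareRoot, which this README does not define. A plausible sketch, modeled on the addNumbers tool, might look like the following. The `ToolArgs` type, the `formatNumber` helper, the `x` parameter name, and the rounding to three decimals are all assumptions of this example.

```typescript
// Assumed argument shape, mirroring the addNumbers handler above.
type ToolArgs = { params: unknown };

// Rounds to 3 decimals and drops trailing zeros (e.g. 13.856406 becomes "13.856").
const formatNumber = (n: number): string => String(Number(n.toFixed(3)));

const multiplyNumbers = {
  key: "multiply_numbers",
  description: "Multiply two numbers.",
  content: "Returns a * b.",
  parameters: {
    type: "object",
    properties: { a: { type: "number" }, b: { type: "number" } },
    required: ["a", "b"],
  },
  handler: async ({ params }: ToolArgs) => {
    const { a, b } = params as { a: number; b: number };
    return formatNumber(a * b);
  },
};

const squareRoot = {
  key: "square_root",
  description: "Take the square root of a non-negative number.",
  content: "Returns sqrt(x), rounded to 3 decimal places.",
  parameters: {
    type: "object",
    properties: { x: { type: "number" } },
    required: ["x"],
  },
  handler: async ({ params }: ToolArgs) => {
    const { x } = params as { x: number };
    if (x < 0) throw new Error("square_root: x must be non-negative");
    return formatNumber(Math.sqrt(x));
  },
};
```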

Sessions (multi-turn conversations)

By default every handlePrompt call is an independent run — the model has no memory of earlier calls. Pass a sessionId in the constructor to bind the handler to a session: every prompt, assistant reply, and tool result is persisted under that sessionId, and on the next call the model is fed the entire prior history.

const handler = new PromptHandler({
  sessionId: "user-42",     // ← bind to a session
  context: "You are a helpful assistant.",
  tools: [],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: { type: "memory" },
});

await handler.handlePrompt({ text: "My name is Diego." });
await handler.handlePrompt({ text: "What's my name?" });
// → the model sees the prior exchange and answers "Diego".

Notes:

The sessionId is just a string you choose (e.g. a user ID, a chat thread ID, a UUID).
Without sessionId, each handlePrompt is isolated — the model only sees that single prompt and the loop's own tool calls.
Session history is loaded via orm.getSessionHistory(sessionId). The in-memory, files, and S3 adapters fully implement this; the Mongo / SQL adapters are stubs (one-liner query you fill in).
Resume a session in a different process by passing the same sessionId and pointing at the same persistent DB (e.g. db: { type: "files", path: "./.thoughtgear" } or db: { type: "s3", bucket: "my-bucket" }).
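
To make the difference between isolated and session-bound calls concrete, here is a toy model of the behaviour described above. It is not the library's ORM: `ToySessionStore`, `buildModelInput`, and the `Message` shape are hypothetical, and only the method name getSessionHistory is borrowed from the notes.

```typescript
type Message = { role: "user" | "assistant" | "tool"; text: string };

// Toy stand-in for a persistence adapter: history keyed by sessionId.
class ToySessionStore {
  private sessions = new Map<string, Message[]>();

  append(sessionId: string, message: Message): void {
    const history = this.sessions.get(sessionId) ?? [];
    history.push(message);
    this.sessions.set(sessionId, history);
  }

  // Returns a copy so callers cannot mutate the stored history.
  getSessionHistory(sessionId: string): Message[] {
    return [...(this.sessions.get(sessionId) ?? [])];
  }
}

// What the model would be shown for a new prompt: the prior history (only
// when a sessionId is given) followed by the new user message.
function buildModelInput(store: ToySessionStore, prompt: string, sessionId?: string): Message[] {
  const prior = sessionId ? store.getSessionHistory(sessionId) : [];
  return [...prior, { role: "user", text: prompt }];
}
```

With a sessionId the second prompt is preceded by the earlier exchange; without one, the input contains only the new prompt.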

Streaming callbacks

Subscribe to whichever you need:

callbacks: {
  onPartialReply: (chunk, runId) => { /* text tokens as they stream */ },
  onToolStart:    (call, runId)  => { /* model invoked a tool */ },
  onToolResult:   (res, runId)   => { /* handler returned */ },
  onDone:         (runId)        => { /* run finished */ },
}
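
Because each callback receives the runId, one set of callbacks can serve several runs. The following sketch accumulates streamed text per run. It assumes chunk is a string (as the quick start's process.stdout.write usage suggests); `makeCollector` and its return shape are hypothetical.

```typescript
// Collects streamed text per run and records which runs have finished.
function makeCollector() {
  const textByRun = new Map<string, string>();
  const finished = new Set<string>();

  const callbacks = {
    onPartialReply: (chunk: string, runId: string) => {
      textByRun.set(runId, (textByRun.get(runId) ?? "") + chunk);
    },
    onDone: (runId: string) => {
      finished.add(runId);
    },
  };

  return {
    callbacks,
    getText: (runId: string) => textByRun.get(runId) ?? "",
    isDone: (runId: string) => finished.has(runId),
  };
}
```

The returned callbacks object would be passed as the callbacks option; getText(runId) then yields the full reply once isDone(runId) is true.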

Persistence: `orm` or `db`

PromptHandler accepts either a pre-built orm or raw db settings. The db form is just a shortcut — internally the handler constructs an ORM from your DbConfig and picks the right adapter (memory / files / s3 / mongodb / sql).

Shortcut: pass `db` settings

const handler = new PromptHandler({
  context: "You are a helpful assistant.",
  tools: [],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: { type: "mongodb", uri: "mongodb://localhost:27017", database: "thoughtgear" },
});

const { runId } = await handler.handlePrompt({ text: "Hello" });

Other supported db shapes:

db: { type: "memory" }
db: { type: "files", path: "./.thoughtgear" }
db: { type: "s3", bucket: "my-bucket", path: "thoughtgear/prod", region: "us-east-1" }
db: { type: "mongodb", uri: "...", database: "..." }
db: { type: "sql", dialect: "postgres", uri: "..." }

Files adapter (`type: "files"`)

Zero-dependency, on-disk JSON persistence — ideal for local development, CLIs, and single-process apps that don't want to stand up a database. Pass a directory and the adapter writes:

{path}/
  sessions/{sessionId}.json   # all messages + run states for the session
  runs/{runId}.json           # for runs without a sessionId
  cache.json
  memory.json

const handler = new PromptHandler({
  sessionId: "user-42",
  context: "You are a helpful assistant.",
  tools: [],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: { type: "files", path: "./.thoughtgear" },
});

Writes are atomic (write-temp-then-rename) but there is no cross-process locking — concurrent writers to the same session file can race. That's fine for single-process use; reach for s3 / mongodb if you need a multi-writer story.
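
For readers unfamiliar with the write-temp-then-rename technique, here is a generic Node sketch of it. This is not the adapter's actual code; `writeJsonAtomic`, `readJson`, and the temp-file naming are assumptions of the example.

```typescript
import { mkdirSync, readFileSync, renameSync, writeFileSync } from "node:fs";
import { dirname, join } from "node:path";

// Writes JSON to a temp file in the same directory, then renames it over the
// target. A rename within one POSIX filesystem is atomic, so readers see
// either the old file or the new one, never a half-written file.
function writeJsonAtomic(filePath: string, value: unknown): void {
  const dir = dirname(filePath);
  mkdirSync(dir, { recursive: true });
  const tmpPath = join(dir, `.${process.pid}.${Date.now()}.tmp`);
  writeFileSync(tmpPath, JSON.stringify(value, null, 2), "utf8");
  renameSync(tmpPath, filePath);
}

function readJson<T>(filePath: string): T {
  return JSON.parse(readFileSync(filePath, "utf8")) as T;
}
```

Note what this does not give you: two processes that each read, modify, and atomically write the same file can still lose one of the updates, which is the race described above.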

S3 adapter (`type: "s3"`)

Same layout as the files adapter, but keys live in an S3 bucket under an optional prefix:

{bucket}/{path}/
  sessions/{sessionId}.json
  runs/{runId}.json
  cache.json
  memory.json

const handler = new PromptHandler({
  sessionId: "user-42",
  context: "You are a helpful assistant.",
  tools: [],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  db: {
    type: "s3",
    bucket: "my-bucket",
    path: "thoughtgear/prod",        // optional key prefix
    region: "us-east-1",              // optional — falls back to AWS_REGION env
    credentials: {                    // optional — omit to use the default credential chain
      accessKeyId: process.env.AWS_ACCESS_KEY_ID!,
      secretAccessKey: process.env.AWS_SECRET_ACCESS_KEY!,
    },
  },
});

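region and credentials are both optional. When credentials is omitted, the AWS SDK's default credential chain (env vars, shared config file, IAM role) is used, which is typical for apps running on EC2 / ECS / Lambda; when region is omitted, it falls back to the AWS_REGION env var. The same race caveat as the files adapter applies: the adapter does no atomic compare-and-swap, so concurrent writers to the same key can overwrite each other.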
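
If you want to inspect the stored objects yourself, the layout above implies keys like the ones built below. This is a sketch under assumptions: the helper names are hypothetical, and whether the adapter normalizes slashes or encodes IDs the same way is not confirmed by this README.

```typescript
// Joins an optional prefix (the `path` setting) with key segments,
// trimming leading and trailing slashes from the prefix.
function joinKey(prefix: string | undefined, ...parts: string[]): string {
  const cleaned = (prefix ?? "").replace(/^\/+|\/+$/g, "");
  return [cleaned, ...parts].filter((p) => p.length > 0).join("/");
}

// Assumption of this sketch: IDs are URI-encoded so that a "/" inside an
// ID cannot create an extra key segment.
const sessionKey = (prefix: string | undefined, sessionId: string): string =>
  joinKey(prefix, "sessions", `${encodeURIComponent(sessionId)}.json`);

const runKey = (prefix: string | undefined, runId: string): string =>
  joinKey(prefix, "runs", `${encodeURIComponent(runId)}.json`);
```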

Bring your own ORM

Use this when you want to share one ORM across multiple handlers, or you need to read the transcript back yourself:

import { ORM, PromptHandler } from "thoughtgear";

const orm = new ORM({
  type: "mongodb",
  uri: "mongodb://localhost:27017",
  database: "thoughtgear",
});

const handler = new PromptHandler({
  context: "You are a helpful assistant.",
  tools: [],
  model: { name: "gpt-4o-mini", provider: "openai", apiKey: process.env.OPENAI_API_KEY! },
  orm,
});

const { runId } = await handler.handlePrompt({ text: "Hello" });

// Later — even from a different process — read the transcript back:
const history = await orm.getHistory(runId);
const state   = await orm.getRunState(runId);

Adapter status:

memory, files, s3 — fully implemented.
mongodb, sql — stubbed in src/classes/PromptHandler.ts; fill in the eight OrmAdapter methods using the mongodb / pg / kysely drivers to make them live. Mongo collections used: messages, run_states, cache, memory.

Executors

Each iteration of the agent loop keeps no state in process memory: the run, transcript, and tool results are persisted to the ORM before the iteration returns. An Executor decides how the next iteration gets driven. The loop semantics are the same either way; the choice is operational.

interface Executor {
  scheduleNextIteration(runId: string): Promise<void>;
}

Two are built in. You pass one via executor on the constructor; the default is LocalExecutor.

`LocalExecutor` (default)

Drives the next iteration in the same process by awaiting handler.continueRun(runId). This is what you want for any single-process app — a script, a server handling a request end-to-end, a CLI, tests.

import { PromptHandler, LocalExecutor } from "thoughtgear";

const handler = new PromptHandler({
  context: "...",
  tools: [...],
  model: { ... },
  db: { type: "memory" },
  // executor: new LocalExecutor(),   // implicit — this is the default
});

await handler.handlePrompt({ text: "..." });   // resolves when the whole run finishes

handlePrompt / continueRun resolve only once the model is done iterating, so callers can await the full run.

`LambdaExecutor`

Persists state, fires a fresh invocation of your Lambda with { runId, action: "continue" }, and returns immediately. The next tick of the loop runs in a new invocation that loads state from the shared ORM.

import { PromptHandler, LambdaExecutor, makeLambdaHandler } from "thoughtgear";
import { LambdaClient, InvokeCommand } from "@aws-sdk/client-lambda";

const lambda = new LambdaClient({});
const executor = new LambdaExecutor(async (payload) => {
  await lambda.send(new InvokeCommand({
    FunctionName: process.env.SELF_FUNCTION_NAME!,   // this function's own ARN/name
    InvocationType: "Event",                          // fire-and-forget
    Payload: Buffer.from(JSON.stringify(payload)),
  }));
});

const handler = new PromptHandler({
  context: "...",
  tools: [...],
  model: { ... },
  db: { type: "s3", bucket: "my-bucket", path: "thoughtgear/prod" },
  executor,
});

export const lambdaHandler = makeLambdaHandler(handler);

makeLambdaHandler routes events for you:

type LambdaEvent =
  | { action: "start";    text: string; files?: FileAttachment[] }
  | { action: "continue"; runId: string };

So one Lambda function serves both the initial prompt (action: "start") and every continuation tick (action: "continue").
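
The routing itself is a small dispatch on the action field. The sketch below shows one way such a router could be written; it is not the source of makeLambdaHandler. `LoopHandler` and `routeEvent` are hypothetical names, and passing files through to handlePrompt is an assumption based on the event shape.

```typescript
// Event shape from the section above; FileAttachment is left opaque here.
type FileAttachment = unknown;
type LambdaEvent =
  | { action: "start"; text: string; files?: FileAttachment[] }
  | { action: "continue"; runId: string };

// Minimal surface of the handler that routing needs (an assumption of this
// sketch, based on the handlePrompt / continueRun calls used in this README).
interface LoopHandler {
  handlePrompt(input: { text: string; files?: FileAttachment[] }): Promise<unknown>;
  continueRun(runId: string): Promise<unknown>;
}

function routeEvent(event: LambdaEvent, handler: LoopHandler): Promise<unknown> {
  switch (event.action) {
    case "start":
      return handler.handlePrompt({ text: event.text, files: event.files });
    case "continue":
      return handler.continueRun(event.runId);
    default: {
      // Compile-time exhaustiveness check; also guards against bad payloads.
      const unreachable: never = event;
      throw new Error(`Unknown event: ${JSON.stringify(unreachable)}`);
    }
  }
}
```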

Requirements:

Shared persistence. Use s3, mongodb, or sql — memory won't survive an invocation boundary and files is single-host.
Self-invoke permission. The Lambda's IAM role needs lambda:InvokeFunction on its own ARN, plus whatever the persistence adapter needs.
Idempotency. With fire-and-forget invocations, an upstream retry could in theory schedule the same runId twice; the ORM has no atomic compare-and-swap on files/s3. In practice this is rare, but worth knowing if you're at high volume.

When to pick which

| Scenario | Executor | Why |
| --- | --- | --- |
| Local script, CLI, single-process server | LocalExecutor | No infra needed; awaitable end-to-end. |
| HTTP server returning the final answer in one response | LocalExecutor | The request handler awaits the whole loop. |
| HTTP server returning runId immediately, client polls | either | Use Local with a background worker, or Lambda for serverless. |
| Long-running agent runs (many tool calls, big chains) | LambdaExecutor | Each iteration fits inside one invocation, so there is no 15-min Lambda cap risk. |
| Bursty workloads, scale-to-zero | LambdaExecutor | Pay only for active iterations; no idle worker. |
| Same code in dev and prod | both | Swap the executor at construction time; everything else stays identical. |

Custom executors

Anything that implements scheduleNextIteration(runId) works. Useful scenarios:

Queue-backed worker — push { runId, action: "continue" } to SQS / Redis / Cloud Tasks; a separate worker pool dequeues and calls continueRun(runId). Buys you backpressure and retries the framework doesn't give you natively.
Cron / scheduled continuation — schedule the next tick instead of firing it immediately (e.g. to throttle, or wait on an external event).
Cross-region failover — invoke a Lambda in a different region when the primary is degraded.

Skeleton:

import { SQSClient, SendMessageCommand } from "@aws-sdk/client-sqs";
import type { Executor } from "thoughtgear";

class SqsExecutor implements Executor {
  constructor(private queueUrl: string, private sqs: SQSClient) {}
  async scheduleNextIteration(runId: string) {
    await this.sqs.send(new SendMessageCommand({
      QueueUrl: this.queueUrl,
      MessageBody: JSON.stringify({ runId, action: "continue" }),
    }));
  }
}

Your worker then reads the queue and calls handler.continueRun(runId) per message.
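
The worker side is not shown in this README. A minimal sketch of the per-message logic, independent of any particular queue client, might look like this. `Continuable`, `parseContinueMessage`, and `processMessage` are hypothetical names; the message shape is the { runId, action: "continue" } payload from the skeleton above.

```typescript
// Minimal surface the worker needs from the handler.
interface Continuable {
  continueRun(runId: string): Promise<unknown>;
}

// Parses a queue message body and returns the runId, or null when the body
// is not a valid continue message.
function parseContinueMessage(body: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const { action, runId } = parsed as Record<string, unknown>;
  if (action !== "continue" || typeof runId !== "string" || runId.length === 0) return null;
  return runId;
}

// Returns true if the message was handled and can be deleted from the queue.
async function processMessage(body: string, handler: Continuable): Promise<boolean> {
  const runId = parseContinueMessage(body);
  if (runId === null) return false; // malformed: leave it to a dead-letter policy
  await handler.continueRun(runId);
  return true;
}
```

Returning false for malformed bodies rather than throwing keeps one bad message from crashing the worker loop; what happens to that message afterwards is a queue-configuration decision.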

Switching providers

Just change model.provider:

model: { name: "claude-opus-4-7", provider: "anthropic", apiKey: "..." }
model: { name: "gemini-2.5-pro",  provider: "google",    apiKey: "..." }
model: { name: "gpt-4o-mini",     provider: "openai",    apiKey: "..." }
model: { name: "mock",            provider: "mock",      apiKey: "" }  // for tests

Optional: `maxTokens`

Cap the output per response with model.maxTokens. Mappings:

| Provider | Forwarded as | Default when omitted |
| --- | --- | --- |
| anthropic | max_tokens | 4096 (Anthropic requires it) |
| openai | max_completion_tokens | provider default |
| google | maxOutputTokens | provider default |

model: {
  name: "claude-opus-4-7",
  provider: "anthropic",
  apiKey: "...",
  maxTokens: 8192,
}
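
The mapping table can be read as a small function. The sketch below restates it in code for illustration; `tokenLimitParam` is a hypothetical name and this is not the library's internal implementation.

```typescript
type Provider = "anthropic" | "openai" | "google";

// Returns the provider-specific request field for an optional maxTokens,
// following the table above. An empty object means "send nothing and let
// the provider apply its own default".
function tokenLimitParam(provider: Provider, maxTokens?: number): Record<string, number> {
  switch (provider) {
    case "anthropic":
      // Anthropic requires the field, so a default is always sent.
      return { max_tokens: maxTokens ?? 4096 };
    case "openai":
      return maxTokens === undefined ? {} : { max_completion_tokens: maxTokens };
    case "google":
      return maxTokens === undefined ? {} : { maxOutputTokens: maxTokens };
  }
}
```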

Resilience: automatic error retries

The loop is self-healing. If an iteration fails for any of these reasons, the run does not terminate — instead a system-role note is appended to the transcript describing the error, and the next iteration is scheduled so the model can see the failure and try again:

llm.stream() throws (network blip, rate limit, malformed response).
The model returns stopReason: "error" (refusal / safety filter / content filter).
One or more tool handlers throw (any tool_result with isError: true).

After maxErrorRetries consecutive failures (default 10), the run is marked as failed, with lastError describing what tripped the cap. A successful turn resets the counter to 0, so the budget protects against persistent breakage without punishing intermittent flakes.

new PromptHandler({
  context: "...",
  tools: [...],
  model: {...},
  db: { type: "memory" },
  maxErrorRetries: 10,   // optional; defaults to 10
  maxIterations: 16,     // optional; defaults to 16
});

Error retries are budgeted separately from maxIterations — a failed attempt does not consume an iteration slot. A run that keeps failing terminates via maxErrorRetries; a run that keeps making real progress terminates via maxIterations.
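
The two budgets can be modeled as a pair of counters. The following is an illustrative model of the rules described above, not the library's code: the `Budget` shape, the status names, and the exact boundary at which maxIterations trips are assumptions, and a run ending normally because the model is done is not modeled.

```typescript
type Outcome = "ok" | "error";
type Status = "running" | "failed" | "max_iterations";

interface Budget {
  iterations: number;        // successful iterations so far
  consecutiveErrors: number; // reset to 0 by any successful turn
  status: Status;
}

// Applies one iteration outcome to the counters.
function step(b: Budget, outcome: Outcome, maxIterations = 16, maxErrorRetries = 10): Budget {
  if (b.status !== "running") return b;
  if (outcome === "error") {
    // A failed attempt does not consume an iteration slot.
    const consecutiveErrors = b.consecutiveErrors + 1;
    return {
      ...b,
      consecutiveErrors,
      status: consecutiveErrors >= maxErrorRetries ? "failed" : "running",
    };
  }
  const iterations = b.iterations + 1;
  return {
    iterations,
    consecutiveErrors: 0,
    status: iterations >= maxIterations ? "max_iterations" : "running",
  };
}
```

For example, the sequence error, error, ok, error leaves the run at one iteration with one consecutive error, while ten errors in a row fail the run without using any iteration slots.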

Running the tests

# Put OPENAI_API_KEY in tests/.env
npm test

The test suite covers a plain greeting, a single-tool call, and a 3-tool chain.
