agent-neckbeard

v1.1.16

Published

13 hours ago

Deploy AI agents to E2B and Daytona sandboxes

zacwellmer

ai agent deploy e2b daytona sandbox claude

neckbeard

Neckbeard deploys your agent code into an E2B or Daytona sandbox and runs it behind a small persistent HTTP server inside that sandbox.

The intent is simple: your production process should not run the agent. The sandbox runs the agent, keeps its filesystem/process state, and exposes a private /invoke endpoint that neckbeard calls for each turn.

This is designed to work from short-lived request handlers and long-running workers, including Vercel Functions, Hatchet workers, and Trigger.dev tasks. See docs/CONSUMER_ENVIRONMENTS.md before changing lifecycle, timeout, streaming, or cleanup behavior for those environments.

What This Does

You write an agent:

import { Agent } from 'agent-neckbeard';
import { query } from '@anthropic-ai/claude-agent-sdk';
import { z } from 'zod';

const agent = new Agent({
  sandbox: {
    provider: 'e2b',
    template: 'code-interpreter-v1',
  },
  inputSchema: z.object({ topic: z.string() }),
  outputSchema: z.object({
    title: z.string(),
    summary: z.string(),
    keyPoints: z.array(z.string()),
  }),
  envs: {
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
  },
  run: async (input) => {
    for await (const message of query({
      prompt: `Research "${input.topic}" and return JSON`,
      options: { maxTurns: 10 },
    })) {
      if (message.type === 'result') {
        return JSON.parse(message.result ?? '{}');
      }
    }
    throw new Error('Claude did not return a result');
  },
});

const sandboxId = await agent.deploy();
const result = await agent.run({ topic: 'TypeScript generics' }, { sandboxId });

deploy() bundles your agent, uploads agent.mjs and the embedded runtime server to the sandbox, starts node server.mjs, waits for /health, and returns the sandbox ID.
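The /health wait in deploy() can be pictured as a polling loop. This is only a sketch: the `waitForHealthy` helper, its attempt count, and its delay are hypothetical, and the package's real retry policy may differ. The probe is injected so the example does not assume any particular HTTP client.

```typescript
// Poll a health probe until it reports ready or the attempts run out.
// `probe` is a caller-supplied function; nothing here is neckbeard's own API.
async function waitForHealthy(
  probe: () => Promise<boolean>,
  attempts = 30,
  delayMs = 500,
): Promise<boolean> {
  for (let i = 0; i < attempts; i++) {
    try {
      if (await probe()) return true;
    } catch {
      // The server is probably not listening yet; treat as "not ready" and retry.
    }
    if (i < attempts - 1) {
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
  return false;
}
```

A probe could be as small as `async () => (await fetch(healthUrl)).ok`, where `healthUrl` is whatever address the sandbox server is reachable at.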

run() validates input locally, POSTs it to the sandbox server, reads SSE harness events, validates the final output, and returns a result of shape { ok, executionId, output | error }.
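The result shape described above lends itself to a discriminated union. The following is a minimal sketch, assuming `ok` is a boolean discriminant and that `error` carries a `message` string. `RunResult` and `unwrapResult` are hypothetical names, not exports of agent-neckbeard, so check the package's own types before relying on them.

```typescript
// Hypothetical model of the { ok, executionId, output | error } shape.
// The error payload's structure is an assumption made for illustration.
type RunResult<T> =
  | { ok: true; executionId: string; output: T }
  | { ok: false; executionId: string; error: { message: string } };

// Return the output on success, or throw with the execution ID attached.
function unwrapResult<T>(result: RunResult<T>): T {
  if (result.ok) {
    return result.output;
  }
  throw new Error(`Execution ${result.executionId} failed: ${result.error.message}`);
}
```

Keeping the execution ID in the thrown message makes it easier to correlate a host-side failure with sandbox logs.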

By default, the package infers the file to bundle from the new Agent(...) call stack. In frameworks that bundle server code before runtime, set sourceFile to the real agent module so deploy() does not rebundle a framework output chunk:

import path from 'node:path';

new Agent({
  sourceFile: path.resolve(process.cwd(), 'packages/my-agent/src/agent.ts'),
  // ...
});

For Next.js/Vercel-style deployments, keep agent-neckbeard external to the server bundle and make sure the referenced source file is included in the production artifact:

// next.config.ts
import { withNeckbeardNextConfig } from 'agent-neckbeard';

const nextConfig = withNeckbeardNextConfig({
  // your existing Next config
}, {
  routeGlob: '/api/my-agent',
  sourceFiles: ['./packages/my-agent/src/agent.ts'],
});

export default nextConfig;

The helper traces files imported by those source entries by default, including the package entry metadata that deploy() needs when it rebundles the agent at runtime. For Next 13 configs that still use experimental.serverComponentsExternalPackages, the helper detects that shape and uses the matching experimental tracing fields. Set outputFileTracingRoot when the agent source is outside the app root.

This repo tests that shape locally with npm run test:compat:next, which builds and runs a Dockerized Next App Router fixture without pushing to Vercel. The fixture also checks the runtime env contract, Node major, and traced standalone files so source-level tests do not mask deployment artifact problems.

Setup

npm install agent-neckbeard

export E2B_API_KEY=your-key
export DAYTONA_API_KEY=your-key
export ANTHROPIC_API_KEY=your-key

Pick the sandbox backend explicitly with sandbox.provider. The public sandbox.template field maps to an E2B template or a Daytona snapshot.

Daytona sandboxes can also receive network policy options supported by the Daytona SDK. Use domainAllowList for domains and wildcard domains, or networkAllowList for comma-separated CIDR ranges:

new Agent({
  sandbox: {
    provider: 'daytona',
    template: 'daytona-small',
    domainAllowList: '*.example.com',
  },
  // ...
});

Set networkBlockAll: true only when you want to block all outbound network access. Daytona rejects networkBlockAll: true combined with non-empty allow lists.
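Because Daytona rejects the conflicting combination at sandbox creation, it can help to catch it earlier on the host. The sketch below is a hypothetical pre-flight check; `checkNetworkOptions` is not part of agent-neckbeard or the Daytona SDK, and it only encodes the one rule stated above.

```typescript
// Field names mirror the Daytona sandbox options described above.
interface DaytonaNetworkOptions {
  networkBlockAll?: boolean;
  networkAllowList?: string; // comma-separated CIDR ranges
  domainAllowList?: string; // domains and wildcard domains
}

// Return a list of problems; an empty list means the combination looks valid.
function checkNetworkOptions(opts: DaytonaNetworkOptions): string[] {
  const problems: string[] = [];
  const hasAllowList =
    (opts.networkAllowList ?? '').trim() !== '' ||
    (opts.domainAllowList ?? '').trim() !== '';
  if (opts.networkBlockAll === true && hasAllowList) {
    problems.push('networkBlockAll: true cannot be combined with a non-empty allow list');
  }
  return problems;
}
```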

Run-Output Agents

Use run when each invocation should produce one validated output:

new Agent({
  sandbox: {
    provider: 'e2b',
    template: 'code-interpreter-v1',
  },
  inputSchema: z.object({ prompt: z.string() }),
  outputSchema: z.object({ answer: z.string() }),
  run: async (input, ctx) => {
    ctx.logger.info(`Handling ${ctx.executionId}`);
    return { answer: input.prompt.toUpperCase() };
  },
});

The run handler receives a context object with the executionId, an AbortSignal, environment variables, sandbox paths, and a logger:

run: async (input, ctx) => {
  const notesPath = `${ctx.sandbox.homeDir}/data/notes.txt`;
  const cwd = ctx.sandbox.agentDir;
  // ...
}

Use ctx.sandbox.homeDir and ctx.sandbox.agentDir instead of hardcoding /home/user if you want the same agent code to work on both providers.

Streaming Agents

Use server.handler when you want progress frames while the sandbox work is still running:

const agent = new Agent({
  sandbox: {
    provider: 'daytona',
    template: 'daytona-small',
  },
  inputSchema: z.object({ prompt: z.string() }),
  server: {
    frameSchema: z.object({
      type: z.enum(['status', 'token', 'done']),
      text: z.string().optional(),
    }),
    handler: async ({ input, emit }) => {
      await emit({ type: 'status', text: `Starting ${input.prompt}` });
      await emit({ type: 'token', text: 'hello' });
      await emit({ type: 'done' });
    },
  },
});

const sandboxId = await agent.deploy();

for await (const frame of agent.invoke({ prompt: 'hi' }, { sandboxId })) {
  console.log(frame);
}

Internally, streaming uses Server-Sent Events from the sandbox /invoke endpoint. Neckbeard wraps user frames in harness events so errors, completion, and run-output results are handled consistently.
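For readers unfamiliar with the wire format, here is a minimal sketch of how Server-Sent Events text splits into events. It is a generic parser for `data:` fields under the standard SSE framing (events separated by a blank line), not neckbeard's actual client, and it deliberately leaves the payload as a string because the harness event envelope is not documented here. `parseSse` is a hypothetical helper name.

```typescript
// Split SSE text into the `data` payload of each complete event.
// Returns the payloads plus any trailing partial event to prepend to the next chunk.
function parseSse(buffer: string): { events: string[]; rest: string } {
  const normalized = buffer.replace(/\r\n/g, '\n');
  const blocks = normalized.split('\n\n');
  // The last block is incomplete until its terminating blank line arrives.
  const rest = blocks.pop() ?? '';
  const events: string[] = [];
  for (const block of blocks) {
    const dataLines: string[] = [];
    for (const line of block.split('\n')) {
      if (line.startsWith('data:')) {
        // The SSE format strips a single leading space after the colon.
        dataLines.push(line.slice(5).replace(/^ /, ''));
      }
    }
    // Multiple data lines in one event are joined with newlines.
    if (dataLines.length > 0) events.push(dataLines.join('\n'));
  }
  return { events, rest };
}
```

A caller would keep `rest` between network chunks and call `parseSse(rest + nextChunk)` each time more text arrives.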

Reusing Sandboxes

Persist the sandbox ID if another process needs to call the same running sandbox server:

const sandboxId = await agent.deploy();
await saveSandboxId(sandboxId);

// Later:
const savedSandboxId = await loadSandboxId();
const result = await agent.run({ topic: 'hello' }, { sandboxId: savedSandboxId });

Neckbeard rehydrates the sandbox endpoint and provider traffic token from the sandbox ID using the configured provider SDK. E2B traffic tokens are sent as the e2b-traffic-access-token header. Daytona preview tokens are sent as the DAYTONA_SANDBOX_AUTH_KEY query parameter. Consumers only need to store the sandbox ID and configure the same provider/API key in the process that reconnects.
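The difference between the two providers' token placement can be sketched as follows. The header and query parameter names come from the paragraph above; the `buildInvokeRequest` helper, its signature, and the example host are hypothetical and only illustrate where the token ends up.

```typescript
type Provider = 'e2b' | 'daytona';

// Build the URL and headers for a call to the sandbox /invoke endpoint.
// E2B tokens travel in a header; Daytona preview tokens travel in the query string.
function buildInvokeRequest(
  provider: Provider,
  baseUrl: string,
  token: string | undefined,
): { url: string; headers: Record<string, string> } {
  const url = new URL('/invoke', baseUrl);
  const headers: Record<string, string> = {};
  if (token !== undefined) {
    if (provider === 'e2b') {
      headers['e2b-traffic-access-token'] = token;
    } else {
      url.searchParams.set('DAYTONA_SANDBOX_AUTH_KEY', token);
    }
  }
  return { url: url.toString(), headers };
}
```

Since a Daytona token ends up in the URL, it is worth redacting query strings in any host-side request logging.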

Environment Changes with Reused Sandboxes

envs are applied when Neckbeard creates the sandbox and starts the sandbox server. If the host app was missing a required value such as ANTHROPIC_API_KEY when the sandbox was created, the sandbox can remain healthy but unauthenticated even after you fix the host environment and redeploy the app.

Updating Vercel, Render, or 1Password environment variables does not mutate already-running sandboxes. Kill the old sandbox or deploy a fresh one, then replace the persisted sandbox ID. Consumers that keep long-lived sandbox IDs should treat authentication failures such as "Not logged in" or "Please run /login" as a signal to discard the sandbox, after verifying that the host environment is now correct.
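One way to act on that signal is a small classifier over error text. This is a sketch under the assumption that the failure surfaces as a message string containing one of the two phrases named above; `looksLikeStaleAuth` is a hypothetical helper, and the matching is intentionally loose.

```typescript
// Phrases the text above names as signs of a sandbox created without credentials.
const STALE_AUTH_MARKERS = ['Not logged in', 'Please run /login'];

// True when an error message suggests the sandbox itself lacks valid credentials.
function looksLikeStaleAuth(errorMessage: string): boolean {
  const lower = errorMessage.toLowerCase();
  return STALE_AUTH_MARKERS.some((marker) => lower.includes(marker.toLowerCase()));
}
```

A consumer might call this on a failed result and, if it returns true and the host environment checks out, kill the sandbox and deploy a new one.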

Direct Sandbox Access

Use SandboxClient when host code needs provider-neutral filesystem or command access without running an agent invocation:

import { SandboxClient } from 'agent-neckbeard';

const sandbox = await SandboxClient.connect({
  provider: 'e2b',
  sandboxId,
  timeoutMs: 30 * 60 * 1000,
});

await sandbox.makeDir('/home/user/documents');
await sandbox.writeFile('/home/user/documents/input.txt', 'hello');
const result = await sandbox.runCommand('ls documents', {
  cwd: '/home/user',
  timeoutMs: 60_000,
});

SandboxClient.create() and SandboxClient.connect() support E2B and Daytona. E2B supports setTimeout() for sandbox lifetime extension; providers without that lifecycle primitive throw a ConfigurationError.

Configuration

new Agent({
  sandbox: {
    provider: 'e2b' | 'daytona',
    template: string,
    // Daytona only:
    networkBlockAll?: boolean,
    networkAllowList?: string,
    domainAllowList?: string,
  },
  inputSchema: ZodSchema,
  maxDuration?: number,
  dependencies?: {
    apt?: string[],
    commands?: string[],
  },
  files?: [{ url, path }],
  claudeDir?: string,
  envs?: Record<string, string | undefined>,
  // Run-output agents require:
  outputSchema: ZodSchema,
  run: (input, ctx) => Promise<output>,
  // Streaming agents require this instead of outputSchema/run:
  server: {
    port?: number,
    frameSchema: ZodSchema,
    handler: ({ input, ctx, signal, emit }) => Promise<void>,
  },
});

The files option downloads files into the sandbox before the server starts. Relative paths resolve from the sandbox home directory: /home/user on E2B and /home/daytona on Daytona.
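The resolution rule can be illustrated with Node's POSIX path helpers. The home directories are the ones stated above; `resolveSandboxPath` is a hypothetical helper written for this example and may not match how the package resolves paths internally.

```typescript
import { posix } from 'node:path';

// Home directories per provider, as stated in the paragraph above.
const HOME_DIRS = { e2b: '/home/user', daytona: '/home/daytona' } as const;

// Resolve a files[].path value: relative paths land under the provider's home
// directory, absolute paths are returned unchanged (after normalization).
function resolveSandboxPath(provider: keyof typeof HOME_DIRS, target: string): string {
  return posix.resolve(HOME_DIRS[provider], target);
}
```

For instance, `data/notes.txt` would resolve under /home/user on E2B and under /home/daytona on Daytona, while `/tmp/notes.txt` stays where it is on both.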

The claudeDir option uploads a local .claude/ directory to the sandbox, enabling Claude Agent SDK skills. Point it at a directory containing .claude/skills/*/SKILL.md files.

The envs option passes environment variables to the sandbox. Undefined values are filtered out:

const agent = new Agent({
  sandbox: {
    provider: 'e2b',
    template: 'code-interpreter-v1',
  },
  envs: {
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
    MY_API_KEY: process.env.MY_API_KEY,
  },
  // ...
});
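The filtering of undefined values can be sketched as a plain object pass. `definedEnvs` is a hypothetical helper that shows the idea; it is not the package's internal implementation.

```typescript
// Drop entries whose value is undefined so only real strings reach the sandbox.
function definedEnvs(envs: Record<string, string | undefined>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const [key, value] of Object.entries(envs)) {
    if (value !== undefined) out[key] = value;
  }
  return out;
}
```

Note that an empty string is kept, since it is a defined value. A missing key on the host silently disappears, which is how a sandbox can end up healthy but unauthenticated.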

Some packages cannot be bundled because they spawn child processes or ship native modules. The Claude Agent SDK is one such package. These packages are automatically marked external and installed with npm in the sandbox.

Cleanup

const sandboxId = await agent.deploy();

try {
  const result = await agent.run(input, { sandboxId });
} finally {
  await agent.kill(sandboxId);
}

kill() accepts a sandbox ID. If omitted, it kills the sandbox cached on the agent instance.

Testing

npm run check

The local check runs typecheck, repository lint checks, formatting hygiene checks, unit tests, generated runtime verification, CJS build verification, and package dry-run validation.

Jest drives both the fast local suite and the live smoke suites. Each live suite automatically skips when its provider key is not set:

npm run test:e2e:e2b
npm run test:e2e:daytona

Optional environment variables:

NECKBEARD_E2E_TEMPLATE to override the default code-interpreter-v1 test template
NECKBEARD_DAYTONA_TEMPLATE to override the default Daytona smoke snapshot (daytona-small)
NECKBEARD_E2E_TIMEOUT_MS to increase the live test timeout for slower environments

License

MIT