@longrun/sandstorm
v0.1.15
Sandstorm
Fly.io Machines sandbox management library. Provides a high-level API for creating, pausing, resuming, and deleting isolated sandboxes backed by Fly.io Machines, with built-in support for coding agents and persona configuration.
Installation
npm install @longrun/sandstorm
Quick Start
import { SandboxManager } from '@longrun/sandstorm';
const manager = new SandboxManager({
apiToken: process.env.FLY_API_TOKEN!,
orgSlug: 'my-org',
instanceImage: 'flyio/claude-code:latest',
instancePrefix: 'myapp-',
});
// Create a sandbox
const sandbox = await manager.createSandbox({
name: 'project-alice',
env: { GITHUB_TOKEN: '...' },
});
console.log(sandbox.url); // https://myapp-project-alice.fly.dev
// Pause (snapshot + delete machine, keep volume)
const { snapshotId } = await manager.pauseSandbox(sandbox.appName);
// Resume (restore from snapshot)
const resumed = await manager.resumeSandbox(sandbox.appName, snapshotId);
// Delete (permanent)
await manager.deleteSandbox(sandbox.appName);
Architecture
Sandstorm has two API layers:
SandboxManager (High-Level)
The primary user-facing interface. Orchestrates complex operations into single calls.
| Method | Description |
|--------|-------------|
| createSandbox(params) | Creates app + volume + machine in one call |
| deleteSandbox(id) | Deletes machine + volume + app |
| pauseSandbox(id) | Snapshots volume, deletes machine (reversible) |
| resumeSandbox(id, snapshotId?, options?) | Restores from snapshot, recreates machine |
| updateSandboxEnv(id, env) | Updates machine environment variables |
| getSandbox(id) | Gets current sandbox status |
| syncSandbox(appName, agent, persona, vars) | Re-deploys persona config to a running sandbox |
| connectInteractive(appName, agent, workingDir?, options?) | Starts an interactive terminal session via SSH |
| execCommand(appName, command) | Executes a command inside a sandbox |
| writeFile(appName, filePath, content) | Writes a file inside a sandbox |
| startAgentSession(appName, agent, id, prompt, env?) | Starts a non-interactive agent session in background |
| isAgentRunning(appName, agent) | Checks if an agent process is running |
| getSessionLogs(appName, agent, id, options?) | Gets session log content |
| watchLogs(appName, agent, id, onLine, options?) | Watches logs with polling-based tail |
| listSessions(appName, agent, id) | Lists all session log files |
| findIdleSandboxes(minutes) | Finds sandboxes with low network activity |
| runLifecycleCheck(config) | Runs lifecycle management with hooks |
Low-Level Components
Available for advanced usage via manager.client, manager.metricsClient, and manager.lifecycleManager.
- FlyMachines - Raw Machines API wrapper (apps, machines, volumes, snapshots)
- FlyMetrics - Prometheus metrics for activity detection
- LifecycleManager - State verification and reconciliation helpers
Agent System
Built-in support for coding agents that run inside sandboxes.
- ClaudeCodeAgent - Claude Code agent implementation
- claudeCode - Singleton instance of ClaudeCodeAgent
- getCodingAgent(id) - Get an agent by ID
- getDefaultCodingAgent() - Get the default agent (Claude Code)
Persona Utilities
- loadPersonaDir(dirPath) - Load all files from a persona directory into a Map<string, string>
- resolvePersona(persona) - Resolve a persona source (directory path or in-memory Map)
API Reference
SandboxManager
new SandboxManager(config: SandstormConfig)
Config
interface SandstormConfig {
apiToken: string; // Fly API token
orgSlug: string; // Fly organization slug
instanceImage: string; // Docker image for machines
instancePrefix: string; // Prefix for app names (e.g., 'myapp-')
}
createSandbox
await manager.createSandbox({
name: string; // Used in app naming: {prefix}{name}
region?: string; // Default: 'ord'
env: Record<string, string>;
volumeSizeGb?: number; // Default: 1
memoryMb?: number; // Default: 2048
cpus?: number; // Default: 2
swapSizeMb?: number; // Default: 1024
agent?: CodingAgent; // Agent to install and configure
persona?: string | Map<string, string>; // Persona dir path or in-memory files
personaVars?: Record<string, string>; // Template variable substitution: {{key}} -> value
repo?: { // Git repo to clone after machine starts
url: string;
branch?: string; // Default: 'main'
token?: string; // For private repos (injected into clone URL)
path?: string; // Clone target path (default: '/data/project')
};
}): Promise<Sandbox>
Creates a Fly app, a persistent volume, and a machine, then optionally clones a repo and configures an agent with a persona. Returns a Sandbox with the machine ID, app name, volume ID, region, status, public URL, and creation timestamp.
On failure, the app is automatically cleaned up.
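The comment on `name` and the URL printed in the Quick Start suggest how the app name and public URL relate to the configured prefix. A minimal sketch of that relationship, assuming the URL always follows the `https://{appName}.fly.dev` pattern shown in the Quick Start; `sandboxAppName` and `sandboxUrl` are hypothetical helpers, not part of the Sandstorm API:

```typescript
// Hypothetical helpers illustrating the {prefix}{name} naming convention.
// They are not exported by @longrun/sandstorm.

/** Builds the Fly app name from the configured prefix and the sandbox name. */
function sandboxAppName(instancePrefix: string, sandboxName: string): string {
  return `${instancePrefix}${sandboxName}`;
}

/** Assumption: the public URL follows the pattern shown in the Quick Start. */
function sandboxUrl(appName: string): string {
  return `https://${appName}.fly.dev`;
}

const exampleAppName = sandboxAppName('myapp-', 'project-alice');
const exampleUrl = sandboxUrl(exampleAppName);
// exampleAppName is 'myapp-project-alice'
// exampleUrl is 'https://myapp-project-alice.fly.dev'
```

Because most other methods take the app name rather than the short name, it may be convenient to store `sandbox.appName` from the returned Sandbox instead of recomputing it.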
pauseSandbox / resumeSandbox
// Pause: snapshot volume, delete machine
const { snapshotId } = await manager.pauseSandbox(appName);
// Resume: restore volume from snapshot, create new machine
const sandbox = await manager.resumeSandbox(appName, snapshotId, {
env: { ... }, // Optional: override env vars
memoryMb: 4096, // Optional: override resources
cpus: 4,
swapSizeMb: 2048,
});
Pausing is reversible: the app and volume remain, and only the machine is deleted. The only cost while paused is volume storage ($0.15/GB/month).
If snapshotId is omitted from resumeSandbox, the latest snapshot is used.
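As a rough worked example of the paused cost, the sketch below multiplies volume size by the $0.15/GB/month figure quoted above. `pausedMonthlyCostUsd` is a hypothetical helper, and the rate is taken from this README rather than from current Fly.io pricing, so treat the result as an estimate only:

```typescript
// Rate quoted in this README, kept in cents to avoid floating-point drift.
const VOLUME_CENTS_PER_GB_MONTH = 15;

/**
 * Hypothetical helper: estimated monthly cost (USD) of paused sandboxes,
 * assuming only volume storage is billed while a sandbox is paused.
 */
function pausedMonthlyCostUsd(volumeSizeGb: number, sandboxCount: number = 1): number {
  if (volumeSizeGb < 0 || sandboxCount < 0) {
    throw new RangeError('volumeSizeGb and sandboxCount must be non-negative');
  }
  return (VOLUME_CENTS_PER_GB_MONTH * volumeSizeGb * sandboxCount) / 100;
}

const singleDefaultVolume = pausedMonthlyCostUsd(1); // one sandbox, default 1 GB volume
const smallFleet = pausedMonthlyCostUsd(10, 20);     // twenty sandboxes, 10 GB each
```

With these inputs a single default sandbox comes to $0.15 per month and the twenty-sandbox fleet to $30 per month.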
updateSandboxEnv
await manager.updateSandboxEnv(appName, {
NEW_KEY: 'new-value',
EXISTING_KEY: 'updated-value',
});
Merges the new env vars with the existing ones. Changes take effect on the next machine restart.
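The merge semantics can be pictured as an object spread in which updated keys win and untouched keys are kept. This is a sketch of the described behaviour, not Sandstorm's implementation; `mergeEnv` is a hypothetical name:

```typescript
type Env = Record<string, string>;

/** Hypothetical helper: keys in `updates` override keys in `existing`. */
function mergeEnv(existing: Env, updates: Env): Env {
  return { ...existing, ...updates };
}

const currentEnv: Env = { EXISTING_KEY: 'old-value', GITHUB_TOKEN: 'abc' };
const mergedEnv = mergeEnv(currentEnv, {
  NEW_KEY: 'new-value',
  EXISTING_KEY: 'updated-value',
});
// mergedEnv keeps GITHUB_TOKEN, overwrites EXISTING_KEY, and adds NEW_KEY.
// currentEnv itself is left unmodified.
```

Under this model there is no way to remove a variable through a merge; whether the library offers one is not stated here.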
syncSandbox
await manager.syncSandbox(appName, agent, './persona-dir', {
PROJECT_NAME: 'my-project',
});
Re-deploys persona configuration to a running sandbox: re-reads the persona files from disk, re-substitutes template variables, and pushes the result to the machine. Useful after updating persona files locally.
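The template step can be sketched as a plain `{{key}}` replacement over an in-memory persona map. The helper names, the `CLAUDE.md` file name, and the choice to leave unknown placeholders untouched are all assumptions made for illustration; the library's actual behaviour may differ:

```typescript
/**
 * Hypothetical helper: replaces each {{key}} placeholder with vars[key].
 * Assumption: placeholders with no matching variable are left untouched.
 */
function substituteVars(content: string, vars: Record<string, string>): string {
  return content.replace(/\{\{([A-Za-z0-9_]+)\}\}/g, (placeholder: string, key: string) => {
    const value = Object.prototype.hasOwnProperty.call(vars, key) ? vars[key] : undefined;
    return value !== undefined ? value : placeholder;
  });
}

/** Applies substituteVars to every file in an in-memory persona (path -> content). */
function renderPersona(
  files: Map<string, string>,
  vars: Record<string, string>,
): Map<string, string> {
  const rendered = new Map<string, string>();
  files.forEach((content, path) => rendered.set(path, substituteVars(content, vars)));
  return rendered;
}

// Illustrative persona with one file; OWNER is deliberately left undefined.
const personaFiles = new Map<string, string>([
  ['CLAUDE.md', 'You are working on {{PROJECT_NAME}} for {{OWNER}}.'],
]);
const renderedPersona = renderPersona(personaFiles, { PROJECT_NAME: 'my-project' });
// renderedPersona.get('CLAUDE.md') is
// 'You are working on my-project for {{OWNER}}.'
```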
connectInteractive
await manager.connectInteractive(appName, agent, '/data/project', {
envFile: '/data/.env', // Optional: source env file before running agent
});
Starts an interactive terminal session via fly ssh console. Takes over the terminal with full PTY support and blocks until the session ends.
execCommand / writeFile
const result = await manager.execCommand(appName, 'ls /data/project');
// result: { stdout, stderr, exitCode }
await manager.writeFile(appName, '/data/config.json', '{"key": "value"}');
// Creates parent directories automatically, uses base64 encoding for safe transport
Agent Sessions
// Start a non-interactive agent session in background
const session = await manager.startAgentSession(
appName, agent, 'task-42', 'Fix the login bug', { EXTRA_VAR: '...' }
);
// session: { sessionId, logPath, latestLogPath }
// Check if agent is still running
const running = await manager.isAgentRunning(appName, agent);
// Get logs (specific session or latest)
const lines = await manager.getSessionLogs(appName, agent, 'task-42', {
sessionId: session.sessionId, // Optional: omit for latest
lines: 100, // Optional: omit for all lines
});
// Watch logs in real-time
const watcher = await manager.watchLogs(appName, agent, 'task-42', (line) => {
console.log(line);
}, { lines: 50, intervalMs: 500 });
// Later: watcher.stop()
// List all sessions for an identifier
const sessions = await manager.listSessions(appName, agent, 'task-42');
// sessions: [{ sessionId, logPath, startedAt }]
findIdleSandboxes
const idle = await manager.findIdleSandboxes(15); // 15-minute window
// Returns: IdleSandbox[] with sandboxId, appName, machineId, volumeId, lastActivityAt, idleMinutes
Queries Fly Prometheus metrics for network activity. Returns sandboxes with less than 10KB total traffic in the measurement window.
Does NOT pause anything; the consumer decides what to do with the result.
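The idle test itself reduces to comparing received plus sent bytes against the threshold. A minimal sketch, assuming "10KB" means 10 * 1024 bytes and that per-app traffic totals are already available; `TrafficSample` and `findIdleApps` are hypothetical names, not library exports:

```typescript
// Assumption: "10KB" means 10 * 1024 bytes; the library may use 10,000 instead.
const IDLE_THRESHOLD_BYTES = 10 * 1024;

/** Hypothetical shape: traffic increase per app over the measurement window. */
interface TrafficSample {
  appName: string;
  recvBytes: number;
  sentBytes: number;
}

/** Returns the names of apps whose total traffic is below the threshold. */
function findIdleApps(
  samples: TrafficSample[],
  thresholdBytes: number = IDLE_THRESHOLD_BYTES,
): string[] {
  return samples
    .filter((s) => s.recvBytes + s.sentBytes < thresholdBytes)
    .map((s) => s.appName);
}

const idleApps = findIdleApps([
  { appName: 'myapp-project-alice', recvBytes: 2000, sentBytes: 1500 },  // 3,500 bytes: idle
  { appName: 'myapp-project-bob', recvBytes: 800000, sentBytes: 40000 }, // well above: active
]);
// idleApps contains only 'myapp-project-alice'
```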
runLifecycleCheck
const result = await manager.runLifecycleCheck({
// Timeline configuration
warnAfterMinutes: 60, // Optional: warn phase
pauseAfterMinutes: 120, // Required: pause phase
deleteAfterDays: 30, // Optional: delete phase
// Sandboxes to check (from your database)
sandboxes: [
{
sandboxId: 'mach_123',
appName: 'myapp-project-alice',
machineId: 'mach_123',
volumeId: 'vol_456',
lastActivityAt: new Date('2024-01-15'),
lifecycleStatus: 'active',
},
],
// Hooks - you decide what happens
onWarn: async (sandbox) => {
await db.update({ status: 'warned' });
await sendEmail(sandbox.appName, 'Going idle soon');
},
onPause: async (sandbox, snapshotId) => {
await db.update({ status: 'paused', snapshotId });
},
onDelete: async (sandboxId, appName) => {
await db.delete(sandboxId);
},
});
console.log(result); // { warned: 0, paused: 1, deleted: 0, errors: [] }
Lifecycle States
active ──[warnAfterMinutes]──> warned ──[pauseAfterMinutes]──> paused ──[deleteAfterDays]──> deleted
             (optional)                      (required)                     (optional)
- active - Machine running, recently used
- warned - Idle for warning threshold (optional phase)
- paused - Machine deleted, volume snapshotted (low cost, reversible)
- deleted - Permanently removed
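The transitions between these states can be sketched as a pure decision function. This is an illustration only: `nextLifecycleAction` is a hypothetical name, and it assumes every threshold is measured from the last recorded activity and that only paused sandboxes are eligible for deletion, neither of which is confirmed here.

```typescript
type LifecycleStatus = 'active' | 'warned' | 'paused' | 'deleted';
type LifecycleAction = 'warn' | 'pause' | 'delete' | 'none';

interface LifecycleThresholds {
  warnAfterMinutes?: number; // optional phase
  pauseAfterMinutes: number; // required phase
  deleteAfterDays?: number;  // optional phase
}

const MINUTES_PER_DAY = 24 * 60;

/** Hypothetical decision function: which action, if any, is due for a sandbox. */
function nextLifecycleAction(
  current: LifecycleStatus,
  idleMinutes: number,
  t: LifecycleThresholds,
): LifecycleAction {
  if (current === 'paused') {
    const due = t.deleteAfterDays !== undefined
      && idleMinutes >= t.deleteAfterDays * MINUTES_PER_DAY;
    return due ? 'delete' : 'none';
  }
  if (current === 'active' || current === 'warned') {
    // Pause takes priority, so a sandbox can skip the warning phase.
    if (idleMinutes >= t.pauseAfterMinutes) return 'pause';
    if (current === 'active'
      && t.warnAfterMinutes !== undefined
      && idleMinutes >= t.warnAfterMinutes) return 'warn';
  }
  return 'none'; // already deleted, or nothing is due yet
}

const fullLifecycle: LifecycleThresholds = {
  warnAfterMinutes: 1440, pauseAfterMinutes: 4320, deleteAfterDays: 30,
};
const afterOneDay = nextLifecycleAction('active', 2000, fullLifecycle);      // 'warn'
const afterThreeDays = nextLifecycleAction('warned', 5000, fullLifecycle);   // 'pause'
const afterThirtyDays = nextLifecycleAction('paused', 43200, fullLifecycle); // 'delete'
```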
All phases except pause are optional. Configure only what you need:
// Minimal: just pause after 2 hours, no warning, no delete
{ pauseAfterMinutes: 120 }
// Full lifecycle: warn, pause, delete
{ warnAfterMinutes: 1440, pauseAfterMinutes: 4320, deleteAfterDays: 30 }
Activity Detection
Sandstorm detects idle sandboxes via Fly's Prometheus metrics endpoint:
- Queries fly_instance_net_recv_bytes and fly_instance_net_sent_bytes
- Uses increase() over a configurable time window
- Default threshold: 10KB total traffic; sandboxes below it are reported as idle
- Uses regex patterns for efficient batch querying
This approach works for non-HTTP workloads (like Claude Code agents) where Fly's built-in auto_stop_machines cannot detect activity.
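A batch query of the kind described in the list above might be assembled as follows. This is a sketch, not the query Sandstorm actually sends: `buildTrafficQuery` is a hypothetical helper, and it assumes the metrics carry an `app` label holding the Fly app name and that app names contain only lowercase letters, digits, and hyphens.

```typescript
type TrafficMetric = 'fly_instance_net_recv_bytes' | 'fly_instance_net_sent_bytes';

/**
 * Hypothetical helper: builds one PromQL query covering many apps at once.
 * Assumption: the metrics carry an `app` label holding the Fly app name.
 */
function buildTrafficQuery(
  metric: TrafficMetric,
  appNames: string[],
  windowMinutes: number,
): string {
  if (appNames.length === 0) throw new Error('at least one app name is required');
  if (!Number.isInteger(windowMinutes) || windowMinutes <= 0) {
    throw new RangeError('windowMinutes must be a positive integer');
  }
  for (const appName of appNames) {
    // Restricting names to [a-z0-9-] means no regex escaping is needed.
    if (!/^[a-z0-9-]+$/.test(appName)) {
      throw new Error(`unsupported app name: ${appName}`);
    }
  }
  const matcher = appNames.join('|'); // regex alternation, one branch per app
  return `sum by (app) (increase(${metric}{app=~"${matcher}"}[${windowMinutes}m]))`;
}

const recvQuery = buildTrafficQuery(
  'fly_instance_net_recv_bytes',
  ['myapp-project-alice', 'myapp-project-bob'],
  15,
);
// recvQuery is:
// sum by (app) (increase(fly_instance_net_recv_bytes{app=~"myapp-project-alice|myapp-project-bob"}[15m]))
```

Matching several apps in one regex keeps the request count at two per check (received and sent) regardless of how many sandboxes are being tracked.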
Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| FLY_API_TOKEN | Yes | Fly.io API token |
The orgSlug, instanceImage, and instancePrefix are passed via constructor config.
Development
npm install
npm run build # Compile TypeScript
npm run test # Run tests (vitest)
npm run test:watch # Run tests in watch mode
npm run lint # Lint source
npm run format # Format code
npm run typecheck # Type check
License
MIT
