@supermachine/core

v0.7.103

Published

8 days ago

Run any OCI/Docker image as a hardware-isolated microVM. Node/Bun/Deno binding for the supermachine Rust crate.

nedomas

microvm hypervisor oci docker snapshot sandbox vm macos hvf apple-silicon

@supermachine/core

Run any OCI/Docker image as a hardware-isolated microVM, from Node.js / Bun / Deno. Sub-100 ms cold start, ~3 MiB RAM per idle VM, full API parity with the supermachine Rust crate via napi-rs.

npm install @supermachine/core

No Rust toolchain required. The package ships a prebuilt, code-signed native addon via the @supermachine/core-darwin-arm64 optional dependency; npm's os / cpu filter installs it only on macOS Apple Silicon (currently the only supported platform). At runtime you need:

macOS Apple Silicon (arm64)
Node ≥ 18.17, Bun, or Deno 2+
Network access on first Image.build() (OCI registry pull via curl, no Docker daemon required — same pull model as libkrun / krunvm). After bake, snapshots are local and offline-usable.

For private registries, supermachine reads ~/.docker/config.json (the same auth file Docker writes via docker login). You can create that file by hand if you don't have Docker installed. To force the legacy Docker-daemon-based pull path, set SUPERMACHINE_IMAGE_SOURCE=docker.
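If you are creating the auth file by hand, the sketch below builds the JSON for one registry. The `auths` / `auth` layout is Docker's standard config.json format; it assumes supermachine accepts inline `auths` entries (credential helpers are not covered here). The registry host, username, and token are placeholders.

```typescript
// Sketch: build the JSON that `docker login` would write for one registry.
// The registry host and credentials used below are placeholders.
interface DockerConfig {
  auths: Record<string, { auth: string }>;
}

function dockerConfigFor(registry: string, username: string, token: string): DockerConfig {
  // Docker stores base64("username:password") under auths[registry].auth.
  const auth = btoa(`${username}:${token}`);
  return { auths: { [registry]: { auth } } };
}

const config = dockerConfigFor("ghcr.io", "octocat", "example-token");
console.log(JSON.stringify(config, null, 2));
```

Save the printed JSON as ~/.docker/config.json and restrict its permissions, since the `auth` value is only encoded, not encrypted.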

import { Image } from "@supermachine/core";

const image = await Image.build({ ref: "alpine:latest" });
const pool  = await image.pool({ min: 4, max: 16 });
const vm    = await pool.acquire();
try {
  const { stdout, exitCode } = await vm.exec({
    argv: ["echo", "hello from the VM"],
    timeoutMs: 5_000,
  });
  console.log(stdout.toString());   // "hello from the VM\n"
} finally {
  await vm.release();
}

Why microVMs

Same hardware isolation as Firecracker, libkrun, or qemu — but the acquire latency you'd expect from a container. Designed for use cases where you'd otherwise pick between containers (fast but shared-kernel) and full VMs (isolated but slow):

Agent runtimes — spin up a fresh, hardware-isolated environment per agent task, run untrusted code, throw it away. ~5 ms cycle, no kernel-namespace escapes to worry about.
CI / build verifiers — compile + run untrusted code (MultiPL-E, Codeforces solutions, student submissions) without trusting the host.
Per-tenant sandboxes — run customer code per request without relying on fork+exec as the security boundary.
Code execution APIs — same as the OpenAI Code Interpreter pattern, hardware-isolated.

If your bar is "fast enough fork" and you trust the workload, use containers. If your bar is "I'm running this code from an LLM and the host needs to survive", use this.

Status

Currently supported: macOS on Apple Silicon (darwin-arm64, HVF backend). Roadmap: Linux KVM (linux-arm64-gnu, linux-x64-gnu), Windows WHP.

On platforms that are not yet supported, the package still installs cleanly and emits a clear error on the first call. Because the native addon is an optional dependency, npm install does not fail on unsupported hosts.
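If you would rather decide up front than catch the first-call error, a small guard can check the host before you touch the package. This is a hypothetical helper, not part of the package API; it only encodes the darwin-arm64 requirement stated above.

```typescript
// Hypothetical guard; the package itself reports an error on first call.
function isSupportedHost(platform: string, arch: string): boolean {
  // Currently supported per the Status section: darwin-arm64 only.
  return platform === "darwin" && arch === "arm64";
}

// In Node or Bun you would pass process.platform and process.arch.
console.log(isSupportedHost("darwin", "arm64")); // true
console.log(isSupportedHost("linux", "x64"));    // false
```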

Performance, measured

All numbers are on Apple Silicon (M-series, 24 GiB dev box), alpine:latest unless noted. The numbers include the napi-rs binding overhead (~10–20 µs per await from libuv worker scheduling). Per-call overhead is 0.3–0.5 % vs. direct Rust embedders — the floor is set by V8 / libuv / HVF, not by the binding.

| Operation | Cost | Notes |
|---|---|---|
| Cold bake (first run, Docker layer cache warm) | ~400 ms | Pulls layers → squashfs → boots VM → captures snapshot |
| Cache-hit bake | ~15 ms | Re-running Image.build for an already-baked image |
| First acquire() after build() (warm-handoff) | ~0 ms | Bake's living worker is stashed for the first acquire |
| acquire() on pre-spawned idle pool entry | ~0 ms | Just a queue pop |
| acquire() with cycle-restore | ~5 ms | HVF-bound; restoration from in-memory snapshot |
| acquire() spawn-from-disk (pool growing) | ~22 ms | Fork + dyld + HVF setup + mmap-restore |
| vm.exec({ argv: ["echo", "hi"] }) round-trip | ~0.3 ms | vsock RPC to the in-guest agent; no host fork |

Memory

Each idle VM costs ~3 MiB host phys_footprint (measured). The memoryMib build option is a guest-visible ceiling, not host commit — pages are committed lazily via CoW page-faults. Read-only state (kernel, init binaries, your image's rustc/python etc.) is shared across all VMs via the snapshot's mmap, so a pool of 50 VMs uses ~150 MiB total, not 50 × 256 MiB.

The hard limit on parallelism isn't VM count — it's concurrent active workloads. 50 idle VMs ≈ 150 MiB; 50 simultaneous rustc compiles ≈ 15+ GiB. Size your pool against expected active concurrency, not problem count.
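The sizing advice above can be turned into a quick estimate. The sketch uses the measured ~3 MiB idle figure; the 300 MiB per active VM is an assumption back-derived from the "50 rustc compiles ≈ 15+ GiB" example, so substitute a number measured for your own workload.

```typescript
// ~3 MiB per idle VM is the measured figure from this section.
const IDLE_MIB_PER_VM = 3;

// Rough host memory for a mix of idle and active VMs.
function estimateHostMib(idleVms: number, activeVms: number, activeMibPerVm: number): number {
  return idleVms * IDLE_MIB_PER_VM + activeVms * activeMibPerVm;
}

// Largest number of concurrently active VMs that fits in a host budget,
// after reserving the idle baseline for every pool slot (conservative).
function maxActiveVms(budgetMib: number, poolMax: number, activeMibPerVm: number): number {
  const spare = budgetMib - poolMax * IDLE_MIB_PER_VM;
  if (spare <= 0) return 0;
  return Math.min(poolMax, Math.floor(spare / activeMibPerVm));
}

console.log(estimateHostMib(50, 0, 300)); // 150
console.log(estimateHostMib(0, 50, 300)); // 15000
console.log(maxActiveVms(8192, 32, 300)); // 26
```

In this example a pool with max: 32 on an 8 GiB budget can hold all 32 VMs idle, but only about 26 of them can run a 300 MiB workload at once.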

Integration patterns

Pattern 1: One-shot exec (CI / verifier)

You have generated code, want to run it in isolation, get the result, throw the VM away.

import { Image } from "@supermachine/core";

const image = await Image.build({
  ref: "python:slim",
  memoryMib: 512,
});

const pool = await image.pool({ min: 0, max: 8, restoreOnRelease: true });

async function runCandidate(source: string): Promise<{ ok: boolean; stdout: string }> {
  const vm = await pool.acquire();
  try {
    const out = await vm.exec({
      argv: ["python3", "/tmp/code.py"],
      stageFiles: { "/tmp/code.py": Buffer.from(source) },
      timeoutMs: 10_000,
    });
    return { ok: out.exitCode === 0, stdout: out.stdout.toString() };
  } finally {
    await vm.release();
  }
}

const results = await Promise.all(
  candidates.map(runCandidate),  // queues through pool of 8
);

restoreOnRelease: true is the right default here — each candidate runs in clean state.
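Promise.all starts every candidate at once and lets the pool queue them. For large batches you may prefer to cap the number of in-flight calls yourself, so that queued acquire() calls are less likely to hit acquireTimeoutMs. The helper below is a generic sketch and not part of the package; the demo uses a stand-in task where Pattern 1 would pass runCandidate.

```typescript
// Run `fn` over `items` with at most `limit` calls in flight; results keep input order.
async function mapWithLimit<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T) => Promise<R>,
): Promise<R[]> {
  const results = new Array<R>(items.length);
  let next = 0;
  async function worker(): Promise<void> {
    while (next < items.length) {
      const i = next++; // claim an index; JS is single-threaded, so no race
      results[i] = await fn(items[i]);
    }
  }
  const workerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: workerCount }, worker));
  return results;
}

// Demo with a stand-in task that records peak concurrency.
let inFlight = 0;
let peak = 0;
const demo = mapWithLimit([1, 2, 3, 4, 5], 2, async (n) => {
  inFlight++;
  peak = Math.max(peak, inFlight);
  await new Promise((resolve) => setTimeout(resolve, 5));
  inFlight--;
  return n * 2;
});
demo.then((out) => console.log(out, "peak in flight:", peak));
```

With the pool above, the call would look like `await mapWithLimit(candidates, 8, runCandidate)`, matching the limit to the pool's max.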

Pattern 2: Agent runtime (per-task isolated sandbox)

import { Image } from "@supermachine/core";

const image = await Image.build({
  ref: "python:slim",
  memoryMib: 512,
  // Pre-warm: import heavy libs once at bake time, snapshot the
  // warm state. Subsequent acquires land on the post-import
  // snapshot — saves ~200 ms of import cost per task.
  warmupTag: "v1",                       // bump to invalidate cache
  warmup: async (vm) => {
    await vm.exec({
      argv: ["python3", "-c", "import numpy, pandas, requests"],
      timeoutMs: 60_000,
    });
  },
});

const pool = await image.pool({
  min: 4,                                // warm idle pool
  max: 32,                               // burst capacity
  restoreOnRelease: true,                // isolate per task
  acquireTimeoutMs: 30_000,
});

// In your request handler:
async function handleAgentTask(taskCode: string) {
  await using vm = await pool.acquire(); // Node 20+ disposer (Symbol.asyncDispose)
  return await vm.exec({
    argv: ["python3", "/tmp/task.py"],
    stageFiles: { "/tmp/task.py": Buffer.from(taskCode) },
    timeoutMs: 30_000,
  });
}                                        // auto-released

Pattern 3: Persistent service inside the VM

You want to bake an image where the snapshot includes a running listener (nginx, redis, your own daemon) so first restore has it serving immediately.

const image = await Image.build({
  ref: "nginx:alpine",
  listenerRequired: true,   // wait for workload to bind before snapshot
  // (no warmup needed — listener-ready trigger fires when nginx is up)
});

Without listenerRequired, the default .build() uses a pre-exec snapshot trigger for ~10× faster bake on slow-starting workloads — but the listener won't be bound at restore time. The right knob depends on whether your code talks to the workload's listener or runs its own vm.exec commands.

Pattern 4: Resource files baked into the snapshot

Sometimes you need files inside the guest that aren't part of the OCI image — a config, a cert, a binary. extraFiles stages them into the snapshot's filesystem layer:

const image = await Image.build({
  ref: "alpine:latest",
  extraFiles: [
    { hostPath: "./config/app.toml",    guestPath: "/etc/app.toml" },
    { hostPath: "./certs/server.pem",   guestPath: "/etc/server.pem" },
  ],
});

Host file contents are folded into the snapshot's input-hash, so editing the host file invalidates the cache and rebakes automatically.

Pattern 5: Live host directory mount (virtio-fs DAX)

For dev loops, code-mounted-from-host workflows, and "run my project" sandboxes, you want a live host directory inside the guest — edits on the host show up in the guest immediately, and the guest can read/write back. This is what mounts does: a virtio-fs share with DAX (zero-copy reads through the host page cache, kqueue-driven cache invalidation when host files change).

const image = await Image.build({
  ref: "node:20-alpine",
  mounts: [
    {
      hostPath: "/Users/me/my-app",   // host source tree
      guestTag: "workspace",          // guest mounts via this tag
      symlinks: "opaque",             // see below
    },
  ],
});

const vm = await (await image.pool()).acquire();
// Inside the guest:
//   mount -t virtiofs workspace /work
//   cd /work && npm install

The symlinks policy sets the trust posture for the mount. Workspace tools (npm workspaces, pnpm, yarn berry) symlink prolifically (node_modules/<member> → ../packages/<member>), so symlink creation has to work. But symlinks are also the classic escape vector if you're running untrusted code. The policy lets integrators pick:

| Value | Guest can create symlinks? | External host symlinks visible/traversable? | When |
|---|---|---|---|
| "deny" | no (EPERM) | no (EACCES at LOOKUP) | Paranoid mounts: pure file content, no metadata surprises |
| "opaque" (default) | yes; targets stored verbatim, host never resolves them | no | Safe multi-tenant default: npm/pnpm/yarn all work; a hostile guest can plant escape → /etc/passwd but the host never follows it |
| "follow" | yes | yes | Trusted single-tenant: dev trees that symlink into Homebrew/~/.cache/etc. |

The "opaque" default exists because of how symlink(2) works: the target is opaque bytes stored on the symlink inode — POSIX never validates it. Resolution happens in the guest's kernel against the guest's own namespace, so a target like /etc/passwd resolves to the guest's /etc/passwd, not the host's. The host serves readlink() (returns those bytes) and refuses to follow them. That's the security model: guest can store arbitrary bytes, host never canonicalizes guest-supplied paths.
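The policy table can be read as a small lookup. The sketch below transcribes it into data and adds a hypothetical chooser; neither is the library's internal representation, and the helper's parameter names are made up for illustration.

```typescript
type SymlinkPolicy = "deny" | "opaque" | "follow";

interface SymlinkBehavior {
  guestCanCreate: boolean;      // may the guest create symlinks on the mount?
  hostFollowsExternal: boolean; // are symlinks pointing outside the mount traversable?
}

// Transcribed from the policy table above.
const SYMLINK_BEHAVIOR: Record<SymlinkPolicy, SymlinkBehavior> = {
  deny:   { guestCanCreate: false, hostFollowsExternal: false },
  opaque: { guestCanCreate: true,  hostFollowsExternal: false },
  follow: { guestCanCreate: true,  hostFollowsExternal: true },
};

// Hypothetical helper: pick the tightest policy that still meets the need.
function choosePolicy(createsSymlinks: boolean, needsExternalLinks: boolean): SymlinkPolicy {
  if (needsExternalLinks) return "follow";
  return createsSymlinks ? "opaque" : "deny";
}

console.log(choosePolicy(true, false));                  // opaque
console.log(SYMLINK_BEHAVIOR.opaque.hostFollowsExternal); // false
```

An npm workspace running untrusted code lands on "opaque": symlink creation works, and the host still never resolves guest-supplied targets.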

A runnable example is in examples/06-mount-workspace.mjs.

Performance: don't put dependency caches on the mount

A mount is great for source code (host-editable, DAX-fast reads). It is the wrong place for node_modules, __pycache__, target/, .next/, or anything else that gets thousands of small files written by a build/install step.

Why: every guest file operation on a virtio-fs mount becomes a FUSE message round-trip over vsock. Each open, write, close, and fsync pays in-guest overhead plus host-side handling. For one big file (a tarball, an image), that overhead is invisible. For an npm install that creates ~10–30 k tiny files, the per-operation cost is multiplied across tens of thousands of files and several round-trips per file. We've measured the same npm install at 3–6 minutes on a virtio-fs mount vs 10–30 seconds on a virtio-blk volume backing node_modules. Same workload, same data, same VM; the difference is purely FUSE round-trip cost × small-file count.

Block devices (virtio-blk, exposed via volumes) bypass FUSE entirely — the guest's own filesystem driver writes to a backing file on the host with no per-syscall protocol overhead. Same speed as a "normal" disk inside the guest.
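As a back-of-envelope check on the measurements quoted in this section, you can compute the average cost per file operation that an install time implies. The sketch assumes four operations per file (one each of open, write, close, fsync) and attributes the whole wall time to file I/O, so treat the result as an upper bound rather than a measurement.

```typescript
// Average milliseconds per file operation implied by a total install time.
function impliedMsPerOp(totalSeconds: number, fileCount: number, opsPerFile: number): number {
  return (totalSeconds * 1000) / (fileCount * opsPerFile);
}

// Values picked from inside the quoted ranges: 20 k files, 4 min vs 20 s.
const fileCount = 20_000;
console.log(impliedMsPerOp(240, fileCount, 4).toFixed(2)); // 3.00 (virtio-fs mount)
console.log(impliedMsPerOp(20, fileCount, 4).toFixed(2));  // 0.25 (virtio-blk volume)
```

Under these assumptions the mount spends roughly twelve times longer per operation than the volume, which is the same ratio as the two wall-clock times.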

Pattern 6: npm / pnpm / cargo workflow (mount for source, volume for cache)

The shape that gets you both live host edits and fast installs:

const image = await Image.build({
  ref: "node:20-alpine",
  // Source tree — live host edits, DAX reads
  mounts: [{ hostPath: "/Users/me/my-app", guestTag: "src", symlinks: "opaque" }],
  // Dependency cache — fast writes, persists between bakes via the snapshot
  volumes: [
    { name: "node_modules", sizeMib: 1024, mountPoint: "/work/node_modules" },
  ],
  // One-time install at bake. Once snapshotted, every restore has
  // node_modules already populated — no install on the hot path.
  warmup: async (vm) => {
    // Mount the source tree; npm reads package.json straight from the mount.
    await vm.exec({ argv: ["mount", "-t", "virtiofs", "src", "/work"] });
    await vm.exec({ argv: ["sh", "-c", "cd /work && npm install"], timeoutMs: 600_000 });
  },
});

Three layers, each playing to their strengths:

mounts for source: read-only-ish on the hot path, DAX zero-copy, host edits show up immediately (kqueue invalidation).
volumes for write-heavy caches: native-speed writes, no FUSE overhead, persists across restores.
warmup + bake to do the install once: every restore reuses the baked node_modules; cold install is amortized across all future runs of this snapshot.

A runnable example is in examples/07-npm-workspace-fast.mjs.
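If a project has several write-heavy directories, the volumes array for Pattern 6 can be generated rather than written by hand. This is a sketch: the field names (name, sizeMib, mountPoint) come from the example above, while the helper, the 1024 MiB default, and the rule for deriving volume names are assumptions.

```typescript
interface VolumeSpec {
  name: string;
  sizeMib: number;
  mountPoint: string;
}

// Directory names called out earlier as poor fits for a virtio-fs mount.
const CACHE_DIRS = new Set(["node_modules", "__pycache__", "target", ".next"]);

// Build one volume entry per known cache directory the project uses.
function volumesFor(workRoot: string, projectDirs: string[], sizeMib = 1024): VolumeSpec[] {
  return projectDirs
    .filter((dir) => CACHE_DIRS.has(dir))
    .map((dir) => ({
      // Assumption: volume names drop leading dots (".next" becomes "next").
      name: dir.replace(/^\.+/, ""),
      sizeMib,
      mountPoint: `${workRoot}/${dir}`,
    }));
}

console.log(volumesFor("/work", ["node_modules", ".next", "src"]));
```

Directories that are not in the cache list, such as src, are left to the mount so host edits stay live.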

Full API

Auto-generated from the Rust source — see index.d.ts for the typed surface. Highlights below; runnable demos in examples/.

`Image`

class Image {
  static build(options: BuildOptions): Promise<Image>;
  static fromSnapshot(path: string): Promise<Image>;
  pool(options?: PoolOptions): Promise<Pool>;

  // Sync accessors — read from snapshot metadata, cached.
  readonly snapshotPath: string;
  readonly memoryMib: number;
  readonly vcpus: number;
}

BuildOptions:

| Field | Type | Default | Notes |
|---|---|---|---|
| ref | string | required | OCI ref: "alpine:latest", "ghcr.io/owner/img@sha256:..." |
| name | string | sha-derived | Stable snapshot name |
| memoryMib | number | 256 | Guest-visible ceiling (lazy-committed) |
| vcpus | number | 1 | vCPUs per VM |
| pullPolicy | "always" \| "missing" \| "never" | "missing" | docker-style pull policy |
| cmd | string[] | (image CMD) | Override the image's CMD / ENTRYPOINT |
| env | Record<string, string> | {} | Extra workload env vars |
| guestPort | number | 80 | Listener port (used with listenerRequired) |
| listenerRequired | boolean | false | Wait for workload's listener before snapshotting |
| extraFiles | ExtraFile[] | (none) | Stage host files into guest at fixed paths |
| mounts | MountSpec[] | (none) | Live host dir mounts via virtio-fs DAX. See Pattern 5. |
| warmupTag | string | "default" | Cache key for the warmup; bump to invalidate |
| warmup | (vm: WarmupVm) => Promise<void> | (none) | Post-bake state pre-population |
| snapshotsDir | string | ~/.local/supermachine-snapshots | Where to write the snapshot |

`Pool`

class Pool {
  acquire(): Promise<Vm>;
  stats(): PoolStats;          // sync, ~1 µs — safe to poll
  shutdown(): Promise<void>;
}

interface PoolStats {
  alive: number;     // total workers (idle + checked-out)
  inUse: number;     // checked out via acquire()
  idle: number;
  waiting: number;   // acquire() callers blocked on a free slot
  min: number;
  max: number;
}

PoolOptions:

| Field | Type | Default | Notes |
|---|---|---|---|
| min | number | 1 | Idle VMs to keep warm |
| max | number | 1 | Maximum total VMs |
| acquireTimeoutMs | number | 60 000 | Max wait when pool is saturated. Pass 0 for "block forever" (batch workloads). |
| idleTimeoutMs | number | (none) | Evict idle VMs above min after this |
| restoreOnRelease | boolean | true | Clean state per acquire (vs. dirty reuse) |
| memoryMib | number | (image default) | Runtime memory override (no re-bake) |
| vcpus | number | (image default) | Runtime vCPU override (no re-bake) |
| restoreTimeoutMs | number | 30 000 | Per-worker snapshot-restore deadline |
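Since stats() is cheap to poll, it can drive a simple back-pressure signal. The sketch below redeclares PoolStats so it stands alone and classifies a snapshot into three states; the state names and the helper are hypothetical, and the comments map each state to the acquire() paths listed in the performance table.

```typescript
interface PoolStats {
  alive: number;
  inUse: number;
  idle: number;
  waiting: number;
  min: number;
  max: number;
}

type PoolPressure = "idle-capacity" | "can-grow" | "saturated";

function poolPressure(s: PoolStats): PoolPressure {
  if (s.idle > 0) return "idle-capacity"; // acquire() is a queue pop
  if (s.alive < s.max) return "can-grow"; // acquire() spawns a new worker
  return "saturated";                     // acquire() waits, up to acquireTimeoutMs
}

const sample: PoolStats = { alive: 8, inUse: 8, idle: 0, waiting: 3, min: 0, max: 8 };
console.log(poolPressure(sample)); // saturated
```

With a real pool you would call `poolPressure(pool.stats())` and, for example, shed load or raise max when it stays "saturated".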

`Vm`

class Vm {
  // Execution
  exec(options: ExecOptions): Promise<ExecOutput>;
  spawn(options: ExecOptions): Promise<ExecChild>;   // streaming variant

  // File I/O (4 MiB cap each direction)
  writeFile(path: string, bytes: Buffer): Promise<void>;
  readFile(path: string): Promise<Buffer>;

  // Workload control
  workloadSignal(signum: number): Promise<void>;     // SIGTERM/SIGHUP/etc. to PID-1
  exposeTcp(hostPort: number, guestPort: number): Promise<TcpForwarder>;

  // Mid-flight snapshot
  snapshot(destDir: string): Promise<Image>;

  // Lifecycle
  release(): Promise<void>;
  dispose(): Promise<void>;        // alias; Symbol.asyncDispose-friendly

  // Escape hatches (raw socket paths for bring-your-own-client)
  readonly vsockPath: string;
  readonly execPath: string;
}

ExecOptions:

| Field | Type | Default | Notes |
|---|---|---|---|
| argv | string[] | required | argv[0] resolved via PATH inside the guest |
| stageFiles | Record<string, Buffer> | {} | Files to drop into the guest before exec (mode 0o644) |
| stageFilesMode | StagedFileMode[] | (none) | Same as stageFiles but with explicit per-file modes (e.g. 0o755 for scripts) |
| timeoutMs | number | 60 000 | Hard wall-clock kill (collect-mode exec only) |
| env | Record<string, string> | {} | Per-exec env (merged with image env) |
| cwd | string | / | Working directory inside the guest |
| chain | string[] | (none) | Run if first argv exits 0 (saves one round-trip) |
| tty | boolean | false | Allocate a pty; stdout/stderr merge onto stdout |
| winsize | {cols, rows} | (none) | Initial pty size (tty only). Resize later via ExecChild.resize() |

ExecOutput (collect-mode):

interface ExecOutput {
  stdout: Buffer;
  stderr: Buffer;
  exitCode: number;     // -1 if killed by signal / timeout
}
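Because exitCode doubles as a "killed" marker, callers usually want to separate the three outcomes. A minimal sketch, using a structural stand-in for ExecOutput; the verdict names are made up for illustration.

```typescript
// Only the field this helper needs; a real ExecOutput fits this shape.
interface ExitCodeCarrier {
  exitCode: number;
}

type ExecVerdict = "ok" | "failed" | "killed";

function classifyExit(out: ExitCodeCarrier): ExecVerdict {
  if (out.exitCode === 0) return "ok";
  if (out.exitCode === -1) return "killed"; // signal or timeout, per the interface comment
  return "failed";                          // the program ran and reported an error
}

console.log(classifyExit({ exitCode: 0 }));  // ok
console.log(classifyExit({ exitCode: 2 }));  // failed
console.log(classifyExit({ exitCode: -1 })); // killed
```

In a verifier this lets you report a timeout differently from a wrong answer instead of folding both into a non-zero exit.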

`ExecChild` (streaming, returned by `vm.spawn()`)

Pull-style bidirectional I/O. Use when you want incremental input/output (LLM-style streaming, interactive REPLs, long-running daemons) instead of waiting for the whole exec to finish.

class ExecChild {
  writeStdin(bytes: Buffer): Promise<void>;
  closeStdin(): Promise<void>;
  readStdout(maxBytes?: number): Promise<Buffer>;   // returns 0-length on EOF
  readStderr(maxBytes?: number): Promise<Buffer>;
  signal(signum: number): Promise<void>;            // e.g. 15=SIGTERM, 9=SIGKILL
  resize(cols: number, rows: number): Promise<void>;// tty resize; no-op in pipe mode
  wait(): Promise<ExecWaitResult>;
}

interface ExecWaitResult {
  exitCode: number;
  timedOut: boolean;
  peakRssKib: number | null;
}
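A typical consumer of the pull-style API is a read loop that stops on the 0-length EOF read. The sketch below uses a structural stand-in for ExecChild and a fake source so it runs without a VM; the 64 KiB chunk size and the helper names are arbitrary choices, not package defaults.

```typescript
// Structural stand-in for the part of ExecChild used here.
// Buffer is a Uint8Array subclass, so a real ExecChild fits this shape.
interface StdoutSource {
  readStdout(maxBytes?: number): Promise<Uint8Array>;
}

// Read until EOF, handing each chunk to `onChunk`; returns total bytes read.
async function drainStdout(
  child: StdoutSource,
  onChunk: (chunk: Uint8Array) => void,
): Promise<number> {
  let total = 0;
  for (;;) {
    const chunk = await child.readStdout(64 * 1024);
    if (chunk.length === 0) return total; // 0-length read signals EOF
    total += chunk.length;
    onChunk(chunk);
  }
}

// Fake source for demonstration: yields the given parts, then EOF.
function fakeChild(parts: string[]): StdoutSource {
  const queue = parts.map((p) => new TextEncoder().encode(p));
  return { readStdout: async () => queue.shift() ?? new Uint8Array(0) };
}

const decoder = new TextDecoder();
let text = "";
const drained = drainStdout(fakeChild(["hello ", "world\n"]), (c) => {
  text += decoder.decode(c, { stream: true });
});
drained.then((n) => console.log(n, JSON.stringify(text)));
```

With a real child from vm.spawn(), you would follow the loop with `await child.wait()` to collect the exit code.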

`TcpForwarder` (returned by `vm.exposeTcp()`)

Proxies 127.0.0.1:hostPort into the guest's vsock mux. In-flight connections survive stop().

class TcpForwarder {
  readonly localAddr: string;     // e.g. "127.0.0.1:54321"
  stop(): Promise<void>;          // idempotent
}
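For an HTTP workload such as the nginx image in Pattern 3, localAddr can be turned into a URL for a host-side client. The helper is a hypothetical convenience, assuming localAddr always has the "host:port" form shown in the comment above.

```typescript
// Turn a "host:port" localAddr into an http:// URL.
function forwarderUrl(localAddr: string, path = "/"): string {
  const sep = localAddr.lastIndexOf(":");
  if (sep <= 0 || sep === localAddr.length - 1) {
    throw new Error(`unexpected localAddr: ${localAddr}`);
  }
  const port = Number(localAddr.slice(sep + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`unexpected port in localAddr: ${localAddr}`);
  }
  const suffix = path.startsWith("/") ? path : `/${path}`;
  return `http://${localAddr}${suffix}`;
}

console.log(forwarderUrl("127.0.0.1:54321", "healthz")); // http://127.0.0.1:54321/healthz
```

With a live VM this would be used as `const fwd = await vm.exposeTcp(8080, 80)` followed by `fetch(forwarderUrl(fwd.localAddr))`, where 8080 is an arbitrary host port.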

`using` syntax (Node 20+)

If you have explicit-resource-management (TC39) enabled:

{
  await using vm = await pool.acquire();
  await vm.exec({ argv: ["echo", "auto-released"] });
}   // vm.release() runs at end of block

For older Node / Bun / Deno without Symbol.asyncDispose, use try/finally:

const vm = await pool.acquire();
try {
  await vm.exec({ argv: ["echo", "manual"] });
} finally {
  await vm.release();
}
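If you repeat the try/finally form often, it can be wrapped once. This helper is a sketch and not part of the package; it relies only on acquire() and release() as documented, and the demo uses a stand-in pool so it runs without a VM.

```typescript
interface Releasable {
  release(): Promise<void>;
}

interface Acquirer<V extends Releasable> {
  acquire(): Promise<V>;
}

// Acquire a VM, run `fn`, and always release, whether `fn` resolves or throws.
async function withVm<V extends Releasable, R>(
  pool: Acquirer<V>,
  fn: (vm: V) => Promise<R>,
): Promise<R> {
  const vm = await pool.acquire();
  try {
    return await fn(vm);
  } finally {
    await vm.release();
  }
}

// Stand-in pool for demonstration; a real Pool from image.pool() fits the same shape.
let released = 0;
const fakePool: Acquirer<Releasable & { id: number }> = {
  acquire: async () => ({ id: 1, release: async () => { released++; } }),
};
const outcome = withVm(fakePool, async (vm) => `ran on vm ${vm.id}`);
outcome.then((msg) => console.log(msg, "released:", released));
```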

Compatibility

| Runtime | Status |
|---|---|
| Node.js ≥ 18.17 | supported (LTS line, N-API 9) |
| Node.js ≥ 20 | supported + Symbol.asyncDispose available |
| Bun ≥ 1.0 | supported (same .node binary) |
| Deno ≥ 2 | supported (same .node binary, --unstable-node-api for older) |
| Browsers / Cloudflare Workers / Edge | not supported (no HVF available; see @supermachine/client roadmap) |

The native binary is a single .node file built with napi-rs; the N-API ABI is stable so runtime upgrades don't require rebuilds.

Troubleshooting

hv_vm_create: 0xfae94007 — worker binary lost its HVF entitlement (usually after a manual cargo build strip). The published npm binary is pre-signed; if you're running from source, run npm run build which re-signs.

registry unreachable — Docker registry DNS failed. Check your network. Pre-pull images you depend on with docker pull <image> to surface this earlier.

snapshot path not found — calling Image.fromSnapshot immediately after Image.build can race the background disk save. Either wait for the file to exist, or keep the Image handle from build() instead of reloading.

Pool seems to hang under load — likely cycle-restore-on-release contention. Set restoreOnRelease: false for throughput-style workloads if you don't need state isolation between acquires.

Memory grows over time — take your first sample after balloon convergence (~10 s); pre-balloon RSS is misleading. Use macOS footprint -p <pid> for accurate per-process memory accounting (RSS double-counts shared snapshot mmap pages).

Versioning

Lockstep with the supermachine Rust crate. Each release ships a matching @supermachine/core and platform binary (e.g. @supermachine/core-darwin-arm64) with the same version number. Pin in your package.json with an exact match if you bake snapshots and don't want the snapshot format to drift.
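A CI step can confirm that the dependency really is pinned exactly. The check below is a hypothetical helper: it accepts only a plain x.y.z version (optionally with a prerelease or build suffix) and rejects range operators.

```typescript
// True only for an exact "x.y.z" pin; ranges such as ^0.7.103 or ~0.7.103 return false.
function isExactPin(range: string): boolean {
  return /^\d+\.\d+\.\d+(?:[-+][0-9A-Za-z.+-]+)?$/.test(range.trim());
}

console.log(isExactPin("0.7.103"));  // true
console.log(isExactPin("^0.7.103")); // false
```

You would feed it the value of dependencies["@supermachine/core"] read from package.json.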

License

Apache-2.0

Source

GitHub: https://github.com/supercorp/supermachine
Rust crate: https://crates.io/crates/supermachine
