inferis-svelte

v1.0.0

Published

2 months ago

Svelte stores and context for inferis-ml

Downloads

0High
0Medium
0Low

pashunechka

inferis svelte stores ml ai webgpu wasm

inferis-svelte

Svelte stores and context for inferis-ml -- run AI models directly in the browser with WebGPU/WASM.

Install

npm install inferis-svelte inferis-ml

Quick Start

<!-- App.svelte (parent) -->
<script>
import { createInferis } from 'inferis-svelte';
import { webLlmAdapter } from 'inferis-ml/adapters/web-llm';

createInferis({ adapter: webLlmAdapter() });
</script>

<slot />

<!-- Chat.svelte (child) -->
<script>
import { useModel, useStream } from 'inferis-svelte';

const { model, state, progress } = useModel('text-generation', {
  model: 'Llama-3.2-1B-Instruct-q4f16_1-MLC',
  autoLoad: true,
});

const { text, isStreaming, start, stop } = useStream(model);

function generate() {
  start({ prompt: 'Explain quantum computing in 3 sentences' });
}
</script>

{#if $state === 'loading'}
  <p>Loading model... {$progress ? Math.round(($progress.loaded / ($progress.total || 1)) * 100) : 0}%</p>
{:else if $state === 'error'}
  <p>Failed to load model</p>
{:else}
  <button on:click={generate}>Generate</button>
  {#if $isStreaming}
    <button on:click={stop}>Stop</button>
  {/if}
  <p>{$text}</p>
{/if}

Setup

Call createInferis() in a parent component's <script> block. All child components can then use any store function.

<script>
import { createInferis } from 'inferis-svelte';
import { webLlmAdapter } from 'inferis-ml/adapters/web-llm';

createInferis({
  adapter: webLlmAdapter(),
  poolConfig: { maxMemoryMB: 4096, maxWorkers: 2 },
});
</script>

The pool is automatically terminated when the component is destroyed.

API Reference

`createInferis(options)`

Initialize the inferis context. Must be called during component initialization (top-level <script>). Sets up the worker pool, detects capabilities, and provides context to descendants.

| Option | Type | Description | |--------|------|-------------| | adapter | ModelAdapterFactory | Required. Adapter from inferis-ml/adapters/* | | poolConfig | Partial<PoolConfig> | Optional pool settings (memory limit, workers, device, etc.) |

`getInferis()`

Raw access to the inferis context. Must be called within a descendant of a component that called createInferis().

Returns InferisContext with stores: pool, capabilities, isReady, error.

`useInferis()`

Alias for getInferis(). Returns the full context.

const { pool, isReady } = useInferis();

| Field | Type | Description | |-------|------|-------------| | pool | Writable<WorkerPool \| null> | Pool instance, null while initializing | | isReady | Readable<boolean> | true when pool is created |

`useCapabilities()`

Device capability detection (WebGPU, WASM SIMD, SharedWorker, etc.).

<script>
import { useCapabilities } from 'inferis-svelte';

const { capabilities, isLoading } = useCapabilities();
</script>

{#if $isLoading}
  <p>Detecting...</p>
{:else if $capabilities?.webgpu.supported}
  <p>GPU: {$capabilities.webgpu.adapter?.vendor}</p>
{/if}

| Field | Type | Description | |-------|------|-------------| | capabilities | Writable<CapabilityReport \| null> | Detection result | | isLoading | Readable<boolean> | true while detecting |

`useModel(task, config)`

Load and manage a model lifecycle. Auto-disposes on component destruction.

const { model, state, progress, error, load, dispose } = useModel('text-generation', {
  model: 'Llama-3.2-1B-Instruct-q4f16_1-MLC',
  autoLoad: true,
});

| Config | Type | Default | Description | |--------|------|---------|-------------| | model | string | -- | Model ID (HuggingFace ID, URL, etc.) | | autoLoad | boolean | false | Load model when pool is ready | | estimatedMemoryMB | number | -- | Memory hint for budget pre-eviction |

| Return | Type | Description | |--------|------|-------------| | model | Writable<ModelHandle \| null> | Model handle for inference | | state | Writable<ModelState \| 'pending'> | Current lifecycle state | | progress | Writable<LoadProgressEvent \| null> | Download/load progress | | error | Writable<Error \| null> | Load error | | load() | () => Promise<void> | Manually trigger loading | | dispose() | () => Promise<void> | Unload model and free memory |

`useInference<T>(model)`

Single (non-streaming) inference request.

const { result, error, isLoading, run, reset } = useInference(model);

const output = await run({ text: 'This movie is great!' });

| Return | Type | Description | |--------|------|-------------| | result | Writable<T \| null> | Last inference result | | error | Writable<Error \| null> | Last error | | isLoading | Writable<boolean> | Request in flight | | run(input, options?) | (input, opts?) => Promise<T> | Execute inference | | reset() | () => void | Clear result and error |

Supports AbortSignal via options:

const controller = new AbortController();
run(input, { signal: controller.signal });
controller.abort();

`useStream<T>(model)`

Streaming inference with chunk accumulation.

const { chunks, text, isStreaming, start, stop, reset } = useStream(model);

start({ prompt: 'Explain quantum computing' });

| Return | Type | Description | |--------|------|-------------| | chunks | Writable<T[]> | All received chunks | | text | Writable<string> | Accumulated text (for string chunks) | | isStreaming | Writable<boolean> | Stream active | | error | Writable<Error \| null> | Stream error | | start(input, options?) | (input, opts?) => void | Start streaming | | stop() | () => void | Abort stream | | reset() | () => void | Clear chunks/text, stop if active |

`useMemoryBudget(intervalMs?)`

Monitor memory usage across loaded models. Polls at the given interval (default 1000ms).

const { totalMB, allocatedMB, availableMB } = useMemoryBudget();

Adapters

inferis-ml ships three adapters. Pass any of them to createInferis:

import { webLlmAdapter } from 'inferis-ml/adapters/web-llm';
import { transformersAdapter } from 'inferis-ml/adapters/transformers';
import { onnxAdapter } from 'inferis-ml/adapters/onnx';

Examples

Image Classification

<script>
import { useModel, useInference } from 'inferis-svelte';

const { model } = useModel('image-classification', {
  model: 'google/vit-base-patch16-224',
  autoLoad: true,
});

const { result, isLoading, run } = useInference(model);

async function classify(event) {
  const file = event.target?.files?.[0];
  if (file) await run(file);
}
</script>

<input type="file" accept="image/*" on:change={classify} />
{#if $isLoading}
  <p>Classifying...</p>
{/if}
{#if $result}
  {#each $result as r}
    <p>{r.label}: {(r.score * 100).toFixed(1)}%</p>
  {/each}
{/if}

Capability Gate

<script>
import { useCapabilities } from 'inferis-svelte';

const { capabilities, isLoading } = useCapabilities();
</script>

{#if $isLoading}
  <p>Detecting device capabilities...</p>
{:else if !$capabilities?.webgpu.supported && !$capabilities?.wasm.supported}
  <p>Your browser does not support WebGPU or WASM.</p>
{:else}
  <slot />
{/if}

Memory Monitor

<script>
import { useMemoryBudget } from 'inferis-svelte';

const { totalMB, allocatedMB } = useMemoryBudget();
$: pct = $totalMB ? Math.round(($allocatedMB / $totalMB) * 100) : 0;
</script>

<div>
  <div style="width: {pct}%; height: 4px; background: {pct > 80 ? '#ef4444' : '#22c55e'}" />
  <p>{$allocatedMB} / {$totalMB} MB</p>
</div>

SvelteKit

ML inference runs in the browser via Web Workers -- there is no server-side model loading. The pool creation is already SSR-safe (typeof window guard), but composables should only be used in browser-rendered components.

<!-- +page.svelte -->
<script>
import { browser } from '$app/environment';
import { createInferis } from 'inferis-svelte';
import { webLlmAdapter } from 'inferis-ml/adapters/web-llm';

if (browser) {
  createInferis({ adapter: webLlmAdapter() });
}
</script>

Or wrap with {#if browser} to conditionally render child components that use inferis stores.

Requirements

Svelte 4+
inferis-ml 1.0+
Browser with WebGPU or WASM support

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

inferis-svelte

Install

Quick Start

Setup

API Reference

createInferis(options)

getInferis()

useInferis()

useCapabilities()

useModel(task, config)

useInference<T>(model)

useStream<T>(model)

useMemoryBudget(intervalMs?)

Adapters

Examples

Image Classification

Capability Gate

Memory Monitor

SvelteKit

Requirements

License

`createInferis(options)`

`getInferis()`

`useInferis()`

`useCapabilities()`

`useModel(task, config)`

`useInference<T>(model)`

`useStream<T>(model)`

`useMemoryBudget(intervalMs?)`