
@layerscale/layerscale

v0.3.0

TypeScript client for the LayerScale inference server.

Install

npm install @layerscale/layerscale

Get a license key

You need a LayerScale license key to authenticate. Grab a free one at layerscale.ai/get-license; it takes about ten seconds.
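Instead of passing apiKey explicitly, you can export the key as an environment variable (the LAYERSCALE_LICENSE_KEY name comes from the usage example below):

```shell
# Export once per shell; the client falls back to this when no apiKey
# option is passed to the constructor.
export LAYERSCALE_LICENSE_KEY="LS-..."
```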

Usage

import { LayerScale } from '@layerscale/layerscale';

const client = new LayerScale('http://localhost:8080', {
  apiKey: 'LS-...', // or set LAYERSCALE_LICENSE_KEY env var
});

Complete example

End-to-end: create a session, push a few OHLCV candles, wait for them to be processed, and ask the model about the data.

import { LayerScale } from '@layerscale/layerscale';

const client = new LayerScale('http://localhost:8080', {
  apiKey: process.env.LAYERSCALE_LICENSE_KEY,
});

async function main() {
  // 1. Create a session with a system prompt and freeze it in the cache
  //    so it is not reprocessed on every incoming data update.
  const session = await client.sessions.create({
    type: 'ohlcv',
    prompt:
      'You are a real-time market analyst. You receive live OHLCV candles and ' +
      'answer questions about market direction in a single word when possible.',
    context: 4096,
    markPrefix: true,
  });

  // 2. Push a few OHLCV candles and wait for them to be decoded into the cache.
  //    For a reactive alternative, subscribe to the `data_updated` event on
  //    client.sessions.events() or client.sessions.stream() instead of using `wait`.
  const candles = [
    { o: 150.20, h: 150.80, l: 150.10, c: 150.70, v: 1_200_000, timestamp: 1_733_000_000, sym: 'AAPL' },
    { o: 150.70, h: 151.10, l: 150.60, c: 150.95, v:   900_000, timestamp: 1_733_000_060, sym: 'AAPL' },
    { o: 150.95, h: 151.40, l: 150.90, c: 151.30, v: 1_500_000, timestamp: 1_733_000_120, sym: 'AAPL' },
    { o: 151.30, h: 151.80, l: 151.20, c: 151.75, v: 1_100_000, timestamp: 1_733_000_180, sym: 'AAPL' },
    { o: 151.75, h: 152.10, l: 151.60, c: 152.00, v: 1_300_000, timestamp: 1_733_000_240, sym: 'AAPL' },
  ];
  await client.sessions.push(session.session_id, candles, { wait: true });

  // 3. Query the model about the data we just ingested.
  const answer = await client.sessions.query(session.session_id, {
    prompt: 'Is the market trending up or down?',
    max_tokens: 8,
    fast_answer: true,
  });
  console.log('Answer:', answer.text.trim());

  await client.sessions.delete(session.session_id);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

OpenAI-compatible chat

const chat = await client.chat({
  messages: [{ role: 'user', content: 'Hello' }],
});

Anthropic-compatible messages

const msg = await client.message({
  messages: [{ role: 'user', content: 'Hello' }],
  max_tokens: 256,
});

Streaming

All streaming methods return async generators:

for await (const chunk of client.chatStream({ messages })) {
  process.stdout.write(chunk.choices[0]?.delta.content ?? '');
}

for await (const chunk of client.messageStream({ messages, max_tokens: 256 })) {
  if (chunk.type === 'content_block_delta') {
    process.stdout.write(chunk.delta.text);
  }
}

Sessions

const session = await client.sessions.create({
  type: 'ohlcv',
  prompt: 'You are a trading analyst.',
  flash: [
    { query: 'Is the trend bullish?' },
    { query: 'Is volume increasing?' },
  ],
  markPrefix: true,
});

const answer = await client.sessions.query(session.session_id, {
  prompt: 'What is the trend?',
  max_tokens: 256,
});

// Stream generation token-by-token
for await (const chunk of client.sessions.queryStream(session.session_id, { max_tokens: 256 })) {
  if (chunk.token) process.stdout.write(chunk.token);
}

Tool calling in sessions

Pass OpenAI-format messages and tools. The server applies the model's Jinja chat template, diffs against the session's cached tokens, and only decodes the delta. The tool_call_guide is cached per session, so repeated turns with the same tool set skip re-tokenizing tool names.

const tools = [{
  type: 'function' as const,
  function: {
    name: 'read_file',
    description: 'Read a file',
    parameters: {
      type: 'object',
      properties: { path: { type: 'string' } },
      required: ['path'],
    },
  },
}];

// Turn 1: ask the model to pick a tool.
const resp = await client.sessions.query(session.session_id, {
  messages: [
    { role: 'system', content: 'You are a coding agent.' },
    { role: 'user',   content: 'Read src/App.tsx' },
  ],
  tools,
  max_tokens: 256,
});

if (resp.tool_calls?.length) {
  const call = resp.tool_calls[0];
  console.log(`Tool: ${call.function.name}(${call.function.arguments})`);

  // Turn 2: feed the tool result back. Include the assistant's prior
  // tool_call so the template renders the matching tool_call_id header.
  const followUp = await client.sessions.query(session.session_id, {
    messages: [
      { role: 'system', content: 'You are a coding agent.' },
      { role: 'user',   content: 'Read src/App.tsx' },
      { role: 'assistant', content: null, tool_calls: [call] },
      { role: 'tool', tool_call_id: call.id, content: "import React from 'react'; ..." },
    ],
    tools,
    max_tokens: 256,
  });
  console.log(followUp.text);
}

Session management

const sessions = await client.sessions.list();
const state = await client.sessions.get(session.session_id);
await client.sessions.delete(session.session_id);

Continuous streaming

Push OHLCV data into the lock-free ring buffer for background processing:

await client.sessions.push(session.session_id, [
  { o: 150.5, h: 151.0, l: 150.0, c: 150.75, v: 1_000_000, timestamp: 1704067200, sym: 'AAPL' },
]);

const status = await client.sessions.streamStatus(session.session_id);
const stats  = await client.sessions.stats(session.session_id);

Flash queries

Register questions that are automatically evaluated in the background after each data update. Answers are cached for near-instant retrieval:

const registered = await client.sessions.flash(session.session_id, 'Is the trend bullish?');
// optional third argument caps the answer length in tokens (default 32):
await client.sessions.flash(session.session_id, 'Summarise the trend', 64);

const queries = await client.sessions.listFlash(session.session_id);
await client.sessions.unflash(session.session_id, registered.id);

Session events

Subscribe to real-time session events via SSE:

for await (const event of client.sessions.events(session.session_id)) {
  if (event.type === 'flash_ready') {
    console.log(event.query, event.value, event.confidence);
  } else if (event.type === 'data_updated') {
    console.log('new data version:', event.data_version);
  }
}

WebSocket streaming

A WebSocket combines data push and event subscription on one connection. It does not reconnect automatically, so wrap it in your own backoff loop for long-lived consumers:

const socket = client.sessions.stream(session.session_id);

socket.on('open', () => socket.push([candle]));
socket.on('flash_ready', (data) => console.log(data.query, data.value));
socket.on('data_updated', (data) => console.log('version:', data.data_version));
socket.on('error', (err) => console.error(err.message));
socket.on('close', () => {/* reconnect here if needed */});
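Since the socket does not reconnect on its own, a long-lived consumer can wrap it in a backoff loop. A minimal sketch, where backoffMs, connectWithBackoff, and the delay values are illustrative rather than part of the SDK:

```typescript
// `client` is the LayerScale client from the setup above; typed loosely
// here so the sketch stands alone.
declare const client: any;

// Illustrative exponential backoff: 1s, 2s, 4s, ... capped at 30s.
function backoffMs(attempt: number, baseMs = 1_000, capMs = 30_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Reconnect whenever the socket closes, resetting the attempt counter
// after each successful open.
function connectWithBackoff(sessionId: string, attempt = 0): void {
  const socket = client.sessions.stream(sessionId);
  socket.on('open', () => { attempt = 0; });
  socket.on('flash_ready', (data: any) => console.log(data.query, data.value));
  socket.on('error', (err: Error) => console.error(err.message));
  socket.on('close', () => {
    setTimeout(() => connectWithBackoff(sessionId, attempt + 1), backoffMs(attempt));
  });
}
```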

Cancellation and timeouts

Every request method accepts an optional { signal } for cancellation. Non-streaming requests also enforce a timeout (10 minutes by default; configure it via timeoutMs on the constructor, or pass 0 to disable):

const client = new LayerScale(baseUrl, {
  apiKey: '...',
  timeoutMs: 30_000, // 30s default for non-streaming calls
});

const ac = new AbortController();
setTimeout(() => ac.abort(), 5_000);
await client.chat({ messages }, { signal: ac.signal });

Error handling

import { LayerScaleError } from '@layerscale/layerscale';

try {
  await client.sessions.query(sessionId, { max_tokens: 256 });
} catch (err) {
  if (err instanceof LayerScaleError) {
    console.error(err.status, err.body.error.message);
  }
}

API

| Method | Endpoint | Description |
|--------|----------|-------------|
| health() | GET /v1/health | Check whether the server is ready to serve requests. |
| models() | GET /v1/models | List the models the server has loaded. |
| chat(params) | POST /v1/chat/completions | OpenAI-compatible chat completion (single response). |
| chatStream(params) | POST /v1/chat/completions (streaming) | OpenAI-compatible chat completion streamed as SSE chunks. |
| complete(params) | POST /v1/completions | OpenAI-compatible legacy text completion. |
| message(params) | POST /v1/messages | Anthropic-compatible messages API (single response). |
| messageStream(params) | POST /v1/messages (streaming) | Anthropic-compatible messages API streamed as SSE chunks. |
| sessions.create(params) | POST /v1/sessions/init | Create a streaming session with a system prompt and optional flash queries. |
| sessions.list() | GET /v1/sessions | List all active sessions on the server. |
| sessions.get(id) | GET /v1/sessions/:id/state | Inspect a session's state, token counts, and decoded context. |
| sessions.delete(id) | DELETE /v1/sessions/:id | Delete a session and free its GPU context. |
| sessions.append(id, text) | POST /v1/sessions/:id/append | Tokenize and append raw text to the session context. |
| sessions.query(id, params) | POST /v1/sessions/:id/generate | Run a generation against the session's current context. |
| sessions.queryStream(id, params) | POST /v1/sessions/:id/generate (streaming) | Stream a generation token-by-token as SSE chunks. |
| sessions.markPrefix(id) | POST /v1/sessions/:id/mark_prefix | Freeze the current position as a reusable cache prefix. |
| sessions.push(id, data, { wait? }) | POST /v1/sessions/:id/stream/push | Push streaming data (OHLCV, IoT, vitals, etc.) into the session's ring buffer. |
| sessions.streamStatus(id) | GET /v1/sessions/:id/stream/status | Get processor status, queue depth, and ingestion metrics. |
| sessions.stats(id) | GET /v1/sessions/:id/stats | Get computed statistics over the session's ingested data. |
| sessions.flash(id, query, maxTokens?) | POST /v1/sessions/:id/flash | Register a flash query evaluated automatically after each data update. |
| sessions.listFlash(id) | GET /v1/sessions/:id/flash | List the session's flash queries and their cached answers. |
| sessions.unflash(id, queryId) | DELETE /v1/sessions/:id/flash/:queryId | Remove a flash query from the session. |
| sessions.events(id) | GET /v1/sessions/:id/events (SSE) | Subscribe to session events (flash_ready, data_updated) over SSE. |
| sessions.stream(id) | WS /v1/sessions/:id/ws | Open a WebSocket for bidirectional data push and event delivery. |
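sessions.append and sessions.markPrefix appear in the table but not in the examples above. A hedged sketch of using them together to build a reusable cache prefix (primeSession is illustrative, not part of the SDK):

```typescript
// `client` is the LayerScale client from the setup above; typed minimally
// so the sketch stands alone.
declare const client: {
  sessions: {
    append(id: string, text: string): Promise<unknown>;
    markPrefix(id: string): Promise<unknown>;
  };
};

// Append a preamble to the session context, then freeze the current
// position as a reusable cache prefix so later queries skip re-decoding it.
async function primeSession(sessionId: string, preamble: string): Promise<void> {
  await client.sessions.append(sessionId, preamble);
  await client.sessions.markPrefix(sessionId);
}
```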