LocalLm types
Shared data types for the LocalLm api
:books: Api doc
Install
```bash
npm install @locallm/types
```

Overview
This package provides TypeScript interfaces and type definitions for the LocalLm library. These types define the structure of model configurations, inference parameters, provider interfaces, and other core components used throughout the LocalLm ecosystem.
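For example, the types can be imported directly into application code (a minimal sketch using only types documented below; the model name is illustrative):

```ts
import { ModelConf, InferenceParams } from '@locallm/types';

// Describe a model and a set of sampling parameters with full type checking.
const model: ModelConf = { name: 'qwen3:4b', ctx: 8192 };
const params: InferenceParams = { model, temperature: 0.7, max_tokens: 256 };
```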
Types Documentation
Model Configuration
ModelConf
Represents the configuration of a model.
```ts
interface ModelConf {
  name: string;
  ctx?: number;
  info?: { size: string, quant: string };
  extra?: Record<string, any>;
}
```

- name: `string` - The unique name of the model.
- ctx: `number | undefined` - The context window length, typically used to define how much of the previous data to consider.
- info: `{ size: string, quant: string } | undefined` - Some meta info about the model: parameter size and quantization level.
- extra: `Record<string, any> | undefined` - Extra parameters like urls for browser models.
Example:
```json
{
  "name": "gpt-3",
  "ctx": 2048,
  "info": { "size": "175B", "quant": "q4_0" },
  "extra": { "url": "http://example.com/model" }
}
```

ModelTemplate
Template information for a model.
```ts
interface ModelTemplate {
  name: string;
  ctx: number;
}
```

- name: `string` - The name of the template.
- ctx: `number` - The context window size for the model.
Example:
```json
{
  "name": "chatml",
  "ctx": 2048
}
```

ModelState
Represents the state of the models available on the server.
```ts
interface ModelState {
  models: Record<string, ModelTemplate>;
  isModelLoaded: boolean;
  loadedModel: string;
  ctx: number;
}
```

- models: `Record<string, ModelTemplate>` - The models info object (name, template name, context window size).
- isModelLoaded: `boolean` - Indicates whether a model is loaded or not.
- loadedModel: `string` - The loaded model name, empty if no model is loaded.
- ctx: `number` - The loaded model context window size, 0 if no model is loaded.
Example:
```json
{
  "models": { "gpt3": { "name": "gpt-3", "ctx": 2048 } },
  "isModelLoaded": true,
  "loadedModel": "gpt-3",
  "ctx": 2048
}
```
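As a hedged illustration, a client could use ModelState to decide whether a model still needs to be loaded. The `fetchState` function below is hypothetical and stands in for whatever call returns the server's state; only the ModelState shape comes from this package:

```ts
import { ModelState } from '@locallm/types';

// Hypothetical helper: decide whether the wanted model is already loaded.
async function ensureModelLoaded(fetchState: () => Promise<ModelState>, wanted: string) {
  const state = await fetchState();
  if (state.isModelLoaded && state.loadedModel === wanted) {
    console.log(`${wanted} is already loaded with a ${state.ctx} tokens context window`);
    return;
  }
  const template = state.models[wanted];
  console.log(`Need to load ${wanted} (template: ${template?.name ?? 'unknown'})`);
}
```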
Inference

InferenceParams
Describes the parameters for making an inference request.
```ts
interface InferenceParams {
  stream?: boolean;
  model?: ModelConf;
  template?: string;
  max_tokens?: number;
  top_k?: number;
  top_p?: number;
  min_p?: number;
  temperature?: number;
  repeat_penalty?: number;
  tfs?: number;
  stop?: Array<string>;
  grammar?: string;
  tsGrammar?: string;
  schema?: Record<string, any>;
  images?: Array<string>;
  extra?: Record<string, any>;
}
```

- stream: `boolean | undefined` - Indicates if results should be streamed progressively.
- model: `ModelConf | undefined` - The model configuration details for inference.
- template: `string | undefined` - The template to use, for backends that support it.
- max_tokens: `number | undefined` - The maximum number of tokens to generate.
- top_k: `number | undefined` - Limits the result set to the top K results.
- top_p: `number | undefined` - Filters results based on cumulative probability.
- min_p: `number | undefined` - The minimum probability for a token to be considered, relative to the probability of the most likely token.
- temperature: `number | undefined` - Adjusts randomness in sampling; higher values mean more randomness.
- repeat_penalty: `number | undefined` - Adjusts the penalty for repeated tokens.
- tfs: `number | undefined` - Sets the tail free sampling value.
- stop: `Array<string> | undefined` - List of stop words or phrases to halt predictions.
- grammar: `string | undefined` - The GBNF grammar to use for grammar-based sampling.
- tsGrammar: `string | undefined` - A TypeScript interface to be converted to a GBNF grammar for grammar-based sampling.
- schema: `Record<string, any> | undefined` - A JSON schema used to format the output.
- images: `Array<string> | undefined` - Base64 image data (for multimodal models).
- extra: `Record<string, any> | undefined` - Extra parameters to include in the payload.
Example:
```json
{
  "stream": true,
  "model": { "name": "qwen3:4b", "ctx": 8192 },
  "template": "chatml",
  "max_tokens": 150,
  "top_k": 50,
  "top_p": 0.9,
  "min_p": 0.01,
  "temperature": 0.7,
  "repeat_penalty": 1.2,
  "tfs": 0.8,
  "stop": ["###"],
  "grammar": "default_grammar",
  "images": ["data:image/png;base64,..."]
}
```
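The grammar-related fields combine with the other sampling options. The sketch below is an assumption about typical usage rather than something taken from the LocalLm docs: it builds an InferenceParams object that asks for output constrained by a JSON schema (the schema content itself is illustrative):

```ts
import { InferenceParams } from '@locallm/types';

// Constrain the output with a JSON schema and keep sampling conservative.
const extractionParams: InferenceParams = {
  stream: false,
  temperature: 0.1,
  max_tokens: 512,
  schema: {
    type: 'object',
    properties: {
      city: { type: 'string' },
      temperature: { type: 'number' },
    },
    required: ['city', 'temperature'],
  },
  stop: ['</s>'],
};
```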
InferenceResult

Represents the result returned after an inference request.
```ts
interface InferenceResult {
  text: string;
  stats: InferenceStats;
  serverStats: Record<string, any>;
  toolCalls?: Array<ToolCallSpec>;
}
```

- text: `string` - The textual representation of the generated inference.
- stats: `InferenceStats` - Additional statistics or metadata related to the inference.
- serverStats: `Record<string, any>` - Additional server-related statistics.
- toolCalls: `Array<ToolCallSpec> | undefined` - Tool calls made during inference.
Example:
```json
{
  "text": "The quick brown fox jumps over the lazy dog.",
  "stats": {
    "ingestionTime": 150,
    "inferenceTime": 300,
    "totalTime": 450,
    "ingestionTimeSeconds": 0.15,
    "inferenceTimeSeconds": 0.3,
    "totalTimeSeconds": 0.45,
    "totalTokens": 200,
    "tokensPerSecond": 444
  },
  "serverStats": { "someServerKey": "someServerValue" },
  "toolCalls": [{ "id": "1", "name": "getWeather", "arguments": { "location": "New York" } }]
}
```

InferenceStats
Represents the statistics of an inference.
```ts
interface InferenceStats extends IngestionStats {
  totalTime: number;
  inferenceTime: number;
  inferenceTimeSeconds: number;
  totalTimeSeconds: number;
  totalTokens: number;
  tokensPerSecond: number;
}
```

- ingestionTime: `number` - The time taken to ingest the input data in milliseconds.
- inferenceTime: `number` - The time taken to perform the inference in milliseconds.
- totalTime: `number` - The total time taken to perform the inference in milliseconds.
- ingestionTimeSeconds: `number` - The time taken to ingest the input data in seconds.
- inferenceTimeSeconds: `number` - The time taken to perform the inference in seconds.
- totalTimeSeconds: `number` - The total time taken to perform the inference in seconds.
- totalTokens: `number` - The total number of tokens processed.
- tokensPerSecond: `number` - The number of tokens processed per second.
Example:
```json
{
  "ingestionTime": 150,
  "inferenceTime": 300,
  "totalTime": 450,
  "ingestionTimeSeconds": 0.15,
  "inferenceTimeSeconds": 0.3,
  "totalTimeSeconds": 0.45,
  "totalTokens": 200,
  "tokensPerSecond": 444
}
```
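The millisecond and second fields describe the same measurements in two units, and tokensPerSecond follows from totalTokens and the total time. The sketch below shows how a complete InferenceStats object could be derived from raw timings; it is illustrative only and not the library's actual implementation:

```ts
import { InferenceStats } from '@locallm/types';

// Illustrative derivation: build InferenceStats from raw millisecond timings.
function buildStats(ingestionTime: number, inferenceTime: number, totalTokens: number): InferenceStats {
  const totalTime = ingestionTime + inferenceTime;
  return {
    ingestionTime,
    inferenceTime,
    totalTime,
    ingestionTimeSeconds: ingestionTime / 1000,
    inferenceTimeSeconds: inferenceTime / 1000,
    totalTimeSeconds: totalTime / 1000,
    totalTokens,
    tokensPerSecond: Math.round(totalTokens / (totalTime / 1000)),
  };
}

// buildStats(150, 300, 200) yields tokensPerSecond ≈ 444, matching the example above.
```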
IngestionStats

Represents the prompt ingestion timing statistics for an inference.
```ts
interface IngestionStats {
  ingestionTime: number;
  ingestionTimeSeconds: number;
}
```

- ingestionTime: `number` - The time taken to ingest the input data in milliseconds.
- ingestionTimeSeconds: `number` - The time taken to ingest the input data in seconds.
Example:
```json
{
  "ingestionTime": 150,
  "ingestionTimeSeconds": 0.15
}
```

InferenceOptions
Options for inference requests.
```ts
interface InferenceOptions {
  debug?: boolean;
  verbose?: boolean;
  tools?: Array<ToolSpec>;
  history?: Array<HistoryTurn>;
  system?: string;
  assistant?: string;
}
```

- debug: `boolean | undefined` - Enable debug mode for detailed logging.
- verbose: `boolean | undefined` - Enable verbose output.
- tools: `Array<ToolSpec> | undefined` - Array of available tools for the conversation.
- history: `Array<HistoryTurn> | undefined` - Conversation history to include in the inference.
- system: `string | undefined` - System message to set the context for the conversation.
- assistant: `string | undefined` - Assistant message to include in the context.
Example:
```js
{
  "debug": true,
  "tools": [weatherTool],
  "history": [
    { "user": "Hello", "assistant": "Hi there!" }
  ],
  "system": "You are a helpful assistant."
}
```

Conversation and Tools
HistoryTurn
Represents a single turn in a conversation history.
```ts
interface HistoryTurn {
  user?: string;
  assistant?: string;
  tools?: { calls: Array<ToolCallSpec>, results: Array<{ id: string, content: string }> };
}
```

- user: `string | undefined` - The user's message in this turn.
- assistant: `string | undefined` - The assistant's response in this turn.
- tools: `{ calls: Array<ToolCallSpec>, results: Array<{ id: string, content: string }> } | undefined` - Tool calls and results for this turn.
Example:
```json
{
  "user": "What's the weather like?",
  "assistant": "The weather is sunny with a temperature of 72°F.",
  "tools": {
    "calls": [{ "id": "1", "name": "getWeather", "arguments": { "location": "New York" } }],
    "results": [{ "id": "1", "content": "Sunny, 72°F" }]
  }
}
```
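In TypeScript, the same turn can be built directly from the types in this package (a sketch mirroring the JSON example above):

```ts
import { HistoryTurn, ToolCallSpec } from '@locallm/types';

// A tool call and its result recorded inside a single conversation turn.
const call: ToolCallSpec = { id: '1', name: 'getWeather', arguments: { location: 'New York' } };

const turn: HistoryTurn = {
  user: "What's the weather like?",
  assistant: 'The weather is sunny with a temperature of 72°F.',
  tools: {
    calls: [call],
    results: [{ id: '1', content: 'Sunny, 72°F' }],
  },
};
```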
ToolCallSpec

Represents a tool call specification.
```ts
interface ToolCallSpec {
  id?: string;
  name: string;
  arguments?: {
    [key: string]: string;
  };
}
```

- id: `string | undefined` - The unique identifier for the tool call.
- name: `string` - The name of the tool being called.
- arguments: `Record<string, string> | undefined` - The arguments to pass to the tool.
Example:
```json
{
  "id": "1",
  "name": "getWeather",
  "arguments": { "location": "New York" }
}
```

ToolDefSpec
Specification for a tool that can be used within the conversation.
```ts
interface ToolDefSpec {
  name: string;
  description: string;
  arguments: {
    [key: string]: {
      description: string;
      type?: string;
      required?: boolean;
    };
  };
}
```

- name: `string` - The name of the tool.
- description: `string` - A description of what the tool does.
- arguments: `Record<string, { description: string, type?: string, required?: boolean }>` - Arguments required by the tool, with descriptions for each argument.
Example:
```json
{
  "name": "WeatherFetcher",
  "description": "Fetches weather information.",
  "arguments": {
    "location": {
      "description": "The location for which to fetch the weather.",
      "required": true
    }
  }
}
```

ToolSpec
Represents a tool specification with an execute function.
```ts
interface ToolSpec extends ToolDefSpec {
  execute: <O = any>(args: { [key: string]: string; } | undefined) => Promise<O>;
}
```

- execute: `(args: Record<string, string> | undefined) => Promise<any>` - The function to execute the tool with the provided arguments.
Example:
```ts
const toolSpec: ToolSpec = {
  name: "WeatherFetcher",
  description: "Fetches weather information.",
  arguments: {
    location: {
      description: "The location for which to fetch the weather.",
      required: true
    }
  },
  execute: async (args) => {
    const { location } = args || {};
    return `Weather in ${location}: Sunny, 72°F`;
  }
};
```
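A ToolCallSpec returned by the model can then be dispatched to the matching ToolSpec. The wiring below is a hedged sketch of how an application might do this; it is not an API provided by this package:

```ts
import { ToolCallSpec, ToolSpec } from '@locallm/types';

// Illustrative dispatcher: look up the tool by name and run its execute
// function with the arguments from the tool call.
async function runToolCall(tools: Array<ToolSpec>, call: ToolCallSpec): Promise<string> {
  const tool = tools.find((t) => t.name === call.name);
  if (!tool) {
    throw new Error(`Unknown tool: ${call.name}`);
  }
  const result = await tool.execute(call.arguments);
  return typeof result === 'string' ? result : JSON.stringify(result);
}
```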
Language Model Providers

LmProvider
Defines the structure and behavior of an LM Provider.
```ts
interface LmProvider {
  name: string;
  api: ReturnType<typeof useApi>;
  serverUrl: string;
  apiKey: string;
  model: ModelConf;
  models: Array<ModelConf>;
  info: () => Promise<Record<string, any>>;
  modelsInfo: () => Promise<void>;
  loadModel: (name: string, ctx?: number, urls?: string | string[], onLoadProgress?: OnLoadProgress) => Promise<void>;
  infer: (prompt: string, params: InferenceParams, options?: InferenceOptions) => Promise<InferenceResult>;
  abort: () => Promise<void>;
  onToken?: (t: string) => void;
  onStartEmit?: (data: IngestionStats) => void;
  onEndEmit?: (result: InferenceResult) => void;
  onError?: (err: string) => void;
  defaults?: LmDefaults;
}
```

- name: `string` - Identifier for the LM provider.
- api: `ReturnType<typeof useApi>` - API utility being used.
- serverUrl: `string` - The URL endpoint for the provider's server.
- apiKey: `string` - The key used for authentication with the provider's API.
- model: `ModelConf` - Active model configuration.
- models: `Array<ModelConf>` - List of available model configurations.
- info: `() => Promise<Record<string, any>>` - Retrieves information about available server config.
- modelsInfo: `() => Promise<void>` - Retrieves information about available models.
- loadModel: `(name: string, ctx?: number, urls?: string | string[], onLoadProgress?: OnLoadProgress) => Promise<void>` - Loads a model by name, with optional context.
- infer: `(prompt: string, params: InferenceParams, options?: InferenceOptions) => Promise<InferenceResult>` - Makes an inference based on the provided prompt and parameters.
- abort: `() => Promise<void>` - Aborts a currently running inference task.
- onToken: `(t: string) => void | undefined` - Callback when a new token is received.
- onStartEmit: `(data: IngestionStats) => void | undefined` - Callback triggered when inference starts.
- onEndEmit: `(result: InferenceResult) => void | undefined` - Callback triggered when inference ends.
- onError: `(err: string) => void | undefined` - Callback triggered on errors during inference.
- defaults: `LmDefaults | undefined` - Default settings for this provider.
Example:
```js
{
  "name": "koboldcpp",
  "api": { useApi: () => {} },
  "serverUrl": "https://myserver.com/api",
  "apiKey": "your-api-key",
  "model": { "name": "gpt-3", "ctx": 2048 },
  "models": [{ "name": "gpt-3", "ctx": 2048 }],
  "info": async () => ({ "config": "some-config" }),
  "modelsInfo": async () => {},
  "loadModel": async (name, ctx, urls, onLoadProgress) => {},
  "infer": async (prompt, params, options) => ({
    "text": "result",
    "stats": {
      "ingestionTime": 0,
      "ingestionTimeSeconds": 0,
      "inferenceTime": 0,
      "inferenceTimeSeconds": 0,
      "totalTime": 0,
      "totalTimeSeconds": 0,
      "totalTokens": 0,
      "tokensPerSecond": 0
    },
    "serverStats": {}
  }),
  "abort": async () => {},
  "onToken": (t) => console.log(t),
  "onStartEmit": (data) => console.log(data),
  "onEndEmit": (result) => console.log(result),
  "onError": (err) => console.error(err)
}
```

LmProviderParams
Parameters required when creating a new LM provider instance.
```ts
interface LmProviderParams {
  name: string;
  serverUrl: string;
  apiKey?: string;
  onToken?: (t: string) => void;
  onStartEmit?: (data: IngestionStats) => void;
  onEndEmit?: (result: InferenceResult) => void;
  onError?: (err: string) => void;
  defaults?: LmDefaults;
}
```

- name: `string` - Identifier for the LM provider.
- serverUrl: `string` - The URL endpoint for the provider's server.
- apiKey: `string | undefined` - The key used for authentication.
- onToken: `(t: string) => void | undefined` - Callback when a new token is received.
- onStartEmit: `(data: IngestionStats) => void | undefined` - Callback triggered when inference starts.
- onEndEmit: `(result: InferenceResult) => void | undefined` - Callback triggered when inference ends.
- onError: `(err: string) => void | undefined` - Callback triggered on errors.
- defaults: `LmDefaults | undefined` - Default settings.
Example:
```js
{
  "name": "koboldcpp",
  "serverUrl": "http://example.com/api",
  "apiKey": "your-api-key",
  "onToken": (t) => console.log(t),
  "onStartEmit": (data) => console.log(data),
  "onEndEmit": (result) => console.log(result),
  "onError": (err) => console.error(err)
}
```

LmParams
Parameters for initializing a Language Model.
```ts
interface LmParams {
  providerType: LmProviderType;
  serverUrl: string;
  apiKey?: string;
  onToken?: (t: string) => void;
  onStartEmit?: (data: IngestionStats) => void;
  onEndEmit?: (result: InferenceResult) => void;
  onError?: (err: string) => void;
  defaults?: LmDefaults;
}
```

- providerType: `LmProviderType` - Type of provider.
- serverUrl: `string` - The URL endpoint for the LM service.
- apiKey: `string | undefined` - Optional API key for authentication.
- onToken: `(t: string) => void | undefined` - Callback when a new token is received.
- onStartEmit: `(data: IngestionStats) => void | undefined` - Callback triggered when inference starts.
- onEndEmit: `(result: InferenceResult) => void | undefined` - Callback triggered when inference ends.
- onError: `(err: string) => void | undefined` - Callback triggered on errors.
- defaults: `LmDefaults | undefined` - Default settings.
Example:
```js
{
  "providerType": "koboldcpp",
  "serverUrl": "http://example.com/api",
  "apiKey": "your-api-key",
  "onToken": (t) => console.log(t),
  "onStartEmit": (data) => console.log(data),
  "onEndEmit": (result) => console.log(result),
  "onError": (err) => console.error(err)
}
```

LmDefaults
Default parameters that can be used with an LM provider.
```ts
interface LmDefaults {
  model?: ModelConf;
  inferenceParams?: InferenceParams;
}
```

- model: `ModelConf | undefined` - Default model conf to use.
- inferenceParams: `InferenceParams | undefined` - Default inference parameters.
Example:
```json
{
  "model": { "name": "gpt-3", "ctx": 2048 },
  "inferenceParams": { "max_tokens": 150, "top_k": 50 }
}
```
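How defaults are applied is up to the provider implementation. As an assumption-labeled sketch, per-request parameters could simply be spread over the defaults:

```ts
import { InferenceParams, LmDefaults } from '@locallm/types';

// Illustrative only: merge request parameters over provider defaults.
// Actual providers may resolve defaults differently.
function withDefaults(defaults: LmDefaults, params: InferenceParams): InferenceParams {
  return { model: defaults.model, ...defaults.inferenceParams, ...params };
}

const defaults: LmDefaults = {
  model: { name: 'gpt-3', ctx: 2048 },
  inferenceParams: { max_tokens: 150, top_k: 50 },
};
const params = withDefaults(defaults, { temperature: 0.3 });
// -> { model: { name: 'gpt-3', ctx: 2048 }, max_tokens: 150, top_k: 50, temperature: 0.3 }
```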
Progress and Callback Types

OnLoadProgressBasic
Represents the basic progress of a load operation.
```ts
interface OnLoadProgressBasic {
  total: number;
  loaded: number;
}
```

- total: `number` - The total number of items to load.
- loaded: `number` - The number of items that have been loaded so far.
Example:
```json
{
  "total": 100,
  "loaded": 50
}
```

OnLoadProgressFull
Represents the full progress of a load operation, including percentage.
```ts
interface OnLoadProgressFull extends OnLoadProgressBasic {
  percent: number;
}
```

- total: `number` - The total number of items to load.
- loaded: `number` - The number of items that have been loaded so far.
- percent: `number` - The percentage of items that have been loaded so far.
Example:
```json
{
  "total": 100,
  "loaded": 50,
  "percent": 50
}
```

OnLoadProgress
Type definition for a progress callback function with full details.
```ts
type OnLoadProgress = (data: OnLoadProgressFull) => void;
```

Example:
```ts
const onLoadProgress: OnLoadProgress = (data) => {
  console.log(`Loaded ${data.loaded} of ${data.total} (${data.percent}%)`);
};
```

BasicOnLoadProgress
Type definition for a basic progress callback function.
```ts
type BasicOnLoadProgress = (data: OnLoadProgressBasic) => void;
```

Example:
```ts
const onLoadProgress: BasicOnLoadProgress = (data) => {
  console.log(`Loaded ${data.loaded} of ${data.total}`);
};
```

Provider Types
LmProviderType
Represents the type of LM provider.
type LmProviderType = "llamacpp" | "koboldcpp" | "ollama" | "openai" | "browser";Example:
```json
{
  "providerType": "koboldcpp"
}
```

Usage Examples
Basic Model Configuration
```ts
import { ModelConf } from '@locallm/types';

const modelConfig: ModelConf = {
  name: 'gpt-3.5-turbo',
  ctx: 4096,
  info: { size: '175B', quant: 'q4_0' }
};
```

Making an Inference Request
```ts
import {
  InferenceParams,
  InferenceResult,
  ModelConf
} from '@locallm/types';

const model: ModelConf = { name: 'gpt-3.5-turbo', ctx: 4096 };
const params: InferenceParams = {
  stream: true,
  model,
  max_tokens: 150,
  temperature: 0.7,
  stop: ['###']
};

// This would be used with an actual LM provider
const result: InferenceResult = await provider.infer(
  'Tell me about artificial intelligence',
  params
);
```

Using Conversation History
```ts
import {
  HistoryTurn,
  InferenceOptions
} from '@locallm/types';

const history: Array<HistoryTurn> = [
  {
    user: 'Hello, how are you?',
    assistant: 'I am doing well, thank you for asking!'
  },
  {
    user: 'What can you help me with?',
    assistant: 'I can help with various tasks like answering questions, providing information, and assisting with creative writing.'
  }
];

const options: InferenceOptions = {
  history,
  system: 'You are a helpful assistant that responds briefly.'
};
```

Working with Tools
```ts
import {
  ToolSpec,
  ToolDefSpec,
  ToolCallSpec
} from '@locallm/types';

const weatherTool: ToolSpec = {
  name: 'getWeather',
  description: 'Get current weather information for a location',
  arguments: {
    location: {
      description: 'The city and state to get weather for',
      required: true
    }
  },
  execute: async (args) => {
    const { location } = args || {};
    // In a real implementation, this would call a weather API
    return { temperature: '72°F', condition: 'Sunny' };
  }
};

// Tool call example
const toolCall: ToolCallSpec = {
  id: '1',
  name: 'getWeather',
  arguments: { location: 'New York, NY' }
};
```

Setting up a Language Model Provider
```ts
import {
  LmProvider,
  LmProviderParams,
  LmDefaults,
  ModelConf
} from '@locallm/types';

const providerParams: LmProviderParams = {
  name: 'koboldcpp',
  serverUrl: 'http://localhost:5001',
  apiKey: 'your-api-key',
  onToken: (token) => console.log(`Token: ${token}`),
  onStartEmit: (stats) => console.log(`Started ingestion: ${stats.ingestionTime}ms`),
  onEndEmit: (result) => console.log(`Generated ${result.text}`),
  onError: (error) => console.error(`Error: ${error}`),
  defaults: {
    model: { name: 'gpt-3.5-turbo', ctx: 4096 },
    inferenceParams: { temperature: 0.7, max_tokens: 100 }
  }
};

// This would be used to initialize a provider
const provider: LmProvider = await createLmProvider(providerParams);
```

Loading a Model with Progress Tracking
```ts
import { OnLoadProgress } from '@locallm/types';

const onProgress: OnLoadProgress = (data) => {
  console.log(`Loading model: ${data.percent}% (${data.loaded}/${data.total})`);
};

await provider.loadModel(
  'gpt-3.5-turbo',
  4096,
  undefined, // urls
  onProgress
);
```

Property Reference Table
| Interface | Key Properties | Required |
|-----------|----------------|----------|
| ModelConf | name, ctx | name |
| InferenceParams | stream, model, template, max_tokens, etc. | none |
| InferenceResult | text, stats, serverStats | all |
| InferenceStats | ingestionTime, inferenceTime, totalTime, etc. | all |
| LmProvider | name, api, serverUrl, apiKey, model, models, etc. | most |
| ToolSpec | name, description, arguments, execute | all |
| LmProviderType | providerType | - |
:books: Api doc
