sis-tools
v0.2.0
Semantic Integration System - Intelligent tool resolution for LLMs using embeddings
sis-tools (Node.js)
Semantic Integration System (SIS) is a tool-resolution layer that uses embeddings to select the most relevant tools for a query at runtime.
Instead of sending every tool schema to an LLM up front, you resolve the top-k tools for the user’s query and only inject the relevant schemas.
Why SIS?
- Context savings: Only send relevant tool schemas to the LLM
- Scalability: Register hundreds of tools without bloating context
- Semantic matching: Uses embeddings to find tools by intent, not keywords
- Flexible: Works with OpenAI, Anthropic, Google, Cohere, and custom providers
Install
npm install sis-tools
SIS supports optional embedding providers. Install only what you use:
npm install openai # OpenAI embeddings
npm install @google/generative-ai # Google embeddings
npm install cohere-ai # Cohere embeddings
Quick start
import { SIS } from "sis-tools";
const sis = new SIS({
embeddingProvider: "openai",
providerOptions: {
apiKey: process.env.OPENAI_API_KEY,
},
defaultTopK: 5,
defaultThreshold: 0.3,
});
sis.register({
name: "web_search",
description: "Search the web for current information",
parameters: { query: { type: "string" } },
semanticHints: ["google", "lookup", "find online"],
handler: async ({ query }) => {
// your implementation
return String(query);
},
});
await sis.initialize();
const tools = await sis.resolve("search for latest news");
// tools: [{ name, schema, score, handler? }, ...]
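You can then run the top resolved tool through its registered handler. A minimal follow-up to the snippet above (the arguments here are illustrative; in practice they come from the LLM's tool call):
const top = tools[0];
if (top) {
  // Executes the handler registered for web_search
  const result = await sis.execute(top.name, { query: "latest news" });
  console.log(result);
}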
Configuration
Constructor options
interface SISOptions {
embeddingProvider?: "openai" | "cohere" | "google" | EmbeddingProvider;
providerOptions?: {
apiKey?: string;
model?: string;
dimensions?: number;
[key: string]: unknown;
};
defaultTopK?: number; // default: 5
defaultThreshold?: number; // default: 0.3
similarity?: SimilarityFunction;
scoring?: ScoringFunction;
validators?: ValidatorRegistry;
validateOnRegister?: boolean;
validateOnExecute?: boolean;
}
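For example, a configuration tuned for a smaller context window. The specific values and the embedding model name are illustrative, not recommendations:
import { SIS } from "sis-tools";
const sis = new SIS({
  embeddingProvider: "openai",
  providerOptions: {
    apiKey: process.env.OPENAI_API_KEY,
    model: "text-embedding-3-small", // illustrative model choice
  },
  defaultTopK: 3, // inject at most 3 tool schemas per query
  defaultThreshold: 0.4, // drop weak matches
  validateOnRegister: true,
});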
Tool registration
interface RegisterOptions {
name: string;
description: string;
parameters?: ToolParameters;
handler?: ToolHandler;
semanticHints?: string[];
examples?: ToolExample[];
metadata?: ToolMetadata;
}
Resolve formats
await sis.resolve("query", { format: "openai" });
await sis.resolve("query", { format: "anthropic" });
await sis.resolve("query", { format: "raw" });Examples
Use with OpenAI function calling
import { OpenAI } from "openai";
import { SIS } from "sis-tools";
const openai = new OpenAI();
const sis = new SIS({ embeddingProvider: "openai" });
sis.register({
name: "web_search",
description: "Search the web",
parameters: { query: { type: "string" } },
handler: async ({ query }) => searchApi(query),
});
await sis.initialize();
async function runAgent(userMessage: string) {
const tools = await sis.resolve(userMessage, { format: "openai" });
const response = await openai.chat.completions.create({
model: "gpt-4.1",
messages: [{ role: "user", content: userMessage }],
tools,
});
const toolCall = response.choices[0]?.message?.tool_calls?.[0];
if (toolCall) {
const result = await sis.execute(
toolCall.function.name,
JSON.parse(toolCall.function.arguments)
);
return result;
}
}
Use with Anthropic tool use
import Anthropic from "@anthropic-ai/sdk";
import { SIS } from "sis-tools";
const anthropic = new Anthropic();
const sis = new SIS({ embeddingProvider: "openai" });
await sis.initialize();
async function runAgent(userMessage: string) {
const tools = await sis.resolve(userMessage, { format: "anthropic" });
const response = await anthropic.messages.create({
model: "claude-3-5-sonnet-20241022",
max_tokens: 1024,
messages: [{ role: "user", content: userMessage }],
tools,
});
const toolUse = response.content.find((b) => b.type === "tool_use");
if (toolUse) {
const result = await sis.execute(toolUse.name, toolUse.input);
return result;
}
}
Bring your own embeddings
import { SIS } from "sis-tools";
import type { EmbeddingProvider } from "sis-tools";
class MyEmbeddings implements EmbeddingProvider {
readonly dimensions = 768;
async embed(text: string): Promise<number[]> {
const res = await fetch("https://my-embedding-service/embed", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ text }),
});
const { embedding } = await res.json();
return embedding;
}
async embedBatch(texts: string[]): Promise<number[][]> {
const res = await fetch("https://my-embedding-service/embed-batch", {
method: "POST",
headers: { "Content-Type": "application/json" },
body: JSON.stringify({ texts }),
});
const { embeddings } = await res.json();
return embeddings;
}
}
const sis = new SIS({ embeddingProvider: new MyEmbeddings() });
Custom scoring (priority boost)
import { SIS, PriorityScoring } from "sis-tools";
const sis = new SIS({
embeddingProvider: "openai",
scoring: new PriorityScoring(1.0),
});
sis.register({
name: "important_tool",
description: "An important tool",
metadata: { priority: 2.0 },
});
await sis.initialize();
Validation on register/execute
import { SIS, createStrictValidator } from "sis-tools";
const sis = new SIS({
embeddingProvider: "openai",
validators: createStrictValidator(),
validateOnRegister: true,
validateOnExecute: true,
});
// Throws ValidationError if tool schema is invalid
sis.register({
name: "bad_tool",
description: "x", // too short
parameters: {},
});
API
SIS
class SIS {
constructor(options: SISOptions)
register(options: RegisterOptions): Tool
store(options: StoreOptions): void
async initialize(): Promise<void>
async resolve(query: string, options?: ResolveOptions): Promise<ResolvedTool[]>
async resolveOne(query: string, threshold?: number): Promise<ResolvedTool | null>
async execute(toolName: string, params: object): Promise<unknown>
getTool(name: string): Tool | undefined
listTools(): string[]
get toolCount(): number
// Customization
get hooks(): HookRegistry
get validators(): ValidatorRegistry | undefined
get similarity(): SimilarityFunction
set similarity(fn: SimilarityFunction)
get scoring(): ScoringFunction
set scoring(fn: ScoringFunction)
registerHook(hook: Hook): void
unregisterHook(hook: Hook): boolean
}
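A small sketch exercising a few of these methods, assuming a sis instance set up as in the quick start (the query and threshold are illustrative):
// Resolve a single best match above a score threshold, or null if none qualifies
const best = await sis.resolveOne("what's the weather in Paris?", 0.5);
if (best) {
  console.log(best.name, best.score);
}
// Inspect the registry
console.log(sis.listTools(), sis.toolCount);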
ResolveOptions
interface ResolveOptions {
topK?: number;
threshold?: number;
format?: "raw" | "openai" | "anthropic" | string | ToolFormatter;
}
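For example, per-call overrides on a configured instance (the values are illustrative):
const tools = await sis.resolve("summarize this document", {
  topK: 3,
  threshold: 0.5,
  format: "anthropic",
});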
Troubleshooting
Provider not found
If you see an error like "requires the openai package", install the peer dependency:
npm install openai
No tools returned
- Lower defaultThreshold (try 0.0 to see all matches; see the snippet below)
- Increase defaultTopK
- Check tool descriptions and semanticHints for clarity
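For debugging, you can disable the threshold on a single call to see every match and its score (the query and topK are illustrative):
const all = await sis.resolve("my query", { topK: 50, threshold: 0 });
console.log(all.map((t) => ({ name: t.name, score: t.score })));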
Slow initialization
- Use a faster embedding model (e.g., text-embedding-3-small)
- Consider caching embeddings or using a persistent vector store
License
MIT
