@martin-pi/sails-hook-llm
v1.0.2
Local LLM hook for Sails
Sails.js Local LLM Hook
A hook for Sails.js, built on node-llama-cpp. This is pretty much just an adapter to enable local LLM functionality (including function calls!) within Sails applications.
This is meant for simple applications, shoestring-budget side projects, and proofs of concept. I do not recommend running a local LLM in a production app.
Large Language Models are nondeterministic and inherently untrustworthy. This hook will have unexpected behavior.

Installation
1. Install using:

```shell
npm install -s @martin-pi/sails-hook-llm
```

2. Now obtain a .gguf model to load with your app. I tested with a quantized Meta-Llama-3-8B-Instruct.Q3_K_M.gguf model.

3. Place your .gguf in your sails app, or somewhere you can manage it. Set sails.config.llm.modelPath to point at your model.
   - I place my models in a directory named llm, which sits alongside my config and api folders. With that setup, my modelPath is ./llm/Meta-Llama-3-8B-Instruct.Q3_K_M.gguf
   - You should probably also add your llm directory to your .gitignore. These models are giant, and should not be checked into version control.

4. Run sails lift and ensure that the model loads. Depending on your hardware, you may need to give this hook extra time.

5. If you want to customize the settings, create a config file at config/llm.js. See the Configuration section of this doc for more details on configuration options.
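If all you want at this point is to point the hook at your model, a minimal config/llm.js can be as small as this (a sketch; adjust the path to wherever you placed your .gguf):

```javascript
// config/llm.js — minimal configuration: just the model path.
// All other options fall back to the hook's defaults.
module.exports.llm = {
  modelPath: './llm/Meta-Llama-3-8B-Instruct.Q3_K_M.gguf',
};
```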
Usage
sails.hooks.llm.prompt(messages, systemPrompt, functionNames, promptOptions)
Parameter | Type | Details
-------------- | ------------------- |:---------------------------------
messages | object[] | An array of messages formatted as { role: '', content: '' }
systemPrompt | string? | An optional prompt, which will override any user-submitted prompt
functionNames | string[]? | An optional array of registered function names which the assistant may call. By default, all registered functions are available. Provide an empty array to disallow all functions. To register a function, look at sails.hooks.llm.registerFunction()
promptOptions | object? | Custom prompt options for this call, which override the defaults set in sails.config.llm.promptOptions
For example, this is a minimal call which should get a response from the LLM:

```js
let assistantResponse = await sails.hooks.llm.prompt([{role:'user',content:'Hello!'}]);
```

By default, the LLM will accept system prompts from the messages array. You can override this by providing your own. This call should respond with a rhyme:
```js
let willRhyme = await sails.hooks.llm.prompt(
  [ {role:'system',content:'Respond without rhyming.'}, {role:'user',content:'Hello!'} ],
  'Respond only in rhymes.'
);
```

By default, the LLM will have access to every registered function. You can provide an array of functionNames to restrict the LLM's access. You can also call sails.hooks.llm.listFunctions() to get a list of all function names.
```js
console.log(sails.hooks.llm.listFunctions()); // ['getCurrentTime', 'getUserInfo']

let timeResponse = await sails.hooks.llm.prompt([{role:'user',content:'What time is it?'}]);
// timeResponse should include the current time.

let noFunctionResponse = await sails.hooks.llm.prompt([{role:'user',content:'What time is it?'}], undefined, []);
// noFunctionResponse will probably include an incorrect time, because the LLM will try to guess.

let userInfoResponse = await sails.hooks.llm.prompt([{role:'user',content:'Who am I?'}], undefined, ['getUserInfo']);
// userInfoResponse should include some information about the current user.
```

You can also provide custom prompt options to override the default ones in your config.
```js
let willBeShort = await sails.hooks.llm.prompt(
  [ {role:'user',content:'Tell me about '} ],
  'Respond with at least one paragraph.',
  undefined,
  { maxTokens: 128 }
);
```

sails.hooks.llm.registerFunction(name, config)
Provide some details about a function to create a ChatSessionFunction object which can be called by some models. This should be called in config/bootstrap.js, or elsewhere in your application before calling the LLM. It returns a ChatSessionFunction for use with llama.

You can completely disable function calling by setting sails.config.llm.useFunctions to false.
Parameter | Type | Details
-------------- | ------------------- |:---------------------------------
name | string | A name for your function, such as "callMyFunction".
config | object | A configuration object, containing a description of the function, a description of each parameter, and finally a handler function which the LLM will actually call.
For example, this is a function that allows the LLM to report the current date and time:

```js
await sails.hooks.llm.registerFunction('getCurrentDate', {
  description: "Get today's date, in ISO date format.",
  params: {
    type: "object",
    properties: {},
  },
  async handler(params) {
    return new Date().toISOString();
  },
});

let testFunctionCalling = await sails.hooks.llm.prompt(
  [{role:'user',content:'What time is it?'}],
  'You are a helpful assistant.',
  ['getCurrentDate']
);
// testFunctionCalling should contain the correct date and time.
```

I recommend building your functions as sails helpers, and calling them within handler functions. This way, you can also call these functions in your business logic.
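As a sketch of that pattern: the registration below wraps a hypothetical get-user-info helper (api/helpers/get-user-info.js — the helper name and its return shape are assumptions for illustration), with the pure formatting step split out so it can be reused elsewhere in your business logic.

```javascript
// Pure formatting step, independent of Sails — easy to reuse and test.
function formatUserInfo(user) {
  return `Name: ${user.name}; Email: ${user.email}`;
}

// Call this from config/bootstrap.js, before the first prompt.
async function registerUserInfoFunction(sails) {
  await sails.hooks.llm.registerFunction('getUserInfo', {
    description: 'Get the name and email address of the current user.',
    params: {
      type: 'object',
      properties: {},
    },
    async handler() {
      // The same helper can also be called from ordinary controllers/actions.
      const user = await sails.helpers.getUserInfo();
      return formatUserInfo(user);
    },
  });
}
```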
Configuration
If you want to customize the behavior of this hook, you can do so through these options. This example file includes the settings I used to test this hook.
```js
module.exports.llm = {
  _hookTimeout: 40000, // Wait 40 seconds for the model to load.
  modelPath: './llm/Meta-Llama-3-8B-Instruct.Q3_K_M.gguf', // Where is your model located?
  modelOptions: { // These options are passed to llama.cpp when loading the model.
    //gpu: false,
    vramPadding: (total) => { return total / 4; }, // Leave at least 25% of vram open.
  },
  promptOptions: { // These options are passed to the llm with each prompt by default.
    maxTokens: 512,
    budgets: {
      thoughtTokens: 256,
    },
  },
  expose: true, // If true, register an action at 'POST /llm/prompt' which calls the llm.
  logRequests: true, // If true, prompts will be logged inside of the prompt function.
  useFunctions: true, // If true, functions will be provided to the llm.
  defaultSystemPrompt: 'You are a helpful assistant.', // If no other prompts are provided, this will be used.
};
```

Example API
By default, this hook exposes an API endpoint to act as an example and let you test your LLM. This may be all you need, but in production I recommend disabling it and creating your own endpoint. Disable it by setting sails.config.llm.expose to false.
POST http://localhost:1337/llm/prompt
Request Body:
```json
{
  "context": [
    {
      "role": "user",
      "content": "Hello!"
    },
    {
      "role": "assistant",
      "content": "How can I help?"
    },
    {
      "role": "user",
      "content": "What is the time?"
    }
  ]
}
```

Response Body:
```json
{
  "status": "success",
  "data": {
    "role": "assistant",
    "content": "The current date is 2025-09-23.",
    "processingTime": 2.1,
    "sentAt": "2025-09-23T01:06:51.991Z"
  }
}
```

This endpoint assumes you will handle the chat history on the client side. Each call to the LLM must include the entire chat history from both sides of the conversation.
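A client that keeps the history itself might look like this sketch (it assumes the default endpoint above; the response shape matches the example, and the buildRequestBody helper is my own name, not part of the hook):

```javascript
// Client-side chat loop: accumulate the full history and resend it every call.
const history = [];

// Pure step: build the request body the endpoint expects.
function buildRequestBody(history, userMessage) {
  return { context: [...history, { role: 'user', content: userMessage }] };
}

async function sendMessage(userMessage) {
  const res = await fetch('http://localhost:1337/llm/prompt', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildRequestBody(history, userMessage)),
  });
  const { data } = await res.json();
  // Persist both sides of the exchange for the next request.
  history.push({ role: 'user', content: userMessage });
  history.push({ role: data.role, content: data.content });
  return data.content;
}
```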
Debugging
If you're running into issues with a model, try loading it in standalone llama.cpp or in Ollama before debugging on the Sails side.
