offline-intelligence

v0.1.7

Published 4 days ago

Private On-Device Inference Engine — run LLMs locally with zero configuration


offlineintelligence

llm ai inference offline local on-device private gguf llama

Offline Intelligence

Private On-Device Inference Engine — run LLMs locally with zero configuration.

Installation

npm install offline-intelligence

Quick Start

const { OfflineIntelligence } = require("offline-intelligence");

// Create SDK — auto-detects your hardware (NVIDIA, AMD, Intel, Apple Silicon)
const sdk = new OfflineIntelligence();

// Download inference engine (first run only, ~200MB, stored in AppData)
sdk.ensureReady();

// Load a GGUF model and start inference
sdk.loadModel("path/to/model.gguf");

// Chat
const response = sdk.chat("What is the capital of France?");
console.log(response.content);

// Cleanup
sdk.close();
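
If `loadModel` or `chat` throws, the Quick Start never reaches `sdk.close()`. The sketch below wraps the same steps in `try`/`finally` so cleanup always runs. `withModel` is a hypothetical helper, not part of the package. It assumes the SDK calls block until they finish (as the Quick Start suggests) and that `close()` is safe to call even when an earlier step failed.

```javascript
// Hypothetical helper (not part of offline-intelligence): runs `fn` against a
// loaded model and calls close() even if a step throws.
// Assumption: ensureReady(), loadModel(), and chat() are blocking calls, and
// close() tolerates being called after a failed step.
function withModel(sdk, modelPath, fn) {
  try {
    sdk.ensureReady();         // downloads the engine on first run only
    sdk.loadModel(modelPath);  // starts the local inference server
    return fn(sdk);            // caller's work, e.g. one or more chat() calls
  } finally {
    sdk.close();               // runs on success and on error
  }
}

// Usage (requires the package to be installed):
// const { OfflineIntelligence } = require("offline-intelligence");
// const answer = withModel(
//   new OfflineIntelligence(),
//   "path/to/model.gguf",
//   (sdk) => sdk.chat("What is the capital of France?").content
// );
// console.log(answer);
```

Passing the SDK instance in as a parameter keeps the helper easy to exercise with a stub object in tests.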

Configuration

const sdk = new OfflineIntelligence({
  ctx_size: 4096,       // Context window size
  gpu_layers: 28,       // GPU layers to offload (0 = CPU only)
  threads: 8,           // CPU threads
  batch_size: 512,      // Batch size for prompt processing
  env_file: ".env",     // Optional .env file (does NOT pollute your environment)
});
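
Hard-coding `threads: 8` may not suit every machine. As a rough sketch, the options object can be derived from the host instead. `buildConfig` is a hypothetical helper, and the default values in it are illustrative assumptions, not the package's real defaults; only the option names come from the example above.

```javascript
const os = require("os");

// Hypothetical helper: builds an options object using the option names from
// the Configuration example. The default values are illustrative assumptions.
function buildConfig(overrides = {}) {
  const cores = os.cpus().length || 1;  // os.cpus() can be empty on some hosts
  return {
    ctx_size: 4096,
    gpu_layers: 0,                      // CPU only unless overridden
    threads: Math.max(1, cores - 1),    // leave one core for the rest of the app
    batch_size: 512,
    ...overrides,                       // caller-supplied values win
  };
}

// Usage (requires the package to be installed):
// const { OfflineIntelligence } = require("offline-intelligence");
// const sdk = new OfflineIntelligence(buildConfig({ gpu_layers: 28 }));
```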

Full Message Format

const response = sdk.chat([
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Explain quantum computing in one sentence." }
], {
  maxTokens: 500,
  temperature: 0.5
});
console.log(response.content);
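
The message array also makes multi-turn chat possible: keep the history and resend it on each call. A minimal sketch follows. `createConversation` is a hypothetical helper. It assumes `chat()` blocks and returns an object with a `content` string, as in the examples above, and that the local server accepts an `"assistant"` role for earlier replies (the README only shows `"system"` and `"user"`).

```javascript
// Hypothetical helper: keeps a running message history so each chat() call
// sees the earlier turns.
// Assumptions: chat() is blocking and returns { content }; the "assistant"
// role is accepted for previous replies.
function createConversation(sdk, systemPrompt, options = {}) {
  const messages = [{ role: "system", content: systemPrompt }];
  return {
    ask(text) {
      messages.push({ role: "user", content: text });
      let response;
      try {
        response = sdk.chat(messages, options);
      } catch (err) {
        messages.pop();  // drop the unanswered user turn so history stays consistent
        throw err;
      }
      messages.push({ role: "assistant", content: response.content });
      return response.content;
    },
    history() {
      return messages.slice();  // copy, so callers cannot mutate the history
    },
  };
}

// Usage, with `sdk` set up as in the Quick Start:
// const convo = createConversation(sdk, "You are a helpful assistant.", { maxTokens: 500 });
// console.log(convo.ask("Explain quantum computing in one sentence."));
// console.log(convo.ask("Now explain it to a ten-year-old."));
```

Note that the history grows with every turn, so long conversations will eventually exceed the configured `ctx_size`.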

How It Works

new OfflineIntelligence() — Detects GPU hardware, creates data directories. Instant.
ensureReady() — Downloads the best inference engine for your hardware. First run only.
loadModel(path) — Spawns a local inference server on a free port. Manages all DLLs automatically.
chat(messages) — Sends requests to the local server. All inference stays on your machine.

No data leaves your device. No API keys. No internet required after initial setup.

Status

console.log(sdk.status); // 0 = NotStarted, 1 = Degraded, 2 = Ready
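
A raw number is hard to read in logs. The sketch below maps the three documented codes to their names and adds a guard that fails fast when the SDK is not ready. Both functions are hypothetical helpers. Whether a `Degraded` SDK can still serve requests is not stated here, so the guard conservatively accepts only `Ready`.

```javascript
// Names for the documented status codes: 0, 1, 2 (index = code).
const STATUS_NAMES = ["NotStarted", "Degraded", "Ready"];

// Hypothetical helper: turns a status code into a readable name.
function describeStatus(status) {
  const known = Number.isInteger(status) && status >= 0 && status < STATUS_NAMES.length;
  return known ? STATUS_NAMES[status] : `Unknown(${status})`;
}

// Hypothetical guard: throws unless sdk.status is 2 (Ready).
// Assumption: only Ready is treated as usable.
function assertReady(sdk) {
  if (sdk.status !== 2) {
    throw new Error(`SDK not ready: ${describeStatus(sdk.status)}`);
  }
}

// Usage, with `sdk` set up as in the Quick Start:
// console.log(describeStatus(sdk.status));
// assertReady(sdk);
```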

Platform Support

| Platform | GPU Support |
|----------|------------|
| Windows x64 | NVIDIA (CUDA), AMD (HIP), Intel (SYCL), Vulkan, CPU |
| macOS ARM64 | Apple Metal |
| macOS x64 | CPU |
| Linux x64 | NVIDIA (CUDA), AMD (ROCm), Vulkan, CPU |
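
To check up front whether the current machine appears in the table, the rows can be keyed by Node's `process.platform` and `process.arch` values. This is a sketch only: `backendsFor` is a hypothetical helper, the lists are copied from the table above, and the SDK's own hardware detection remains the source of truth.

```javascript
// Backend lists copied from the Platform Support table, keyed by
// `${process.platform}-${process.arch}`.
const SUPPORTED_BACKENDS = {
  "win32-x64": ["NVIDIA (CUDA)", "AMD (HIP)", "Intel (SYCL)", "Vulkan", "CPU"],
  "darwin-arm64": ["Apple Metal"],
  "darwin-x64": ["CPU"],
  "linux-x64": ["NVIDIA (CUDA)", "AMD (ROCm)", "Vulkan", "CPU"],
};

// Hypothetical helper: returns the listed backends for a platform/arch pair,
// or null when the pair is not in the table.
function backendsFor(platform = process.platform, arch = process.arch) {
  return SUPPORTED_BACKENDS[`${platform}-${arch}`] || null;
}

// Usage:
// const backends = backendsFor();
// if (!backends) console.warn("This platform is not listed as supported.");
```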

License

Apache 2.0
