keyquill v3.6.0
Bring Your Own Key to any web app — without trusting their server. Browser extension SDK for secure LLM API key management.
Keyquill
Bring Your Own Key to any web app — without trusting their server.
A browser extension + SDK that lets users securely use their own LLM API keys from any web application. Keys never leave the browser extension — no server relay needed.
The Problem
Web apps that use LLM APIs face a dilemma:
- Server-side proxy: The app server sees the user's API key (security risk)
- Direct browser calls: Blocked by CORS (LLM providers don't allow browser-origin requests)
- Store key in localStorage: Vulnerable to XSS attacks
The Solution
Your Web App Keyquill Extension LLM Provider
┌──────────┐ ┌─────────────────────┐ ┌──────────┐
│ │ chrome.runtime │ │ fetch() │ │
│ SDK │───────────────────>│ Service Worker │────────────>│ OpenAI │
│ │ (messages only) │ + Key Storage │ (no CORS) │ Anthropic│
│ │<───────────────────│ │<────────────│ Gemini │
│ │ Port (streaming) │ chrome.storage │ SSE │ etc. │
└──────────┘ │ .session │ └──────────┘
└─────────────────────┘
Keys stay HERE. Always.

Security properties:
- API keys stored in chrome.storage.session — inaccessible to web-page JavaScript
- Keys cleared automatically when the browser closes
- Extension service worker makes CORS-free calls to LLM providers
- Per-origin consent (MetaMask-style): first use from each origin requires explicit user approval via a consent popup. Grants are stored in chrome.storage.local and can be revoked from the extension popup.
- Key management (registerKey/deleteKey) is restricted to the extension popup — web pages cannot register or delete keys
- Web app never sees the key — only sends messages and receives streamed text
Quick Start
1. Install the SDK
npm install keyquill

2. Use in your app
keyquill@3 (current major) uses a capability-first API — the app declares what it needs, the user's policy picks the actual model. Three ergonomic tiers:
import { Keyquill } from "keyquill";
const quill = new Keyquill();
if (await quill.isAvailable()) {
if (!(await quill.isConnected())) {
try {
await quill.connect();
} catch (err) {
// USER_DENIED / TIMEOUT — handle gracefully
return;
}
}
// ── Tier 1: zero-config ──────────────────────────────
// Uses the key's default model. Simplest possible chat.
const { completion } = await quill.chat({
messages: [{ role: "user", content: "Hello!" }],
});
console.log(completion.content);
// ── Tier 2: capability-declared (recommended) ────────
// The broker picks the best model in the user's allowlist that
// satisfies every capability. `tone` abstracts over temperature.
for await (const event of quill.chatStream({
messages: [{ role: "user", content: "Debug this code..." }],
requires: ["reasoning", "long_context"],
tone: "precise",
maxOutput: 2048,
})) {
if (event.type === "delta") process.stdout.write(event.text);
}
// Tool calling — `tool_use` is implied by passing `tools`.
const { completion: res } = await quill.chat({
messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
tools: [
{
type: "function",
function: {
name: "get_weather",
description: "Get current weather",
parameters: {
type: "object",
properties: { location: { type: "string" } },
required: ["location"],
},
},
},
],
});
if (res.tool_calls) {
console.log(res.tool_calls[0].function.name); // "get_weather"
}
// ── Tier 3: full control ─────────────────────────────
// Pin the exact model + parameters.
const { completion: pro } = await quill.chat({
messages: [{ role: "user", content: "Prove the central limit theorem." }],
prefer: {
model: "gpt-5.4-pro",
reasoningEffort: "high",
temperature: 1, // reasoning models require 1
},
});
// Vision (multimodal) — `vision` is implied by an image ContentPart.
for await (const event of quill.chatStream({
messages: [
{
role: "user",
content: [
{ type: "text", text: "What is in this image?" },
{ type: "image_url", image_url: { url: "data:image/png;base64,..." } },
],
},
],
requires: ["vision"],
})) {
if (event.type === "delta") process.stdout.write(event.text);
}
// ── Preview: dry-run before committing ───
// Tells you which model would run, the estimated cost, and whether a
// consent prompt would fire — without actually sending the request.
const plan = await quill.preview({
messages: [{ role: "user", content: "Long analysis prompt…" }],
requires: ["reasoning"],
tone: "precise",
});
if (plan.kind === "ready") {
console.log(`~$${plan.estimatedCostUSD.toFixed(4)} with ${plan.model.displayName}`);
}
}

3. Install the extension
Load the extension from packages/keyquill-extension/dist/ in Chrome:
- Go to chrome://extensions/
- Enable "Developer mode"
- Click "Load unpacked" and select the dist folder
API Reference
new Keyquill(options?)
| Option | Type | Default | Description |
| ------------- | -------- | ----------- | ----------------------------------------- |
| extensionId | string | auto-detect | Chrome extension ID |
| timeout | number | 5000 | Timeout for non-streaming operations (ms) |
quill.isAvailable(): Promise<boolean>
Check if the extension is installed and responsive. Result is cached for 30 seconds.
quill.isConnected(): Promise<boolean>
Check whether the current origin has an active consent grant. Call this before deciding whether to show a "Connect" button.
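Putting isAvailable, isConnected, and connect together, a minimal connection gate might look like this sketch. The SDK surface is restated as a local interface so the example is self-contained; in a real app you would pass a Keyquill instance from the Quick Start, and the helper name ensureConnected is illustrative:

```typescript
// Minimal slice of the documented SDK surface, restated locally so the
// sketch is self-contained; in a real app use the Keyquill class itself.
interface KeyquillLike {
  isAvailable(): Promise<boolean>;
  isConnected(): Promise<boolean>;
  connect(timeoutMs?: number): Promise<void>;
}

type GateResult = "ready" | "no-extension" | "denied";

// Decide what the UI should show before making any chat calls.
async function ensureConnected(quill: KeyquillLike): Promise<GateResult> {
  if (!(await quill.isAvailable())) return "no-extension"; // show an install link
  if (await quill.isConnected()) return "ready";           // grant already present
  try {
    await quill.connect();                                 // opens the consent popup
    return "ready";
  } catch {
    return "denied";                                       // USER_DENIED or TIMEOUT
  }
}
```

Keeping the three checks in one helper means the consent popup only ever fires after the cheaper availability and grant checks have passed.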
quill.connect(timeoutMs?: number): Promise<void>
Request permission for the current origin. Opens a consent popup the first time
(60 second default timeout for user interaction). Resolves when the user approves,
throws with USER_DENIED or TIMEOUT otherwise. Subsequent calls return
instantly if the grant is already present.
try {
await quill.connect();
} catch (err) {
// User denied or timed out — show a "Try again" prompt
}

quill.disconnect(): Promise<void>
Revoke the current origin's grant. The user will be prompted again on the next
connect() call.
quill.listProviders(): Promise<ProviderSummary[]>
List registered providers. No key material is returned — only hints like sk-t...st12.
Requires an active connection (call connect() first).
quill.registerKey(provider, params): Promise<void>
Popup-only. Calling this from a web page throws BLOCKED. API key material must not travel through the web-page channel. Users register keys via the extension popup (toolbar icon).
quill.deleteKey(provider): Promise<void>
Popup-only. Same rationale as registerKey: a compromised origin should not be able to wipe user keys. Users delete keys via the extension popup.
quill.testKey(provider): Promise<{ reachable: boolean }>
Test connectivity to a provider.
quill.chat(params): Promise<{ completion; keyId }>
Non-streaming chat completion. Returns the full response plus the keyId that serviced it.
quill.chatStream(params): AsyncGenerator<StreamEvent>
Stream a chat completion. First event is always { type: "start", keyId, provider, label } so callers can tell which key serviced the request.
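A consumer that handles every event variant might look like this sketch. The StreamEvent union is restated locally so the example is self-contained, and the helper name collectStream is illustrative:

```typescript
// StreamEvent union as documented in the API reference, restated locally
// so this sketch compiles on its own.
type StreamEvent =
  | { type: "start"; keyId: string; provider: string; label: string }
  | { type: "delta"; text: string }
  | { type: "tool_call_delta"; tool_calls: unknown[] }
  | { type: "done"; finish_reason?: string; usage?: { promptTokens: number; completionTokens: number } }
  | { type: "error"; code: string; message: string };

// Accumulate deltas into a single string; surface errors as exceptions.
async function collectStream(events: AsyncIterable<StreamEvent>): Promise<string> {
  let text = "";
  for await (const event of events) {
    switch (event.type) {
      case "start":
        console.log(`served by ${event.provider} (${event.label})`);
        break;
      case "delta":
        text += event.text; // append streamed tokens
        break;
      case "done":
        if (event.usage) console.log(`${event.usage.completionTokens} output tokens`);
        break;
      case "error":
        throw new Error(`${event.code}: ${event.message}`);
    }
  }
  return text;
}
```

In a real app you would pass quill.chatStream({...}) directly as the events argument.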
type StreamEvent =
| { type: "start"; keyId: string; provider: string; label: string }
| { type: "delta"; text: string }
| { type: "tool_call_delta"; tool_calls: ToolCallDelta[] }
| {
type: "done";
finish_reason?: string;
usage?: { promptTokens: number; completionTokens: number };
}
  | { type: "error"; code: string; message: string };

quill.preview(params): Promise<PlanPreview>
Dry-run the resolver. Returns the model that would service the request, its estimated cost, and whether a consent prompt or policy rejection would fire — without issuing a provider fetch or opening any popup. Useful for pre-flight UX: cost previews, "this will need approval" hints, or capability-fallback logic.
const plan = await quill.preview({
messages: [{ role: "user", content: "Prove Fermat's Last Theorem." }],
requires: ["reasoning", "long_context"],
tone: "precise",
});
if (plan.kind === "ready") {
console.log(
`Would use ${plan.model.displayName} (${plan.model.releaseStage}); ` +
`~$${plan.estimatedCostUSD.toFixed(4)}, ` +
`${plan.estimatedTokens.output} output tokens.`,
);
} else if (plan.kind === "consent-required") {
console.log(`Heads up — ${plan.message}`);
} else {
console.error(`Would be rejected: ${plan.message}`);
}

type PlanPreview =
| {
kind: "ready";
keyId: string;
provider: string;
model: {
id: string;
displayName: string;
capabilities: readonly Capability[];
releaseStage: "stable" | "preview" | "deprecated";
};
estimatedCostUSD: number;
estimatedTokens: { input: number; output: number };
selectionReason:
| "default"
| "explicit"
| "capability-match"
| "preferred-per-capability";
}
| {
kind: "consent-required";
reason:
| "model-outside-allowlist"
| "model-in-denylist"
| "high-cost"
| "capability-missing";
message: string;
proposedModel?: { id: string; displayName: string };
}
  | { kind: "rejected"; reason: string; message: string };

Preview requires an active connection — a compromised origin cannot probe the user's policy without prior consent.
ChatParams (shared by chat and chatStream)
interface ChatParams {
messages: ChatMessage[]; // conversation (text / vision / tool results)
tools?: Tool[]; // function calling
toolChoice?: ToolChoice; // "none" | "auto" | "required" | specific
responseFormat?: ResponseFormat; // "text" | "json_object" | "json_schema"
// Capability-first fields
requires?: Capability[]; // capabilities the broker must satisfy
tone?: "precise" | "balanced" | "creative";
maxOutput?: number; // max output tokens (clamped by policy)
prefer?: {
model?: string; // Tier 3 explicit model pin
provider?: string; // narrow to a specific provider
temperature?: number;
topP?: number;
reasoningEffort?: "minimal" | "low" | "medium" | "high";
};
keyId?: string; // explicit key selection
}
type Capability =
| "tool_use" | "structured_output" | "vision" | "audio"
| "reasoning" | "long_context" | "streaming" | "cache"
  | "fast" | "cheap" | "multilingual" | "code";

Migration history
From the legacy keyquill@0.3.x (snake_case) API
Pin it if you're not ready to migrate — it still works against the
installed extension via an internal wire translator. When you migrate
to @3, rewrite the top-level fields per this table:
| @0.3.x (legacy) | @3 (current) |
| --- | --- |
| model: "gpt-4o" | prefer: { model: "gpt-4o" } |
| temperature: 0.7 | prefer: { temperature: 0.7 } — or tone: "balanced" |
| top_p: 0.9 | prefer: { topP: 0.9 } |
| max_tokens: 2048 | maxOutput: 2048 |
| max_completion_tokens: 2048 | maxOutput: 2048 |
| reasoning_effort: "high" | prefer: { reasoningEffort: "high" } |
| tool_choice: "required" | toolChoice: "required" |
| response_format: { type: "json_object" } | responseFormat: { type: "json_object" } |
| provider: "openai" | prefer: { provider: "openai" } |
| stop: [...] | (removed — not commonly used, re-request if needed) |
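Applied to a typical call, the table above translates like this (the model name and parameter values are illustrative):

```typescript
// @0.3.x call shape (legacy, snake_case) — still accepted via the wire translator:
const legacyParams = {
  model: "gpt-4o",          // illustrative model name
  temperature: 0.7,
  max_tokens: 2048,
  tool_choice: "required",
};

// The same request in the @3 shape:
const currentParams = {
  prefer: { model: "gpt-4o", temperature: 0.7 },
  maxOutput: 2048,
  toolChoice: "required",
};
```

Note that model and temperature both move under prefer, while the token-limit and tool-choice fields are simply renamed at the top level.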
Between majors since @1
If you're coming from @1.x or @2.x, the ChatParams shape is
identical — only a few type names on the SDK surface changed:
| Deprecated name | Replacement | Removed in |
| --- | --- | --- |
| VaultRequest | KeyquillRequest | @2 |
| VaultResponse | KeyquillResponse | @2 |
| KeySummary.defaultModel | KeySummary.effectiveDefaultModel | @3 |
| KeySummary.defaults | KeySummary.policy | @3 |
| KeySummary.isActive | (removed — per-origin bindings drive routing) | @3 |
# Install
npm install keyquill@3 # current
npm install keyquill@0.3   # legacy snake_case API (frozen)

Supported Providers
The wire protocol is OpenAI Chat Completions format. Any OpenAI-compatible provider works out of the box.
| Provider | Base URL | Type |
| ----------- | --------------------------------------------------------- | ---------- |
| OpenAI | https://api.openai.com/v1 | Native |
| Anthropic | https://api.anthropic.com/v1 | Translated |
| Gemini | https://generativelanguage.googleapis.com/v1beta/openai | Compatible |
| Groq | https://api.groq.com/openai/v1 | Compatible |
| Mistral | https://api.mistral.ai/v1 | Compatible |
| DeepSeek | https://api.deepseek.com/v1 | Compatible |
| Together AI | https://api.together.xyz/v1 | Compatible |
| Fireworks | https://api.fireworks.ai/inference/v1 | Compatible |
| xAI (Grok) | https://api.x.ai/v1 | Compatible |
| Ollama | http://localhost:11434/v1 | Compatible |
Anthropic is the only provider requiring translation (OpenAI format → Messages API). All others receive requests as-is.
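A specific provider from the table can be targeted through the same prefer block used for Tier 3. For example, a locally served Ollama model (the model name "llama3.1" is illustrative, as is the prompt):

```typescript
// Request params pinning a local Ollama model via the prefer block.
// Any model served by the local Ollama instance would work here.
const params = {
  messages: [{ role: "user", content: "Summarize this file." }],
  prefer: {
    provider: "ollama",  // the extension routes this to http://localhost:11434/v1
    model: "llama3.1",
  },
};
// Then: const { completion } = await quill.chat(params);
```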
Comparison with Alternatives
| Approach       | Key Location      | XSS Safe | Server Trust | CORS     |
| -------------- | ----------------- | -------- | ------------ | -------- |
| Server proxy   | Server memory     | Yes      | Required     | N/A      |
| localStorage   | Browser JS        | No       | N/A          | Blocked  |
| sessionStorage | Browser JS        | No       | N/A          | Blocked  |
| Keyquill       | Extension storage | Yes      | Not needed   | Bypassed |
Framework Support
Zero dependencies. Works with any framework:
- React / Next.js
- Vue / Nuxt
- Svelte / SvelteKit
- Preact
- Vanilla JavaScript/TypeScript
License
MIT
