@sixdayswest/memq

v1.2.0, published 14 days ago

Compress AI memory tokens 50-95%. Works with any memory system (Claude, OpenClaw, ChatGPT, Google AOM). Zero LLM cost. Selective recall saves 90%+ context.

sixdayswest

memory token compression llm ai openclaw claude chatgpt google-aom context-window cost-savings deduplication recall agent middleware

MemQ — AI Memory Token Compression

Compresses AI memory content into compact LLM-parseable formats, reducing token usage by 50-95%. Works with any memory system. Zero LLM cost.

The Problem

AI agents accumulate memory over time. That memory gets loaded as context on every query, and you pay for every token. A 50K-token memory file at 200 queries/day adds up to about 300M input tokens a month, which on Claude Sonnet costs ~$900/month in input tokens alone (assuming roughly $3 per million input tokens). As memory grows, costs grow linearly.

What MemQ Does

Three functions. No server, no external calls, no LLM cost.

npm install @sixdayswest/memq

import { recall, compress, bulkOptimize } from '@sixdayswest/memq';

`recall(query, memory)` — Load only what's relevant

The high-value function. Instead of loading your entire memory file as context, recall figures out which sections matter for the current query and returns only those.

import { recall } from '@sixdayswest/memq';
import { readFileSync } from 'fs';

const memory = readFileSync('MEMORY.md', 'utf-8');
const result = recall('what database do we use?', memory);

// result.recalled = "# Tech Stack\n- PostgreSQL is primary database\n- Redis for caching"
// result.tokens_returned = 45    (instead of 12,000)
// result.tokens_saved = 11,955

Parameters:

query (string): the user's message or topic to match against
memoryContent (string): the full memory file contents
maxTokens (number, default: 2000): token budget for the response
options.format (string): memory format, one of 'openclaw-md', 'claude-md', 'chatgpt', 'google-aom', 'generic'
options.maxSections (number, default: 10): maximum number of sections to return
options.minRelevance (number, default: 0.1): minimum relevance score (0-1) a section needs to be returned
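
To make the parameters above concrete, here is a minimal sketch of how a selective-recall function could combine a relevance threshold, a section cap, and a token budget. This is not MemQ's actual algorithm: the keyword-overlap scoring, the 4-characters-per-token estimate, and every name in the block (`splitSections`, `score`, `recallSketch`) are assumptions made for illustration.

```typescript
// Hypothetical sketch only; MemQ's real scoring is not documented here.
interface Section {
  heading: string;
  body: string;
}

// Split markdown into sections at heading lines (lines starting with '#').
// Text before the first heading is ignored in this sketch.
function splitSections(memory: string): Section[] {
  const sections: Section[] = [];
  let current: Section | null = null;
  for (const line of memory.split("\n")) {
    if (/^#/.test(line)) {
      current = { heading: line, body: "" };
      sections.push(current);
    } else if (current !== null) {
      current.body += line + "\n";
    }
  }
  return sections;
}

// Lowercased words longer than two characters.
function words(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 2);
}

// Relevance = fraction of query words that occur in the section (0 to 1).
function score(query: string, section: Section): number {
  const queryWords = words(query);
  if (queryWords.length === 0) return 0;
  const sectionWords = words(section.heading + " " + section.body);
  const hits = queryWords.filter((w) => sectionWords.indexOf(w) !== -1).length;
  return hits / queryWords.length;
}

// Rough token estimate: about 4 characters per token (an assumption).
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function recallSketch(
  query: string,
  memory: string,
  maxTokens: number = 2000,
  options: { maxSections?: number; minRelevance?: number } = {}
): string {
  const maxSections = options.maxSections === undefined ? 10 : options.maxSections;
  const minRelevance = options.minRelevance === undefined ? 0.1 : options.minRelevance;

  // Drop weak matches, rank the rest, and cap the number of sections.
  const ranked = splitSections(memory)
    .map((section) => ({ section, score: score(query, section) }))
    .filter((r) => r.score >= minRelevance)
    .sort((a, b) => b.score - a.score)
    .slice(0, maxSections);

  // Add sections in rank order, skipping any that would exceed the budget.
  let used = 0;
  const kept: string[] = [];
  for (const r of ranked) {
    const text = (r.section.heading + "\n" + r.section.body).trim();
    const cost = estimateTokens(text);
    if (used + cost > maxTokens) continue;
    kept.push(text);
    used += cost;
  }
  return kept.join("\n\n");
}
```

In this sketch, raising `minRelevance` trades recall for precision, while `maxTokens` acts as a hard ceiling regardless of how many sections match.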

`compress(content)` — Structural compression

Reduces verbose memory text by 40-65%. No LLM calls, deterministic output.

import { compress } from '@sixdayswest/memq';

const result = compress(memoryContent);

// result.reduction_pct = 58.2
// result.original_tokens = 1394
// result.compressed_tokens = 583
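
For intuition, deterministic structural compression can be as simple as a fixed list of rewrite rules applied in order. The sketch below is an illustration under assumptions, not MemQ's rule set: the rules in `RULES` and the names `compressSketch` and `reductionPct` are hypothetical. The percentage helper does reproduce the arithmetic behind the example figures above.

```typescript
// Hypothetical rewrite rules, applied in order. Same input, same output.
const RULES: Array<[RegExp, string]> = [
  [/\bin order to\b/gi, "to"],                                 // shorten a wordy phrase
  [/\b(please note that|it is worth noting that)\s*/gi, ""],   // drop filler lead-ins
  [/[ \t]+/g, " "],                                            // collapse runs of spaces and tabs
  [/\n{3,}/g, "\n\n"],                                         // collapse runs of blank lines
];

function compressSketch(text: string): string {
  return RULES.reduce(
    (acc, [pattern, replacement]) => acc.replace(pattern, replacement),
    text
  ).trim();
}

// Percentage reduction, rounded to one decimal place.
function reductionPct(originalTokens: number, compressedTokens: number): number {
  if (originalTokens === 0) return 0;
  const ratio = (originalTokens - compressedTokens) / originalTokens;
  return Math.round(ratio * 1000) / 10;
}

const pct = reductionPct(1394, 583); // 58.2
```

Because the rules are plain string rewrites, the output is reproducible and needs no model call, which is the property the description above emphasizes.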

`bulkOptimize(content)` — Deep cleanup with deduplication

Combines compression with deduplication. Catches months of accumulated redundancy — paraphrased duplicates, not just exact matches.

import { bulkOptimize } from '@sixdayswest/memq';

const result = bulkOptimize(largeMemoryFile);

// result.reduction_pct = 94.7  (on a real 9,626-token file)
// result.duplicates_removed = 460
// result.sections_found = 16
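
One common way to catch paraphrased duplicates without an LLM is word-set overlap (Jaccard similarity). The sketch below shows that technique under stated assumptions: it is not necessarily how `bulkOptimize` works, and the 0.6 threshold and the names `tokens`, `jaccard`, and `dedupe` are hypothetical.

```typescript
// Lowercased word set for one line of memory.
function tokens(line: string): Set<string> {
  return new Set(
    line.toLowerCase().split(/[^a-z0-9]+/).filter((w) => w.length > 0)
  );
}

// Jaccard similarity: shared words divided by total distinct words.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared += 1;
  });
  return shared / (a.size + b.size - shared);
}

// Keep the first occurrence of each line; drop later near-duplicates.
function dedupe(
  lines: string[],
  threshold: number = 0.6 // assumed cutoff, tune for your data
): { kept: string[]; removed: number } {
  const kept: string[] = [];
  const keptSets: Set<string>[] = [];
  let removed = 0;
  for (const line of lines) {
    const current = tokens(line);
    const isDuplicate = keptSets.some((s) => jaccard(s, current) >= threshold);
    if (isDuplicate) {
      removed += 1;
    } else {
      kept.push(line);
      keptSets.push(current);
    }
  }
  return { kept, removed };
}
```

Comparing every line against every kept line is quadratic, so a sketch like this suits files of a few thousand lines; larger inputs would call for an index or hashing scheme.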

Supported Memory Formats

OpenClaw MEMORY.md — 'openclaw-md'
Claude CLAUDE.md — 'claude-md'
ChatGPT memories — 'chatgpt'
Google Always-On Memory agent records — 'google-aom'
Any markdown — 'generic'

Integration Examples

OpenClaw Context Engine

import { recall } from '@sixdayswest/memq';

export async function onContextLoad(query, memoryContent) {
  const { recalled } = recall(query, memoryContent, 2000);
  return recalled;
}

Claude Code / CLAUDE.md

import { recall, bulkOptimize } from '@sixdayswest/memq';
import { readFileSync, writeFileSync } from 'fs';

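// At session start: selective recall.
// userMessage is a placeholder for the current user prompt.
const memory = readFileSync('CLAUDE.md', 'utf-8');
const relevant = recall(userMessage, memory, 3000, { format: 'claude-md' });
// relevant.recalled holds the text to load as context

// Monthly cleanup: write the optimized copy to a separate file
const optimized = bulkOptimize(memory, 'claude-md');
writeFileSync('CLAUDE.compressed.md', optimized.optimized);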

Any Agent Framework

import { recall } from '@sixdayswest/memq';

// Works with LangChain, CrewAI, AutoGen, or custom agents
const memory = loadMemoryFromAnywhere();
const { recalled, tokens_saved } = recall(currentTask, memory);
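
As a rough illustration of where the recalled text goes in a custom agent, the sketch below builds a prompt from whatever a recall function returns. The recall function is passed in as a parameter so the block stays self-contained; `RecallFn` and `buildPrompt` are hypothetical names, and the return shape is assumed to include the `recalled` and `tokens_saved` fields used in the example above.

```typescript
// Minimal shape of a recall function, limited to what this sketch needs.
type RecallFn = (
  query: string,
  memory: string,
  maxTokens?: number
) => { recalled: string; tokens_saved: number };

// Build a prompt that contains only the memory relevant to the task.
function buildPrompt(
  task: string,
  memory: string,
  recallFn: RecallFn,
  maxTokens: number = 2000
): string {
  const { recalled } = recallFn(task, memory, maxTokens);
  const context =
    recalled.trim().length > 0 ? recalled : "(no relevant memory found)";
  return ["Relevant memory:", context, "", "Task:", task].join("\n");
}
```

In a real agent you would presumably pass MemQ's `recall` as `recallFn`; keeping it injectable also makes the prompt builder easy to unit test with a stub.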

Performance

recall: <50ms for memory files under 100K tokens
compress: <50ms for inputs under 50K tokens
bulkOptimize: <2s for inputs under 100K tokens
Handles inputs of 200K+ tokens
Zero external network calls
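
To check figures like these against your own memory files, a small timing wrapper is enough. The helper below is a generic sketch and not part of MemQ; `timeMs` is a hypothetical name, and `Date.now()` only gives millisecond resolution.

```typescript
// Run a function once and report how long it took in milliseconds.
function timeMs<T>(fn: () => T): { value: T; ms: number } {
  const start = Date.now();
  const value = fn();
  return { value, ms: Date.now() - start };
}

// Example with a stand-in workload; swap in a call such as
// () => recall(query, memory) to measure MemQ on real data.
const timed = timeMs(() => "a b c".split(" ").length);
```

A single run can be noisy, so averaging several runs gives a fairer comparison with the numbers listed above.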

Cost Savings

A user with 50K tokens of memory making 200 queries/day on Claude Sonnet:

Without MemQ: ~$900/month in input tokens
With MemQ recall: ~$45/month (95% reduction)
With compression + recall: ~$25/month

Savings scale with memory size and query volume.
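
The first two figures above follow from simple arithmetic, sketched below. The price of $3 per million input tokens and the 30-day month are assumptions; check your provider's current pricing before relying on the output. `monthlyInputCost` is a hypothetical helper name.

```typescript
// Assumption: $3 per million input tokens.
const PRICE_PER_MILLION_INPUT_TOKENS = 3;

function monthlyInputCost(
  tokensPerQuery: number,
  queriesPerDay: number,
  pricePerMillion: number = PRICE_PER_MILLION_INPUT_TOKENS,
  daysPerMonth: number = 30
): number {
  const tokensPerMonth = tokensPerQuery * queriesPerDay * daysPerMonth;
  return (tokensPerMonth / 1000000) * pricePerMillion;
}

// Full 50K-token memory loaded on every query.
const baseline = monthlyInputCost(50000, 200); // 900

// 95% fewer tokens per query: 2,500 instead of 50,000.
const withRecall = monthlyInputCost(2500, 200); // 45
```

Because the cost is a straight product of tokens per query, query volume, and price, any reduction in tokens per query carries through to the bill in the same proportion.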

Pricing

Free: 30 calls/day
Pro ($4.99/mo): 90 calls/day
Plus ($9.99/mo): 200 calls/day
Team ($14.99/seat/mo): Unlimited

Set MEMQ_LICENSE_KEY in your environment to activate your plan.

API Reference

RecallResult

interface RecallResult {
  recalled: string;
  query: string;
  sections_scanned: number;
  sections_returned: number;
  items_scanned: number;
  items_returned: number;
  tokens_returned: number;
  tokens_saved: number;
  relevance_scores: { section: string; score: number; items_matched: number }[];
}
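
The fields of `RecallResult` are enough to build simple reporting on top of a recall call. The helpers below are a sketch: `percentSaved` and `topSections` are hypothetical names, and `percentSaved` assumes that `tokens_returned + tokens_saved` equals the size of the original memory, which matches the earlier example (45 + 11,955 = 12,000).

```typescript
// Subset of the RecallResult interface used by these helpers.
interface RecallSummaryInput {
  tokens_returned: number;
  tokens_saved: number;
  relevance_scores: { section: string; score: number; items_matched: number }[];
}

// Share of the original memory that was not sent as context, in percent.
function percentSaved(result: RecallSummaryInput): number {
  const total = result.tokens_returned + result.tokens_saved;
  if (total === 0) return 0;
  return Math.round((result.tokens_saved / total) * 1000) / 10;
}

// Names of the n highest-scoring sections, best first.
function topSections(result: RecallSummaryInput, n: number = 3): string[] {
  return result.relevance_scores
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => b.score - a.score)
    .slice(0, n)
    .map((r) => r.section);
}

const summary = percentSaved({
  tokens_returned: 45,
  tokens_saved: 11955,
  relevance_scores: [],
}); // 99.6
```

Logging these two values per query is a cheap way to spot queries where recall returns too much or misses the expected section.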

CompressResult

interface CompressResult {
  compressed: string;
  original_tokens: number;
  compressed_tokens: number;
  reduction_pct: number;
}

BulkOptimizeResult

interface BulkOptimizeResult {
  optimized: string;
  tokens_before: number;
  tokens_after: number;
  reduction_pct: number;
  duplicates_removed: number;
  sections_found: number;
}

HTTP API

MemQ is also available as a hosted HTTP API on RapidAPI.

License

Business Source License 1.1 (BUSL-1.1). Free to use, but it cannot be offered as a competing hosted service. See LICENSE for details.