glide-mq
v0.15.1
High-performance message queue for Node.js with AI-native primitives - built on Valkey/Redis with Rust NAPI bindings
glide-mq
High-performance message queue for Node.js with first-class AI orchestration. Built on Valkey/Redis Streams with a Rust NAPI core.
Completes and fetches the next job in a single server-side function call (1 RTT per job), hash-tags every key for zero-config clustering, and ships AI-native primitives for cost tracking, token streaming, human-in-the-loop suspend/resume, model failover, TPM rate limiting, budget caps, rolling usage summaries, and vector search.
```sh
npm install glide-mq
```

General Usage

```typescript
import { Queue, Worker } from 'glide-mq';

const connection = { addresses: [{ host: 'localhost', port: 6379 }] };
const queue = new Queue('tasks', { connection });

await queue.add('send-email', { to: '[email protected]', subject: 'Welcome' });

const worker = new Worker(
  'tasks',
  async (job) => {
    await sendEmail(job.data.to, job.data.subject);
    return { sent: true };
  },
  { connection, concurrency: 10 },
);
```

AI Usage
```typescript
import { Queue, Worker } from 'glide-mq';

const queue = new Queue('ai', { connection });

await queue.add(
  'inference',
  { prompt: 'Explain message queues' },
  {
    fallbacks: [{ model: 'gpt-5.4-nano', provider: 'openai' }],
    lockDuration: 120000,
  },
);

const worker = new Worker(
  'ai',
  async (job) => {
    const result = await callLLM(job.data.prompt);
    await job.reportUsage({
      model: 'gpt-5.4',
      tokens: { input: 50, output: 200 },
      costs: { total: 0.003 },
    });
    await job.stream({ type: 'token', content: result });
    return result;
  },
  { connection, tokenLimiter: { maxTokens: 100000, duration: 60000 } },
);
```

When to use glide-mq
- Background jobs and task processing - email, image processing, data pipelines, webhooks, any async work.
- Scheduled and recurring work - cron jobs, interval tasks, bounded schedulers.
- Distributed workflows - parent-child trees, DAGs, fan-in/fan-out, step jobs, dynamic children.
- High-throughput queues over real networks - 1 RTT per job via Valkey Server Functions, up to 38% faster than alternatives.
- LLM pipelines and model orchestration - cost tracking, token streaming, model failover, budget caps without external middleware.
- Valkey/Redis clusters - hash-tagged keys out of the box with zero configuration.
How it's different
| Aspect | glide-mq |
| ------------------- | --------------------------------------------------------------------------------------------------------- |
| Network per job | 1 RTT - complete + fetch next in a single FCALL |
| Client | Rust NAPI bindings via valkey-glide - no JS protocol parsing |
| Server logic | Persistent Valkey Function library (FUNCTION LOAD + FCALL) - no per-call EVAL |
| Cluster | Hash-tagged keys (glide:{queueName}:*) route to the same slot automatically |
| AI-native | Cost tracking, token streaming, suspend/resume, fallback chains, TPM limits, budget caps, usage summaries |
| Vector search | KNN similarity queries over job data via Valkey Search |
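The hash-tag routing in the table follows the standard Valkey/Redis cluster rule: only the substring inside the first `{...}` of a key is hashed for slot assignment, so every `glide:{queueName}:*` key lands on the same slot. A minimal sketch of that rule (illustrative only, not glide-mq code):

```typescript
// Valkey/Redis cluster routing hashes only the substring inside the first
// {...} of a key (the "hash tag"), provided it is non-empty. Keys that share
// a tag are guaranteed to map to the same slot.
function hashTag(key: string): string {
  const open = key.indexOf('{');
  const close = open >= 0 ? key.indexOf('}', open + 1) : -1;
  // Non-empty tag found: hash only the tag. Otherwise: hash the whole key.
  return open >= 0 && close > open + 1 ? key.slice(open + 1, close) : key;
}

console.log(hashTag('glide:{tasks}:wait'));   // "tasks"
console.log(hashTag('glide:{tasks}:active')); // "tasks" — same slot as above
console.log(hashTag('plainkey'));             // "plainkey" — no tag, whole key hashed
```

This is why multi-key server functions (complete + fetch next in one FCALL) work in cluster mode without any client-side configuration.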
AI-native primitives
Seven primitives for LLM and agent workflows, built into the core API.
- Cost tracking - `job.reportUsage()` records model, tokens, cost, and latency per job. `queue.getFlowUsage()` aggregates across flows and `queue.getUsageSummary()` rolls usage up across queues.
- Token streaming - `job.stream(chunk)` pushes LLM output tokens in real time. `queue.readStream(jobId)` consumes them with optional long-polling.
- Suspend/resume - `job.suspend()` pauses mid-processor for human approval or a webhook callback. `queue.signal(jobId, name, data)` resumes with external input.
- Fallback chains - an ordered `fallbacks` array on job options. On failure, the next retry reads `job.currentFallback` for the alternate model/provider.
- TPM rate limiting - `tokenLimiter` on worker options enforces tokens-per-minute caps. Combine with the RPM `limiter` for dual-axis rate control.
- Budget caps - `FlowProducer.add(flow, { budget })` sets `maxTotalTokens` and `maxTotalCost` across all jobs in a flow. Jobs fail or pause when the budget is exceeded.
- Per-job lock duration - override `lockDuration` per job for adaptive stall detection. Short for classifiers, long for multi-minute LLM calls.
See Usage - AI-native primitives for full examples.
Features
- 1 RTT per job - complete current + fetch next in a single server-side function call
- Cluster-native - hash-tagged keys, zero cluster configuration
- Workflows - FlowProducer trees, DAGs with fan-in, chain/group/chord, step jobs, dynamic children
- Scheduling - 5-field cron with timezone, fixed intervals, bounded schedulers
- Retries - exponential, fixed, or custom backoff with dead-letter queues
- Rate limiting - per-group sliding window, token bucket, global queue-wide limits
- Broadcast - fan-out pub/sub with NATS-style subject filtering and independent subscriber retries
- Batch processing - process multiple jobs at once for bulk I/O
- Request-reply - `queue.addAndWait()` for synchronous RPC patterns
- Deduplication - simple, throttle, and debounce modes
- Compression - transparent gzip at the queue level
- Serverless - lightweight `Producer` and `ServerlessPool` for Lambda/Edge
- OpenTelemetry - automatic span emission with bring-your-own tracer
- In-memory testing - `TestQueue` and `TestWorker` with zero Valkey dependency
- Cross-language - HTTP proxy with request-reply, SSE streams, schedulers, flow create/read/tree/delete, usage summaries, and broadcast fan-out for non-Node.js services
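As a rough illustration of the retry semantics in the list above, exponential backoff produces a capped delay schedule before a job is handed to the dead-letter queue. This is a generic sketch of the technique, not glide-mq's internals; the function name and parameters are hypothetical:

```typescript
// Generic exponential backoff with a ceiling: delay doubles each attempt,
// capped at maxMs. After the last attempt a queue would typically route the
// job to a dead-letter queue. Illustrative sketch only.
function backoffDelays(baseMs: number, attempts: number, maxMs: number): number[] {
  return Array.from({ length: attempts }, (_, i) =>
    Math.min(baseMs * 2 ** i, maxMs),
  );
}

// 5 attempts, 1s base, capped at 10s
console.log(backoffDelays(1000, 5, 10_000)); // [1000, 2000, 4000, 8000, 10000]
```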
Performance
Benchmarked on AWS ElastiCache Valkey 8.2 (r7g.large) with TLS, EC2 client in the same region.
| Concurrency | glide-mq   | Leading Alternative | Delta |
| :---------: | ---------: | ------------------: | :---: |
| c=5         | 10,754 j/s |           9,866 j/s |  +9%  |
| c=10        | 18,218 j/s |          13,541 j/s | +35%  |
| c=15        | 19,583 j/s |          14,162 j/s | +38%  |
| c=20        | 19,408 j/s |          16,085 j/s | +21%  |
The advantage comes from completing and fetching the next job in a single FCALL. The savings compound over real network latency - exactly the conditions in every production deployment. At high concurrency both libraries converge toward the Valkey single-thread ceiling.
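Why the round-trip count matters can be seen with a back-of-the-envelope model: when workers are network-bound, throughput is roughly concurrency ÷ (round-trips per job × RTT), capped by the server's single-thread ceiling. The function below is that model, with made-up inputs for illustration; it is not a benchmark:

```typescript
// Back-of-the-envelope throughput model for RTT-bound queue workers:
// jobs/sec ≈ concurrency / (rttsPerJob * rttSeconds), capped by the server.
// Illustrative model only.
function estimateJobsPerSec(
  concurrency: number,
  rttMs: number,
  rttsPerJob: number,
  serverCeiling = Infinity,
): number {
  return Math.min(concurrency / (rttsPerJob * (rttMs / 1000)), serverCeiling);
}

// At 0.5 ms RTT and c=10, one round-trip per job doubles the RTT-bound
// limit versus two round-trips per job:
console.log(estimateJobsPerSec(10, 0.5, 1)); // 20000
console.log(estimateJobsPerSec(10, 0.5, 2)); // 10000
```

The same model shows why the gap narrows at high concurrency: once the RTT-bound estimate exceeds the server ceiling, both libraries hit the same cap.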
Reproduce with `npm run bench` or `npx tsx benchmarks/elasticache-head-to-head.ts` against your own infrastructure.
Examples
All examples live in glidemq-examples. Pick a directory, run `npm install`, then `npx tsx <file>.ts`.
| AI Orchestration | Core Patterns    | Framework Plugins |
| ---------------- | ---------------- | ----------------- |
| Usage & Costs    | Basics           | Hono              |
| Budget & TPM     | Workflows        | Fastify           |
| Streaming        | Advanced         | Hapi              |
| Agent Loops      | DAG Flows        | NestJS            |
| Search & Vectors | Scheduling       | Express           |
| SDK Integrations | Batch Processing | Next.js           |
When NOT to use glide-mq
- You need a log-based event streaming platform. glide-mq is a job/task queue, not a partitioned event log. It does not provide Kafka-style topic partitions, consumer offset management, or event replay.
- You need browser support. The Rust NAPI client requires a server-side runtime (Node.js 20+, Bun, or Deno with NAPI support).
- You need exactly-once semantics. glide-mq provides at-least-once delivery. Duplicate processing is rare but possible - design processors to be idempotent.
- You need to run without Valkey or Redis. Production use requires Valkey 7.0+ or Redis 7.0+. For dev/testing, `TestQueue`/`TestWorker` run fully in-memory.
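Because delivery is at-least-once, the safest pattern is to make each processor idempotent: key every side effect on the job ID so a redelivery becomes a no-op. A minimal sketch with an in-memory set (in production the seen-set would live in Valkey or your database):

```typescript
// Idempotent processing under at-least-once delivery: remember completed
// job IDs and skip duplicates. In-memory Set for illustration only.
const processed = new Set<string>();
let sideEffects = 0;

function processOnce(jobId: string, work: () => void): boolean {
  if (processed.has(jobId)) return false; // duplicate delivery: skip
  work();
  processed.add(jobId);
  return true;
}

processOnce('job-1', () => sideEffects++); // runs the work
processOnce('job-1', () => sideEffects++); // duplicate: skipped
console.log(sideEffects); // 1
```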
Documentation
| Guide         | Topics                                                                             |
| ------------- | ---------------------------------------------------------------------------------- |
| Usage         | Queue, Worker, Producer, request-reply, SSE, proxy, flow HTTP API, usage summaries |
| Workflows     | FlowProducer, DAG, chain/group/chord, dynamic children                             |
| Advanced      | Schedulers, rate limiting, dedup, compression, retries, DLQ                        |
| Broadcast     | Pub/sub fan-out, subject filtering                                                 |
| Observability | OpenTelemetry, metrics, job logs, dashboard                                        |
| Serverless    | Producer, ServerlessPool, Lambda/Edge, HTTP proxy                                  |
| Testing       | In-memory TestQueue and TestWorker                                                 |
| Wire Protocol | Cross-language FCALL specs, Python/Go examples                                     |
| Step Jobs     | Step-job workflows with moveToDelayed                                              |
| Durability    | Durability guarantees, persistence, delivery semantics                             |
| Architecture  | Internal architecture and design reference                                         |
| Migration     | API mapping guide for migrating from other queues                                  |
Ecosystem
| Package            | Description                                   |
| ------------------ | --------------------------------------------- |
| @glidemq/speedkey  | Valkey GLIDE client with native NAPI bindings |
| @glidemq/dashboard | Web UI for metrics, schedulers, job mutations |
| @glidemq/hono      | Hono middleware                               |
| @glidemq/fastify   | Fastify plugin                                |
| @glidemq/nestjs    | NestJS module                                 |
| @glidemq/hapi      | Hapi plugin                                   |
| glidemq.dev        | Full documentation site                       |
Contributing
Bug reports, feature requests, and pull requests are welcome.
License
Apache-2.0
