npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

aggo-ai

v1.0.0

Published

AI plugin for aggo.js - Natural language to MongoDB pipeline conversion using OpenAI

Readme

@aggo/plugin-ai

🤖 AI-powered natural language to MongoDB pipeline conversion for aggo.js

Convert natural language queries into MongoDB aggregation pipelines using OpenAI, with automatic schema inference and optimized execution via aggo.

Installation

npm install @aggo/plugin-ai

Requirements:

  • Node.js 18+
  • OpenAI API key
  • aggo as a peer dependency

Quick Start

Environment Setup

export OPENAI_API_KEY="your-openai-api-key"

CLI Usage

# Basic natural language query
cat sales.jsonl | npx aggo-ai "total revenue by product category"

# Show inferred schema
cat data.jsonl | npx aggo-ai --schema-only

# Generate pipeline without executing
npx aggo-ai "average rating by genre" --file movies.jsonl --show-pipeline

# Use specific OpenAI model
cat logs.jsonl | npx aggo-ai "error count by service" --model gpt-4

# Get detailed explanation
npx aggo-ai "top 10 customers by order value" --file orders.jsonl --explain

Programmatic Usage

import { aiQuery, getSchema, generatePipeline } from '@aggo/plugin-ai';

const data = [
  { name: 'Alice', age: 30, department: 'Engineering', salary: 95000 },
  { name: 'Bob', age: 25, department: 'Marketing', salary: 75000 },
  { name: 'Carol', age: 35, department: 'Engineering', salary: 110000 },
];

// Execute natural language query
const result = await aiQuery(data, 'average salary by department');
console.log(result.results);
// Output: [
//   { _id: 'Engineering', avgSalary: 102500 },
//   { _id: 'Marketing', avgSalary: 75000 }
// ]

// Get schema information
const schema = getSchema(data);
console.log(schema);
// Output: {
//   name: 'string',
//   age: 'integer',
//   department: 'string',
//   salary: 'integer'
// }

// Generate pipeline only
const pipeline = await generatePipeline(
  'count employees by department',
  schema
);
console.log(pipeline.pipeline);
// Output: [
//   { $group: { _id: '$department', count: { $sum: 1 } } }
// ]

CLI Options

Usage: aggo-ai [query] [options]

Options:
  -f, --file <path>         Read data from file instead of stdin
  --schema-only            Show inferred schema without querying
  --show-pipeline          Print generated pipeline but don't run it
  --limit-sample <n>       Control rows sampled for schema inference
  --model <model>          Override default OpenAI model
  --explain                Include explanation of the generated pipeline
  --pretty                 Pretty-print JSON output
  --api-key <key>          OpenAI API key (or use OPENAI_API_KEY env var)
  -h, --help               Show help message

API Reference

aiQuery(documents, query, options)

Executes a natural language query against documents.

Parameters:

  • documents: Array of documents to query
  • query: Natural language query string
  • options: Configuration options

Returns: AIQueryResult with pipeline, results, schema, and performance metrics

getSchema(documents, options)

Infers schema from documents without executing a query.

Parameters:

  • documents: Array of documents to analyze
  • options: Schema inference options

Returns: Simplified schema object

generatePipeline(query, schema, samples, options)

Generates MongoDB pipeline from natural language without executing.

Parameters:

  • query: Natural language query
  • schema: Data schema or documents to infer from
  • samples: Sample documents for context
  • options: OpenAI configuration

Returns: Generated pipeline with metadata

Natural Language Examples

| Query | Generated Pipeline | | ---------------------------------------- | --------------------------------------------------------------------------------------------- | | "sum revenue" | [{"$group": {"_id": null, "total": {"$sum": "$revenue"}}}] | | "average score by category" | [{"$group": {"_id": "$category", "avg": {"$avg": "$score"}}}] | | "top 5 users by points" | [{"$sort": {"points": -1}}, {"$limit": 5}] | | "count orders where status is shipped" | [{"$match": {"status": "shipped"}}, {"$count": "total"}] | | "revenue by month for 2023" | [{"$match": {"year": 2023}}, {"$group": {"_id": "$month", "revenue": {"$sum": "$amount"}}}] |

Configuration

OpenAI Models

Supported models (default: gpt-4-turbo-preview):

  • gpt-4-turbo-preview - Best accuracy, higher cost
  • gpt-4 - Good balance of accuracy and cost
  • gpt-3.5-turbo - Faster, lower cost, less accurate

Schema Inference

Options for getSchema() and aiQuery():

  • sampleSize: Number of documents to sample (default: 100)
  • maxDepth: Maximum depth for nested objects (default: 5)

Performance

The plugin provides detailed performance metrics:

const result = await aiQuery(data, 'query');
console.log(result.performance);
// {
//   schemaInferenceMs: 15,
//   pipelineGenerationMs: 850,
//   executionMs: 5,
//   totalMs: 870
// }

Error Handling

Common errors and solutions:

Missing API Key

❌ Error: OpenAI API key is required
💡 Set OPENAI_API_KEY environment variable or use --api-key option

Invalid Query

❌ Error: Unable to generate valid pipeline
💡 Try rephrasing your query or check the schema with --schema-only

API Limits

❌ Error: OpenAI API quota exceeded
💡 Check your OpenAI account usage and billing

Advanced Usage

Custom Model Configuration

const result = await aiQuery(data, 'query', {
  model: 'gpt-3.5-turbo',
  temperature: 0.2,
  maxTokens: 800,
});

Schema-First Development

// First, understand your data
const schema = getSchema(documents);
console.log('Available fields:', Object.keys(schema));

// Then craft specific queries
const result = await aiQuery(documents, 'average price by category');

Batch Processing

const queries = ['total sales', 'average order value', 'top selling products'];

const results = await Promise.all(
  queries.map(query => aiQuery(documents, query))
);

Contributing

See CONTRIBUTING_PLUGINS.md for plugin development guidelines.

License

MIT