npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2026 – Pkg Stats / Ryan Hefner

cyber-mysql-openai

v0.3.2

Published

Intelligent natural language to SQL translator with self-correction capabilities using OpenAI and MySQL

Downloads

756

Readme

Cyber-MySQL-OpenAI

Intelligent natural language to SQL translator for Node.js

Cyber-MySQL-OpenAI is a powerful Node.js library that translates natural language queries into valid SQL, executes them against MySQL databases, and returns results accompanied by human-readable explanations — all powered by OpenAI.

Spanish documentation / Documentación en español


Table of Contents


Features

  • Natural language to SQL translation — Converts plain-text questions into valid SQL queries
  • Automatic execution — Runs generated queries directly against your MySQL database
  • Self-correcting error handling — Detects and autonomously corrects failed queries (up to 3 reflection attempts)
  • Natural language explanations — Translates technical query results into user-friendly responses
  • Structured output with function calling — Uses OpenAI function calling for predictable JSON responses with automatic text fallback
  • Business context awareness — Enrich the AI with domain-specific knowledge about your tables and columns
  • Foreign key relationship detection — Automatically discovers and uses FK relationships for accurate JOINs
  • Few-shot examples — Provide reference question/SQL pairs to guide the model
  • Confidence scoring — Each query includes a confidence score indicating how well it answers the question
  • Multi-language support — English and Spanish with dynamic runtime switching
  • In-memory cache — Optional high-performance caching layer with variable TTL and automatic cleanup
  • Full TypeScript support — Complete type definitions for a seamless developer experience
  • Highly configurable — Adjust logging, caching, language, and model settings to fit your needs
  • Enhanced Intelligence (v0.3.0) — Improved prompts with SQL optimization rules, aliases, and better disambiguation
  • Custom Instructions — Inject your own business rules and response styles into the AI
  • Query Validation Layer — Automatically checks generated SQL for safety, table existence, and cartesian products
  • Schema Caching — Configurable TTL to reduce database load and improve latency
  • Token Usage Tracking — Detailed token counts and estimated cost per query
  • Query History — In-memory log of executed queries with performance statistics
  • Advanced logging — Structured logging system with token usage tracking and prompt/response audit trails
  • Token Optimization (v0.3.2) — Compressed schema format, lightModel for subtasks, and filtered schema in reflections — 50–70% fewer tokens per request

Installation

npm install cyber-mysql-openai

System Requirements

| Requirement | Details | | ------------ | ---------------------------------------------------- | | Node.js | v16.x or higher (developed and tested with v22.15.0) | | Database | MySQL or MariaDB | | API Key | A valid OpenAI API key |


Basic Usage

import { CyberMySQLOpenAI } from "cyber-mysql-openai";
import "dotenv/config";

const translator = new CyberMySQLOpenAI({
  database: {
    host: process.env.DB_HOST || "localhost",
    port: 3306,
    user: process.env.DB_USER || "",
    password: process.env.DB_PASSWORD || "",
    database: process.env.DB_DATABASE || "",
    ssl: false,
  },
  openai: {
    apiKey: process.env.OPENAI_API_KEY || "",
    model: "gpt-4",
  },
  language: "en",
});

async function main() {
  try {
    const result = await translator.query(
      "What was the best-selling product last month?",
    );

    console.log("Generated SQL:", result.sql);
    console.log("Results:", result.results);
    console.log("Explanation:", result.naturalResponse);

    await translator.close();
  } catch (error) {
    console.error("Error:", error);
  }
}

main();

Intelligence Features

New in v0.2.0

Cyber-MySQL-OpenAI can be configured with business context to dramatically improve the accuracy and relevance of generated SQL queries.

Business Context

Provide the AI with domain-specific knowledge about your database:

import { CyberMySQLOpenAI, SchemaContext } from "cyber-mysql-openai";

const context: SchemaContext = {
  businessDescription:
    "E-commerce platform for electronic products with orders, customers, and inventory",
  tables: {
    orders: {
      description: "Customer purchase orders with status tracking",
      columns: {
        status: "Order status: 'pending', 'shipped', 'delivered', 'cancelled'",
        total_amount: "Total in USD including taxes and shipping",
      },
    },
    products: {
      description: "Product catalog with pricing and stock levels",
      columns: {
        sku: "Unique product identifier used in warehouse systems",
        price: "Current retail price in USD (before discounts)",
      },
    },
  },
};

const translator = new CyberMySQLOpenAI({
  database: {
    /* ... */
  },
  openai: {
    /* ... */
  },
  context,
});
```

---

## Intelligence Features (v0.3.0)

### 1. Custom Instructions & Response Style

You can now inject specific business rules and control the personality of the AI:

```typescript
const translator = new CyberMySQLOpenAI({
  // ...
  context: {
    businessDescription: "E-commerce platform",
    // v0.3.0: Custom rules
    customInstructions: [
      "Always exclude soft-deleted records (deleted_at IS NOT NULL)",
      "When asked for 'revenue', use the sum of total_amount from orders table",
    ],
    // v0.3.0: Response style ('concise', 'detailed', 'technical')
    responseStyle: "concise",
  },
});

2. Query Validation

Every generated query is automatically validated before execution. The system checks for:

  • Dangerous operations: Only SELECT statements are allowed; UPDATE, DROP, etc., are blocked.
  • Schema validation: Checks for non-existent tables or columns (best-effort).
  • Cartesian products: Warns about potential cartesian products due to missing JOIN conditions.
  • Performance checks: Identifies missing LIMIT clauses on potentially large result sets.

3. Schema Caching

Reduce latency and database load by caching schema definitions:

const translator = new CyberMySQLOpenAI({
  // ...
  schemaTTL: 600000, // Cache schema for 10 minutes (default: 5 min)
});

// Force refresh if schema changes
translator.refreshSchema();

4. Query History & Stats

Track performance and usage during the session:

const stats = translator.getQueryStats();
console.log(stats);
// {
//   totalQueries: 10,
//   successfulQueries: 9,
//   averageExecutionTime: 450ms,
//   totalTokensUsed: 5200
// }

// Get last 5 queries
const history = translator.getQueryHistory(5);

5. Token Usage & Cost Estimation

Each result now includes detailed token usage and estimated cost:

const result = await translator.query("Sales count?");
console.log(result.tokenUsage);
// {
//   promptTokens: 500,
//   completionTokens: 50,
//   totalTokens: 550,
//   estimatedCost: 0.0015 // USD (based on current model pricing)
// }

Few-Shot Examples

Guide the model with reference question/SQL pairs specific to your domain:

const context: SchemaContext = {
  businessDescription: "E-commerce platform",
  tables: {
    /* ... */
  },
  examples: [
    {
      question: "What are this month's total sales?",
      sql: "SELECT SUM(total_amount) as total_sales FROM orders WHERE MONTH(created_at) = MONTH(CURRENT_DATE()) AND YEAR(created_at) = YEAR(CURRENT_DATE())",
    },
    {
      question: "Which products are low on stock?",
      sql: "SELECT name, stock FROM products WHERE stock < 10 ORDER BY stock ASC",
    },
  ],
};

Foreign Key Relationships

The library automatically detects FK relationships from your database's information_schema and includes them in the AI prompt. This enables the model to construct accurate JOINs without you having to describe table relationships manually.

Function Calling with Fallback

The library uses OpenAI's function calling feature for structured, predictable responses:

  • Primary mode (function calling): Returns { sql, confidence, reasoning } as structured JSON
  • Fallback mode (text): If the model does not support function calling, the library automatically falls back to text parsing with sqlCleaner

This dual-mode approach ensures compatibility across all OpenAI models (GPT-3.5, GPT-4, GPT-4o, etc.).

Confidence Score

Every query result includes an optional confidence field (0–1) when function calling is available:

const result = await translator.query(
  "What are the top 5 products by revenue?",
);

console.log(result.confidence); // 0.95
console.log(result.sql); // SELECT ...

Use confidence to implement logic like warning users when the model is unsure, or triggering a manual review for low-confidence queries.


Token Optimization (v0.3.2)

Three built-in optimizations reduce token consumption by 50–70% with zero behavioral changes.

1. Compressed Schema Format

The schema sent to OpenAI is now ultra-compact (~20 tokens/table vs ~60 before):

# Before
Tabla orders (Customer purchases): id (int, PRIMARY KEY), status (varchar), total_amount (decimal)...

# After
orders: id* status total_amount→customers

* = PRIMARY KEY, →table = FK reference. Business descriptions still appear when configured.

2. lightModel for Subtasks

Use a cheaper model for reflection and natural language formatting while keeping your main model for SQL generation:

const translator = new CyberMySQLOpenAI({
  openai: {
    apiKey: "...",
    model: "gpt-4o", // SQL generation (needs intelligence)
    lightModel: "gpt-4o-mini", // Reflection & formatting (20x cheaper, same quality) (v0.3.2)
  },
});

Or via environment variable: OPENAI_LIGHT_MODEL=gpt-4o-mini

Cost impact: gpt-4o-mini is ~20x cheaper than gpt-4o. Since ~40% of calls are subtasks, this alone reduces total cost by ~40–85%.

3. Filtered Schema in Reflections

When a query fails and needs correction, only the schema of the tables used in the failed SQL is sent to the LLM — not the entire database schema. In a 30-table database, this can reduce the reflection prompt by ~80%.


Configuration Options

const translator = new CyberMySQLOpenAI({
  // Database connection
  database: {
    host: "localhost",
    port: 3306,
    user: "username",
    password: "password",
    database: "my_database",
    ssl: false,
    socketPath: "/path/to/mysql.sock", // Optional
  },

  // OpenAI settings
  openai: {
    apiKey: "your_api_key",
    model: "gpt-4o", // Main model for SQL generation
    lightModel: "gpt-4o-mini", // Optional: cheaper model for reflection & formatting (v0.3.2)
  },

  // Cache settings (optional)
  cache: {
    enabled: true, // Enable/disable cache
    maxSize: 1000, // Maximum number of cache entries
    defaultTTL: 300000, // Default TTL in milliseconds (5 min)
    cleanupInterval: 300000, // Cleanup interval in milliseconds
  },

  // General settings
  maxReflections: 3, // Max correction attempts on SQL errors
  logLevel: "info", // 'error' | 'warn' | 'info' | 'debug' | 'none'
  logDirectory: "./logs", // Directory for log files
  logEnabled: true, // Set to false to disable all logging
  language: "en", // 'en' (English) or 'es' (Spanish)
});

Cache System

Cyber-MySQL-OpenAI includes an optional in-memory cache that significantly improves response times for repeated or similar queries.

How It Works

  • Query normalization — Queries are normalized before cache lookup to maximize hit rates
  • Variable TTL — Time-to-live is determined dynamically based on query type:
    • Schema/metadata queries: 1 hour
    • Aggregate queries (COUNT, SUM, AVG, GROUP BY): 15 minutes
    • Simple queries: 5 minutes
  • Automatic cleanup — Expired entries are periodically removed
  • Performance metrics — Real-time statistics including hit rate and memory usage

Basic Cache Usage

const translator = new CyberMySQLOpenAI({
  // ... database and OpenAI config
  cache: {
    enabled: true,
    maxSize: 1000,
    defaultTTL: 300000,
    cleanupInterval: 300000,
  },
});

const result1 = await translator.query("Show me all users"); // Hits the database
const result2 = await translator.query("Show me all users"); // Returns from cache

console.log("From cache:", result2.fromCache); // true
console.log("Execution time:", result2.executionTime); // Significantly faster

Cache Management

// Retrieve cache performance statistics
const stats = translator.getCacheStats();
console.log("Hit rate:", stats.hitRate);
console.log("Entries:", stats.totalEntries);

// Clear all cached entries
translator.clearCache();

// Toggle cache at runtime
translator.disableCache();
translator.enableCache();
console.log("Cache active:", translator.isCacheEnabled());

Best Practices for API Integration

Use a shared global instance to maximize cache effectiveness across requests:

// api-instance.ts
import { CyberMySQLOpenAI } from "cyber-mysql-openai";

export const translator = new CyberMySQLOpenAI({
  // ... configuration
  cache: { enabled: true, maxSize: 2000 },
});

// api-routes.ts
import { translator } from "./api-instance";

app.get("/query", async (req, res) => {
  const result = await translator.query(req.body.question);
  res.json({
    ...result,
    cached: result.fromCache,
    responseTime: result.executionTime,
  });
});

Note: The cache is shared across all requests and users. For user-specific data, implement appropriate cache invalidation strategies.

For additional examples, see docs/cache-examples.md.


Multi-Language Support

The library supports English and Spanish for all responses, error messages, and OpenAI prompts. The language can be configured at initialization or changed dynamically at runtime.

Configuration

// Set during initialization
const translator = new CyberMySQLOpenAI({
  // ... other configuration
  language: "en", // 'en' for English, 'es' for Spanish
});

// Change at runtime
translator.setLanguage("es");
console.log("Current language:", translator.getLanguage());

What Gets Localized

  • Error messages
  • Prompts sent to OpenAI
  • Natural language explanations
  • Interface labels and status text

Example: Dynamic Switching

translator.setLanguage("en");
const english = await translator.query("What are the top 5 products?");
console.log(english.naturalResponse); // Response in English

translator.setLanguage("es");
const spanish = await translator.query(
  "¿Cuáles son los 5 productos principales?",
);
console.log(spanish.naturalResponse); // Response in Spanish

API Reference

CyberMySQLOpenAI

| Method | Description | | --------------------------- | -------------------------------------------------------------------------------- | | constructor(config) | Creates a new instance with the provided configuration | | query(prompt, options?) | Translates a natural language prompt to SQL, executes it, and returns the result | | executeSQL(sql, options?) | Executes a raw SQL query directly (no translation) | | close() | Closes all database connections | | setLanguage(lang) | Sets the response language ('en' or 'es') | | getLanguage() | Returns the current language setting |

Cache Methods

| Method | Description | | ----------------------------------- | -------------------------------------------------------------- | | getCacheStats() | Returns cache statistics (hit rate, memory usage, entry count) | | clearCache() | Removes all cached entries | | enableCache() | Enables the cache system | | disableCache() | Disables the cache system | | isCacheEnabled() | Returns whether the cache is currently active | | invalidateCacheByTable(tableName) | Removes cached entries related to a specific table |

Query Options

const result = await translator.query(
  "What was the month with highest sales?",
  {
    detailed: true, // Request a detailed analytical response
    bypassCache: true, // Skip cache and force a fresh query
  },
);

console.log("Simple response:", result.naturalResponse);
console.log("Detailed response:", result.detailedResponse);

Response Types

TranslationResult (returned by query):

| Field | Type | Description | | ------------------ | -------------- | ------------------------------------------------------------- | | sql | string | The generated SQL query | | results | any[] | Query results from the database | | reflections | Reflection[] | Error correction history (if any) | | attempts | number | Total execution attempts | | success | boolean | Whether the query succeeded | | confidence | number? | Confidence score (0–1), available when using function calling | | naturalResponse | string | Human-readable explanation | | detailedResponse | string | Detailed analysis (when detailed: true) | | executionTime | number | Total execution time in milliseconds | | fromCache | boolean | Whether the result was served from cache |


Troubleshooting

Nodemon Restarts

If Nodemon restarts constantly due to log file generation, add this to your package.json or nodemon.json:

{
  "nodemonConfig": {
    "ignore": ["*.log", "tmp/*", "logs/*"]
  }
}

Detailed Response Issues

  1. Update to the latest version: npm update cyber-mysql-openai
  2. Verify your OpenAI API key has sufficient credits
  3. Reduce log verbosity with logLevel: 'warn' or logLevel: 'error'

Log Configuration

// Disable all logging
const translator = new CyberMySQLOpenAI({
  logEnabled: false,
});

// Log only errors
const translator = new CyberMySQLOpenAI({
  logLevel: "error",
});

Project Status

This project is in stable release and under active development. Contributions and feedback are welcome.

Current Limitations

  • Very complex queries may require multiple correction iterations
  • Some advanced SQL constructs may not be interpreted correctly
  • Performance depends on database schema complexity and OpenAI model latency

Roadmap

  • ~~Business context and schema metadata~~ (shipped in v0.2.0)
  • ~~Foreign key relationship detection~~ (shipped in v0.2.0)
  • ~~Function calling with structured output~~ (shipped in v0.2.0)
  • ~~Few-shot example support~~ (shipped in v0.2.0)
  • ~~Custom instructions and response styles~~ (shipped in v0.3.0)
  • ~~Query validation layer~~ (shipped in v0.3.0)
  • ~~Schema caching with configurable TTL~~ (shipped in v0.3.0)
  • ~~Token usage tracking and cost estimation~~ (shipped in v0.3.0)
  • ~~Query history and performance stats~~ (shipped in v0.3.0)
  • ~~Reflection context injection bugfix~~ (shipped in v0.3.1)
  • ~~Compressed schema format (66% fewer tokens)~~ (shipped in v0.3.2)
  • ~~lightModel for subtasks (85% cheaper reflections)~~ (shipped in v0.3.2)
  • ~~Filtered schema in reflections~~ (shipped in v0.3.2)
  • Support for additional SQL dialects
  • Streaming responses
  • Expanded documentation and usage examples

License

MIT