sql-chatbot-agent

v1.2.1

Published

14 days ago

AI chatbot for any PostgreSQL app — auto-discovers schema, indexes code, executes SQL, streams answers via chat widget. Run with npx or use as Express middleware.

0High
0Medium
0Low

bhumit4220

chatbot sql postgresql express middleware ai llm openai

sql-chatbot-agent

AI-powered database chatbot that runs as a standalone CLI or Express middleware. Point it at any PostgreSQL database and get an AI assistant that can query your data, explain your code, and help users navigate your app.

Zero configuration required -- it auto-discovers your database schema, indexes your codebase, and serves a chat widget.

Quick Start (CLI)

The fastest way to get started is the standalone CLI. No existing Express app needed.

Install globally

npm install -g sql-chatbot-agent

Initialize a config file

sql-chatbot-agent init

This creates a chatbot.config.json in the current directory and adds it to .gitignore.

Edit the file with your database URL, then start the chatbot:

sql-chatbot-agent

The chatbot will be running at http://localhost:3456 with a chat widget ready to use.

By default, it uses OpenAI (gpt-4o-mini). See Providers for other options.

One-liner with npx

# OpenAI
npx sql-chatbot-agent --db postgresql://localhost/mydb --key sk-xxx --code ./app

# With Groq
npx sql-chatbot-agent --db postgresql://localhost/mydb --provider groq --key gsk_xxx --code ./app

# Or use environment variables
export OPENAI_API_KEY=sk-xxx
npx sql-chatbot-agent --db postgresql://localhost/mydb --code ./app

Configuration

chatbot.config.json

Created by sql-chatbot-agent init:

{
  "databaseUrl": "postgresql://user:password@localhost:5432/your_database",
  "provider": "openai",
  "llmApiKey": "",
  "codePaths": ["./src"],
  "port": 3456,
  "secret": ""
}

| Field | Type | Description | |-------|------|-------------| | databaseUrl | string | PostgreSQL connection URL (required) | | provider | string | LLM provider: openai, groq, or ollama (default: auto-detect) | | llmApiKey | string | API key for the LLM provider (not needed for Ollama) | | llmModel | string | Model name override (default: provider-specific) | | llmBaseUrl | string | API base URL override (default: provider-specific) | | codePaths | string[] | Directories to scan for route detection, enum discovery, and business logic context (default: ["./src"]) | | port | number | Port for the standalone server (default: 3456) | | secret | string | Secret token for authentication (optional, recommended for production) | | defaultFilters | object | Row-level filters injected into every grammar SELECT — see Domain Hints | | aliases | object | User-word → entity-name overrides — see Domain Hints |

Backward compat: groqApiKey is still accepted and mapped to llmApiKey automatically.

Code Indexing

The codePaths option (or --code CLI flag) controls which directories the chatbot scans for source code. This powers three features:

Route detection — navigation and guidance answers (Express, Rails, Django, Next.js, etc.)
Enum & constant discovery — model-level enums (Rails enum, Django choices, TypeORM decorators, etc.) are surfaced as context for accurate SQL generation
Business logic context — validation rules, calculations, and domain logic help the LLM generate better queries

Framework-Specific Paths

| Framework | Recommended codePaths | |-----------|------------------------| | Express / React / Next.js / Hono | ["./src"] | | Rails | ["./app", "./config"] | | Django | ["./myapp"] (your app directories) | | Laravel | ["./app", "./routes"] | | Flask / FastAPI | ["./app"] or ["."] | | Spring Boot (Java/Kotlin) | ["./src/main/java"] or ["./src/main/kotlin"] | | Go (Gin / Echo / Fiber) | ["./cmd", "./internal"] | | Phoenix / Elixir | ["./lib"] | | SvelteKit / Nuxt | ["./src"] | | ASP.NET | ["./Controllers"] | | Rust (Actix / Axum) | ["./src"] | | Sinatra | ["."] |

Tip: When in doubt, point to your project root. The indexer automatically skips node_modules, .git, dist, build, vendor, target, __pycache__, etc.

Supported File Types

.js, .ts, .jsx, .tsx, .rb, .py, .erb, .vue, .php, .java, .go, .cs, .ex, .exs, .svelte, .kt, .rs, .dart, .scala

Limits

Maximum 2000 files indexed (configurable)
Files are scanned on first request (lazy initialization)
Use /chatbot/api/refresh to re-index after code changes

CLI Flags

| Flag | Short | Description | |------|-------|-------------| | --db | | PostgreSQL connection URL | | --provider | | LLM provider: ollama, groq, or openai | | --key | | API key for the LLM provider | | --model | | Model name override | | --base-url | | API base URL override | | --code | | Directory to index (repeatable: --code ./app --code ./config) | | --port | -p | Port for the standalone server | | --secret | | Secret token for authentication |

Environment Variables

| Variable | Description | |----------|-------------| | DATABASE_URL | PostgreSQL connection URL | | LLM_PROVIDER | LLM provider: openai, groq, or ollama | | LLM_API_KEY | API key for the LLM provider | | OPENAI_API_KEY | OpenAI API key (auto-detects provider) | | GROQ_API_KEY | Groq API key (auto-detects provider when no OPENAI_API_KEY) | | LLM_MODEL | Model name override | | LLM_BASE_URL | API base URL override | | CHATBOT_SECRET | Secret token for authentication | | PORT | Port for the standalone server |

Priority Order

Configuration is resolved in this order (highest priority first):

CLI flags (--db, --key, etc.)
Environment variables (DATABASE_URL, GROQ_API_KEY, etc.)
Config file (chatbot.config.json)

Providers

sql-chatbot-agent works with any OpenAI-compatible LLM API. Three providers are preconfigured:

| Provider | Type | API Key | Default Model | Cost | |----------|------|---------|---------------|------| | OpenAI | Cloud | Required (platform.openai.com) | gpt-4o-mini | Pay-per-use | | Groq | Cloud | Required (console.groq.com) | llama-3.3-70b-versatile | Pay-per-use | | Ollama | Local | Not needed | llama3.1:8b | Free (self-hosted) |

Auto-detection

If you don't specify --provider, the chatbot auto-detects:

OPENAI_API_KEY set → uses openai
GROQ_API_KEY set (and no OPENAI key) → uses groq
Otherwise defaults to openai (requires --key or LLM_API_KEY env)

OpenAI (Default)

export OPENAI_API_KEY=sk-xxx
npx sql-chatbot-agent --db postgresql://localhost/mydb --code ./src

Groq

npx sql-chatbot-agent --db postgresql://localhost/mydb --provider groq --key gsk_xxx

Ollama (Local, Offline)

For fully offline use. Requires installing Ollama and pulling a model.

ollama pull llama3.1:8b && ollama serve
npx sql-chatbot-agent --db postgresql://localhost/mydb --provider ollama

Any OpenAI-Compatible API

Use --base-url and --model to connect to any OpenAI-compatible API:

npx sql-chatbot-agent --db postgresql://localhost/mydb \
  --base-url http://my-llm-server:8080/v1 \
  --key my-api-key \
  --model my-model-name

Authentication

For production use, set a secret token to protect the chatbot API.

Setup

Provide a secret via any configuration method:

# CLI flag
sql-chatbot-agent --secret my-secret-token

# Environment variable
CHATBOT_SECRET=my-secret-token sql-chatbot-agent

# Or in chatbot.config.json
# { "secret": "my-secret-token" }

How It Works

When a secret is configured:

Widget script (/chatbot/widget.js) -- sets a chatbot_token cookie on load
API endpoints (/chatbot/api/ask, /chatbot/api/refresh) -- validate the cookie or a Bearer token in the Authorization header
Health endpoint (/chatbot/api/health) -- remains open (no sensitive data)

Requests without a valid token receive a 401 Unauthorized response.

Without a Secret

If no secret is configured, the chatbot API is open to anyone. The CLI prints a warning on startup:

Warning: No secret configured. The chatbot API is open to anyone.

This is fine for local development but should not be used in production.

.gitignore

Add chatbot.config.json to your .gitignore to avoid committing secrets. The init command does this automatically.

Setup by Architecture

A. Standalone CLI against any framework (Django, Laravel, Flask, Spring, Go, Phoenix, Rails, Express, …)

Use this when you don't want to touch the host app's source tree.

npm install -g sql-chatbot-agent
cd /path/to/your/app
sql-chatbot-agent init
sql-chatbot-agent

Embed the widget from the chatbot's port:

<script src="http://localhost:3456/chatbot/widget.js"></script>

This is how the project's 11 sample-app sweeps run (Listmonk, Keycloak, n8n, Taiga, Directus, Umami, Miniflux, Gitea, Redmine, Chatwoot, Saleor).

Recommended codePaths per framework: see the Code Indexing table.

B. Express monolith (Express serves your UI)

npm install sql-chatbot-agent

const express = require('express');
const { sqlChatbot } = require('sql-chatbot-agent');

const app = express();
app.use('/chatbot', sqlChatbot({
  databaseUrl: process.env.DATABASE_URL,
  provider:    'openai',
  llmApiKey:   process.env.OPENAI_API_KEY,
  codePaths:   ['./src'],
  secret:      process.env.CHATBOT_SECRET,
}));

app.listen(3000);

<script src="/chatbot/widget.js"></script>

C. Node API + JS SPA on a different origin

Mount the middleware on the API as in (B). CORS is enabled by default. Load the widget from the API origin so cookies stay first-party:

<script src="https://api.example.com/chatbot/widget.js"></script>

For a fully token-based flow, set secret and have the SPA call the API with Authorization: Bearer <secret> instead of relying on cookies.

D. Pure SPA / static site (no Node backend)

Run the standalone CLI on a host that can reach your Postgres. Aim codePaths at the SPA repo so the indexer picks up route names and inline enums:

sql-chatbot-agent --db postgresql://... --code /path/to/spa/src --port 3456

<script src="https://chatbot.example.com/chatbot/widget.js"></script>

E. Microservices / multi-DB

Run one CLI (or one mounted middleware) per database. Each gets its own port, its own chatbot.config.json, its own widget script tag. Instances do not share state — each introspects its own DB independently.

Domain Hints

When schema + code indexing isn't enough to disambiguate user wording or app conventions, three knobs exist.

`defaultFilters` — row-level conventions

Injected into every grammar-generated SELECT. Keys are "table.column" (specific) or "*.column" (any entity with that column). Values are SQL fragments placed after the qualified column reference.

{
  "defaultFilters": {
    "*.status": "!= 3",
    "users.deleted_at": "IS NULL",
    "invoices.cancelled_at": "IS NULL"
  }
}

Example: "*.status": "!= 3" becomes WHERE "<table>"."status" != 3 on every SELECT against a table with a status column — unless the column already appears in the generated SQL.

`aliases` — user-word → entity-name overrides

When auto-detection misses how your schema names a concept:

{
  "aliases": {
    "customer":  "account_user",
    "customers": "account_user",
    "agent":     "user",
    "agents":    "user"
  }
}

Always overrides auto-detected aliases on conflict — the developer's mapping wins.

These two options exist on both the chatbot.config.json and the middleware sqlChatbot({ defaultFilters, aliases }) config object.

How It Works

When a user asks a question, the chatbot:

Classifies the question (data query, code question, navigation, etc.)
Routes to the right handler:
- Data questions -- generates SQL, executes it read-only, explains the results
- Code questions -- searches your indexed codebase, explains the relevant code
- Navigation -- uses detected routes and page context to guide the user
Streams the answer back via Server-Sent Events (SSE)

User: "How many active users signed up this month?"
  -> Classifies as "data" question
  -> Generates: SELECT COUNT(*) FROM users WHERE active = true AND created_at >= '2026-02-01'
  -> Executes read-only
  -> Streams: "There are 142 active users who signed up this month."

API Endpoints

When mounted at /chatbot (or running standalone), the following endpoints are available:

| Endpoint | Method | Description | |----------|--------|-------------| | /chatbot/widget.js | GET | Serves the chat widget bundle | | /chatbot/api/ask | POST | Main chat endpoint (SSE streaming) | | /chatbot/api/health | GET | Health check with table/file counts | | /chatbot/api/refresh | POST | Re-discover schema and re-index code |

POST /api/ask

Request body:

{
  "question": "How many users are there?",
  "pageContext": "{\"url\": \"...\", \"title\": \"...\"}",
  "history": [
    { "role": "user", "content": "previous question" },
    { "role": "assistant", "content": "previous answer" }
  ]
}

Response: SSE stream with events:

data: {"type":"classifying"}
data: {"type":"classified","questionType":"data","confidence":0.9}
data: {"type":"sql","sql":"SELECT COUNT(*) FROM users"}
data: {"type":"executing"}
data: {"type":"token","content":"There"}
data: {"type":"token","content":" are"}
data: {"type":"token","content":" 1,234"}
data: {"type":"token","content":" users."}
data: {"type":"done"}

GET /api/health

{
  "status": "ok",
  "tables": 25,
  "codeFiles": 150
}

Features

CLI standalone mode -- run as a standalone server with zero boilerplate
Built-in authentication -- secret token authentication with cookie and Bearer token support
Auto-discovers your database schema -- tables, columns, types, foreign keys, indexes
Indexes your codebase -- scans JS/TS/Ruby/Python files, detects Express/React/Next.js/Rails routes
SQL safety -- validates queries against blocklists, executes in READ ONLY transactions, blocks destructive operations
Sensitive data filtering -- automatically hides columns matching patterns like password, secret, api_key, ssn
Chat widget with Shadow DOM -- no CSS conflicts with your app
Conversation history -- the widget sends message history for contextual follow-ups
Multi-provider -- Ollama (local, free), Groq (cloud, free tier), OpenAI, or any OpenAI-compatible API
Lazy initialization -- schema discovery and code indexing happen on first request, not at startup

Supported Frameworks

The code indexer detects routes and patterns from 17+ frameworks:

JavaScript/TypeScript: Express.js, Fastify, Hono, Koa, React Router, Next.js (pages + app router), NestJS, SvelteKit, Nuxt Ruby: Rails, Sinatra Python: Django, FastAPI, Flask Java/Kotlin: Spring Boot Go: Gin, Echo, Fiber C#: ASP.NET (minimal APIs + attribute routing) Elixir: Phoenix

Security

All SQL queries run inside SET TRANSACTION READ ONLY -- no writes possible
Dangerous SQL keywords (DROP, DELETE, TRUNCATE, ALTER, etc.) are blocked before execution
System catalogs (pg_catalog, information_schema) are blocked from queries
Sensitive columns are automatically filtered from the schema summary sent to the LLM
Built-in secret token authentication for production deployments
chatbot.config.json is automatically added to .gitignore by the init command

Requirements

Node.js >= 18
PostgreSQL database
An LLM provider: OpenAI, Groq, Ollama (local), or any OpenAI-compatible API

License

MIT

Published

Vulnerabilities

Links

Maintainers

Keywords

Readme

sql-chatbot-agent

Quick Start (CLI)

Install globally

Initialize a config file

One-liner with npx

Configuration

chatbot.config.json

Code Indexing

Framework-Specific Paths

Supported File Types

Limits

CLI Flags

Environment Variables

Priority Order

Providers

Auto-detection

OpenAI (Default)

Groq

Ollama (Local, Offline)

Any OpenAI-Compatible API

Authentication

Setup

How It Works

Without a Secret

.gitignore

Setup by Architecture

A. Standalone CLI against any framework (Django, Laravel, Flask, Spring, Go, Phoenix, Rails, Express, …)

B. Express monolith (Express serves your UI)

C. Node API + JS SPA on a different origin

D. Pure SPA / static site (no Node backend)

E. Microservices / multi-DB

Domain Hints

defaultFilters — row-level conventions

aliases — user-word → entity-name overrides

How It Works

API Endpoints

POST /api/ask

GET /api/health

Features

Supported Frameworks

Security

Requirements

License

`defaultFilters` — row-level conventions

`aliases` — user-word → entity-name overrides