@kiyeonjeon21/datacontext
v0.6.1
AI-native database context layer - Make AI understand your data better
DataContext
AI-native database context layer — Make AI understand your data better.
Quick Start • VS Code Extension • Documentation • API Reference
The Problem
Every time you use AI to write SQL, you explain the same things:
- "The users table has a status column where 1 means active..."
- "We have a soft-delete pattern, so add deleted_at IS NULL..."
- "The orders.total column is in cents, not dollars..."
DataContext solves this. Define your business context once, and AI understands it everywhere.
How It Works
┌─────────────────────────────────────────────────────────────────┐
│ AI Assistant (Claude, Cursor, ChatGPT) │
└─────────────────────────┬───────────────────────────────────────┘
│ MCP Protocol
┌─────────────────────────▼───────────────────────────────────────┐
│ DataContext │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Knowledge │ │ Safety │ │ Learning │ │
│ │ Layer │ │ Validator │ │ Loop │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└─────────────────────────┬───────────────────────────────────────┘
│
┌─────────────────────────▼───────────────────────────────────────┐
│ Your Database (PostgreSQL, MySQL, SQLite, MongoDB) │
└─────────────────────────────────────────────────────────────────┘
Features
| Feature | Description |
|---------|-------------|
| 🔌 MCP Protocol | Works with Claude Desktop, Cursor IDE, and any MCP client |
| 🗄️ Multi-Database | PostgreSQL, MySQL/MariaDB, SQLite, MongoDB |
| 🔒 Safety First | Read-only by default, query validation, timeout protection |
| 📚 Knowledge Layer | Store descriptions, business rules, query examples |
| 🤖 AI Glossary | Auto-generate business terms from natural language (requires Anthropic API key) |
| 🌾 Auto-Harvesting | Collect DB comments, indexes, relationships automatically |
| 💰 Cost Estimation | EXPLAIN-first analysis to prevent expensive queries |
| 📊 Learning Loop | Capture feedback to improve over time |
| 🌐 REST API | HTTP interface for web apps |
| 🧩 VS Code Extension | IDE integration with autocomplete |
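To make the "read-only by default" idea concrete, here is a minimal sketch of a read-only check. It is an illustration only and not DataContext's actual validator: the helper names `stripComments` and `isReadOnlyQuery` and the allow-list of leading keywords are assumptions made for this example.

```typescript
// Hypothetical helper names; this is NOT DataContext's real validator.
// Statements are allowed only if they start with one of these keywords.
const READ_ONLY_PREFIXES = ["select", "with", "explain", "show"];

// Naively remove "-- line" and "/* block */" comments.
// (A "--" inside a string literal would also be stripped; a real
// validator would use a proper SQL parser.)
function stripComments(sql: string): string {
  return sql.replace(/--.*$/gm, "").replace(/\/\*[\s\S]*?\*\//g, "");
}

function isReadOnlyQuery(sql: string): boolean {
  // Drop comments, surrounding whitespace, and one trailing semicolon.
  const cleaned = stripComments(sql).trim().replace(/;\s*$/, "");
  if (cleaned.length === 0) return false;
  // Any remaining semicolon is treated as a multi-statement query and rejected.
  if (cleaned.indexOf(";") !== -1) return false;
  const firstWord = (cleaned.split(/\s+/)[0] ?? "").toLowerCase();
  return READ_ONLY_PREFIXES.indexOf(firstWord) !== -1;
}

console.log(isReadOnlyQuery("SELECT * FROM users"));        // allowed
console.log(isReadOnlyQuery("DELETE FROM users"));          // rejected
console.log(isReadOnlyQuery("SELECT 1; DROP TABLE users")); // rejected
```

A keyword check like this is deliberately conservative but incomplete: in PostgreSQL, a WITH clause can contain data-modifying statements and EXPLAIN ANALYZE executes the query it explains, so a production validator needs real parsing or database-level read-only enforcement.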
Quick Start
1. Install
npm install -g @kiyeonjeon21/datacontext
# or use with npx
npx @kiyeonjeon21/datacontext --help
2. Connect & Start Server
# Start REST API server (recommended for testing)
npx @kiyeonjeon21/datacontext serve postgres://user:pass@localhost:5432/mydb --port 3000
# Or start MCP server directly
npx @kiyeonjeon21/datacontext connect postgres://user:pass@localhost:5432/mydb
3. AI Features (Optional)
To use AI-powered glossary generation, set your Anthropic API key:
# Option 1: Environment variable (recommended)
export ANTHROPIC_API_KEY=sk-ant-api03-...
npx @kiyeonjeon21/datacontext serve postgres://user:pass@localhost:5432/mydb
# Option 2: One-line command
ANTHROPIC_API_KEY=sk-ant-api03-... npx @kiyeonjeon21/datacontext serve postgres://...
Get your API key: Anthropic Console
Note: AI features are optional. Basic functionality (queries, descriptions, harvesting) works without an API key.
4. Configure AI Client
Claude Desktop (~/Library/Application Support/Claude/claude_desktop_config.json):
{
  "mcpServers": {
    "datacontext": {
      "command": "npx",
      "args": ["@kiyeonjeon21/datacontext", "connect", "postgres://user:pass@localhost:5432/mydb"]
    }
  }
}
Cursor IDE (Settings → MCP):
{
  "mcpServers": {
    "datacontext": {
      "command": "npx",
      "args": ["@kiyeonjeon21/datacontext", "connect", "postgres://user:pass@localhost:5432/mydb"]
    }
  }
}
VS Code Extension
Install the DataContext VS Code extension for:
- 📊 Database Explorer — Browse tables and columns in the sidebar
- 💡 SQL Autocomplete — Intelligent completions based on your schema
- 📝 Inline Descriptions — Hover over tables to see business context
- 🌾 Harvest Command — Auto-collect metadata from your database
Install: Search "DataContext" in VS Code Extensions or:
code --install-extension kiyeonjeon21.vscode-datacontext
Usage:
- Start DataContext server: npx @kiyeonjeon21/datacontext serve ...
- In VS Code: Cmd+Shift+P → "DataContext: Connect to Database"
- Enter API URL: http://localhost:3000
MCP Tools
When connected via MCP, AI assistants can use these tools:
| Tool | Description |
|------|-------------|
| query | Execute SQL with safety validation |
| list_tables | List all tables in schema |
| describe_table | Get table structure with context |
| get_context | Get business context for tables |
| add_description | Add table/column descriptions |
| generate_glossary | AI: Generate business terms from natural language |
| add_term | Add a business term to glossary |
| list_terms | List all business terms |
| search_terms | Search for terms matching a query |
| enhance_query | AI: Enhance query with glossary terms |
| harvest_metadata | Auto-collect DB metadata |
| estimate_cost | Analyze query cost before running |
| record_feedback | Record query corrections |
| get_metrics | View success rates and stats |
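For readers unfamiliar with how an MCP client invokes these tools, the sketch below builds the JSON-RPC 2.0 `tools/call` request that the MCP protocol uses. The tool names come from the table above; the helper `buildToolCall` and the argument name `table` are assumptions for illustration, since the README does not document each tool's input schema.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {}
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Tool names are from the table above. The "table" argument name is an
// assumption; check the tool's advertised input schema for the real one.
const listTables = buildToolCall(1, "list_tables");
const describeUsers = buildToolCall(2, "describe_table", { table: "users" });

console.log(JSON.stringify(listTables));
console.log(JSON.stringify(describeUsers, null, 2));
```

In practice an MCP client such as Claude Desktop or Cursor builds and sends these requests for you; the sketch only shows what travels over the wire.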
Example: Before & After
Before DataContext
User: "Get active users who spent over $100"
AI: "SELECT * FROM users WHERE ??? JOIN orders ON ???"
User: "status=1 means active, and total is in cents..."
After DataContext
User: "Get active users who spent over $100"
AI: [Uses DataContext to understand schema]
SELECT u.* FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.status = 1 -- Active users
AND u.deleted_at IS NULL -- Not soft-deleted
GROUP BY u.id
HAVING SUM(o.total) > 10000 -- $100 in cents
AI Glossary Example
Define business terms once, use everywhere:
# Generate glossary from natural language
curl -X POST http://localhost:3000/api/terms/generate \
-H "Content-Type: application/json" \
-d '{
"terms": "active users = users whose status is 1\nrecent orders = orders placed within the last 30 days"
}'
# Now AI understands your terms automatically
User: "Find active users who placed a recent order"
AI: SELECT u.* FROM users u
JOIN orders o ON o.user_id = u.id
WHERE u.status = 1 -- active users (from glossary)
AND o.created_at > NOW() - INTERVAL '30 days' -- recent orders (from glossary)
CLI Commands
Server Commands
# Start MCP server (for AI clients like Claude/Cursor)
npx @kiyeonjeon21/datacontext connect <connection-string> [options]
# Start REST API server (for web apps)
npx @kiyeonjeon21/datacontext serve <connection-string> [options]
Schema Commands
# View database schema
npx @kiyeonjeon21/datacontext schema <connection-string>
# Add table/column description
npx @kiyeonjeon21/datacontext describe <table> <description> --connection <string>
Glossary Commands
# List all business terms
npx @kiyeonjeon21/datacontext glossary list
# Add a term manually
npx @kiyeonjeon21/datacontext glossary add "active users" "users whose status is 1" \
  --sql "status = 1" \
  --synonyms "active user,enabled users" \
  --tables "users" \
  --category "status"
# Search for terms
npx @kiyeonjeon21/datacontext glossary search "active"
# Generate terms with AI (requires ANTHROPIC_API_KEY)
npx @kiyeonjeon21/datacontext glossary generate "recent orders = within the last 30 days, VIP customers = 10 or more orders"
# Export to YAML/JSON
npx @kiyeonjeon21/datacontext glossary export --output glossary.yaml
# Import from YAML/JSON
npx @kiyeonjeon21/datacontext glossary import glossary.yaml
# Delete a term
npx @kiyeonjeon21/datacontext glossary delete "active users"
Other Commands
# Initialize configuration
npx @kiyeonjeon21/datacontext init
Options:
- --schema <name>: Default schema (default: "public")
- --read-only / --no-read-only: Read-only mode (default: true)
- --timeout <ms>: Query timeout (default: 30000)
- --max-rows <n>: Max rows to return (default: 1000)
- --port <n>: REST API port (default: 3000)
AI Features: Set ANTHROPIC_API_KEY environment variable to enable AI-powered glossary generation.
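The option defaults listed above can be modeled as a small typed object. This is a sketch for illustration, not the CLI's actual parsing code: the `ServerOptions` type, the `resolveOptions` helper, and the property names `schema` and `port` are assumptions, while the default values are taken from the option list.

```typescript
// Hypothetical model of the CLI options; names are illustrative.
interface ServerOptions {
  schema: string;    // --schema <name>
  readOnly: boolean; // --read-only / --no-read-only
  timeoutMs: number; // --timeout <ms>
  maxRows: number;   // --max-rows <n>
  port: number;      // --port <n>
}

// Default values as documented in the option list.
const DEFAULT_OPTIONS: ServerOptions = {
  schema: "public",
  readOnly: true,
  timeoutMs: 30000,
  maxRows: 1000,
  port: 3000,
};

// Merge user-supplied overrides on top of the defaults.
function resolveOptions(overrides: Partial<ServerOptions> = {}): ServerOptions {
  return { ...DEFAULT_OPTIONS, ...overrides };
}

// Roughly what "--max-rows 50 --port 8080" would resolve to.
console.log(resolveOptions({ maxRows: 50, port: 8080 }));
```

Keeping the defaults in one place makes it easy to see that a server started without flags is read-only, capped at 1000 rows, and limited to 30-second queries.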
SDK Usage
Use DataContext directly in your application:
import {
  createDataContextService,
  createPostgresAdapter,
  createKnowledgeStore,
} from '@kiyeonjeon21/datacontext';
// Create service
const service = createDataContextService({
  adapter: createPostgresAdapter('postgres://...'),
  knowledge: createKnowledgeStore('mydb'),
  safety: {
    readOnly: true,
    timeoutMs: 30000,
    maxRows: 1000,
  },
});
// Initialize
await service.initialize();
// Query with context
const result = await service.query('SELECT * FROM users WHERE status = 1');
console.log(result.data);
console.log('Context used:', result.contextUsed);
// Add knowledge
await service.setTableDescription('users', 'Core user accounts table');
await service.addBusinessRule(
  'active_users',
  'Active users have status=1 and deleted_at IS NULL',
  ['users']
);
// Cleanup
await service.shutdown();
Documentation
Supported Databases
| Database | Status | Adapter |
|----------|--------|---------|
| PostgreSQL | ✅ Full | postgres:// |
| MySQL/MariaDB | ✅ Full | mysql:// |
| SQLite | ✅ Full | ./file.sqlite |
| MongoDB | ✅ Full | mongodb:// |
| BigQuery | 🔜 Planned | — |
| Snowflake | 🔜 Planned | — |
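The Adapter column suggests that the database type follows from the shape of the connection string. As a rough sketch of that mapping, assuming only the prefixes shown in the table (the `detectDatabase` helper and the `DatabaseKind` type are hypothetical and not part of the package's API):

```typescript
type DatabaseKind = "postgresql" | "mysql" | "sqlite" | "mongodb";

// Hypothetical helper: map a connection string to a database kind using
// only the patterns listed in the Adapter column of the table.
function detectDatabase(connection: string): DatabaseKind | undefined {
  if (/^postgres:\/\//.test(connection)) return "postgresql";
  if (/^mysql:\/\//.test(connection)) return "mysql";
  if (/^mongodb:\/\//.test(connection)) return "mongodb";
  // SQLite is addressed by file path rather than by URL scheme.
  if (/\.sqlite$/.test(connection)) return "sqlite";
  return undefined; // e.g. BigQuery and Snowflake are only planned
}

console.log(detectDatabase("postgres://user:pass@localhost:5432/mydb"));
console.log(detectDatabase("./file.sqlite"));
console.log(detectDatabase("bigquery://project/dataset"));
```

Whether DataContext accepts other spellings (for example a different file extension for SQLite) is not stated in this README, so the sketch sticks to the documented forms.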
MongoDB Support
MongoDB is schema-less, but DataContext infers schema from sample documents:
# Connect to MongoDB
datacontext serve mongodb://user:pass@localhost:27017/mydb
# Or with an explicit authentication database (authSource)
datacontext serve "mongodb://user:pass@localhost:27017/mydb?authSource=admin"
DataContext will:
- List all collections
- Infer field types from sample documents
- Detect nested objects and arrays
- Support native MongoDB queries (find, aggregate, etc.)
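To illustrate what inferring a schema from sample documents can look like, here is a minimal sketch that walks sample documents and records the types seen at each field path, including nested objects and arrays. It is an assumption-laden illustration, not DataContext's actual inference algorithm: the function names, the dotted path notation, and the `[]` suffix for array elements are all choices made for this example.

```typescript
// Field path -> list of type names observed at that path.
type FieldTypes = Record<string, string[]>;

// Classify a value more finely than `typeof` does.
function typeOf(value: unknown): string {
  if (value === null) return "null";
  if (Array.isArray(value)) return "array";
  if (value instanceof Date) return "date";
  return typeof value; // "string", "number", "boolean", "object", ...
}

// Hypothetical inference: visit every sample and collect types per path.
function inferSchema(samples: Array<Record<string, unknown>>): FieldTypes {
  const schema: FieldTypes = {};
  const visit = (value: unknown, path: string): void => {
    const kind = typeOf(value);
    if (path !== "") {
      let seen = schema[path];
      if (seen === undefined) {
        seen = [];
        schema[path] = seen;
      }
      if (seen.indexOf(kind) === -1) seen.push(kind);
    }
    if (kind === "object") {
      // Recurse into nested objects using dotted paths.
      const obj = value as Record<string, unknown>;
      for (const key of Object.keys(obj)) {
        visit(obj[key], path === "" ? key : `${path}.${key}`);
      }
    } else if (kind === "array") {
      // Array elements share one path with a "[]" suffix.
      for (const item of value as unknown[]) visit(item, `${path}[]`);
    }
  };
  for (const doc of samples) visit(doc, "");
  return schema;
}

const inferred = inferSchema([
  { name: "Ada", tags: ["admin"], address: { city: "London" } },
  { name: "Bob", age: 30, address: null },
]);
console.log(inferred);
```

Because documents in one collection can disagree, each path maps to a list of types rather than a single type; a field that appears in only some samples (like `age` here) can be treated as optional.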
License
MIT © DataContext
