mongoose-schema-extractor
v1.5.0
Extract and format Mongoose schemas for docs, TypeScript, GraphQL, and LLM prompts.
Mongoose Schema Extractor
Extract Mongoose schemas and feed them directly to AI models. Built to solve the real problem of dynamically integrating database schemas with LLMs for natural language database interactions.
Why This Exists
The Problem: You're building a MongoDB copilot, database chat bot, or any AI agent that needs to understand your database structure. You can't hardcode schemas manually—they change, you have multiple models, and you need this to work programmatically across different projects.
The Solution: This library extracts your existing Mongoose schemas and formats them perfectly for AI consumption. No manual schema documentation. No keeping things in sync. Just dynamic extraction that works with your actual models.
Quick Start
npm install mongoose-schema-extractor
Peer dependency: mongoose >= 6.0.0
const { extractSchemas } = require('mongoose-schema-extractor');
const mongoose = require('mongoose');
// Load your existing models (you already have these)
require('./models/User');
require('./models/Post');
// Extract schemas in AI-optimized format
const schemaContext = extractSchemas(mongoose, { format: 'llm-compact' });
// Or switch to TOON for max token efficiency
const toonContext = extractSchemas(mongoose, { format: 'toon' });
// Now feed this to ChatGPT, Claude, or any LLM
const prompt = `
Database Schema:
${schemaContext}
Query: "Find all active users who posted this week"
Generate MongoDB query:
`;
Main Use Cases
1. AI Database Copilot
Build ChatGPT-like interfaces for your MongoDB database:
class DatabaseAI {
  constructor() {
    // Extract the schema context once, when the helper is created
    this.schemaContext = extractSchemas(mongoose, { format: 'llm-compact' });
  }

  async naturalLanguageQuery(userQuestion) {
    const prompt = `Schema: ${this.schemaContext}\nQuestion: "${userQuestion}"\nMongoDB Query:`;
    // callOpenAI is not defined here: implement it as a wrapper around your LLM client
    return await this.callOpenAI(prompt);
  }
}
// Usage: "Show me users who signed up last month"
// AI generates: db.users.find({ createdAt: { $gte: new Date('2024-08-01') } })
2. CLI Tool for AI Chat Sessions
Sometimes you just want to ask ChatGPT about your database:
npx mongoose-extract init
# Edit config to point to your models
npx mongoose-extract
# Generates schema.llm-compact.txt (and schema.toon if enabled) - copy/paste into ChatGPT
Add toon to the config's formats list to emit Token-Oriented Object Notation alongside the other formats:
// mongoose-extract.config.js
module.exports = {
  bootstrap: async () => {
    const mongoose = require('mongoose');
    require('./models'); // load everything
    return mongoose;
  },
  output: {
    path: './schema',
    formats: ['llm-compact', 'toon']
  }
};
How It Works
The library reads your actual Mongoose models (not your code files) and extracts:
- Field types and constraints
- Relationships between models
- Validation rules
- Required fields
- Default values
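As a rough illustration of how this kind of information can be read from a model, the sketch below walks a schema with `eachPath` and picks up each path's `instance`, `isRequired`, and `options`, which Mongoose schemas and schema types expose. It is not this library's actual implementation: `describeSchema` is a hypothetical helper, and `fakePostSchema` is a stand-in object so the example runs without Mongoose or a database.

```javascript
// Hypothetical helper: collects per-field metadata from a Mongoose-style schema.
// It relies only on schema.eachPath() and the schema type properties
// `instance`, `isRequired`, and `options`.
function describeSchema(schema) {
  const fields = [];
  schema.eachPath((name, schemaType) => {
    const options = schemaType.options || {};
    const field = {
      name,
      type: schemaType.instance,
      required: Boolean(schemaType.isRequired),
    };
    // Relationship to another model, if declared
    if (options.ref !== undefined) field.ref = options.ref;
    // Keep explicit defaults, including `null`
    if (Object.prototype.hasOwnProperty.call(options, 'default')) {
      field.default = options.default;
    }
    fields.push(field);
  });
  return fields;
}

// Stand-in for a real schema (assumption: mimics the shape Mongoose exposes).
const fakePostSchema = {
  paths: {
    title: { instance: 'String', isRequired: true, options: { maxlength: 200 } },
    author: { instance: 'ObjectId', isRequired: true, options: { ref: 'User' } },
    publishedAt: { instance: 'Date', options: { default: null } },
  },
  eachPath(fn) {
    Object.entries(this.paths).forEach(([name, type]) => fn(name, type));
  },
};

const postFields = describeSchema(fakePostSchema);
console.log(postFields);
```

With a real model, the same helper would be called as `describeSchema(PostModel.schema)`.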
Then formats everything in a compact, AI-friendly format:
**User**
- username (String, required, unique, 3-30 chars)
- email (String, required, unique, lowercase)
- posts (Array of ObjectId, ref: Post)
- createdAt (Date, auto-generated)
**Post**
- title (String, required, max 200 chars)
- content (String, required)
- author (ObjectId, ref: User, required)
- publishedAt (Date, default: null)
API Reference
extractSchemas(input, options)
Parameters:
- input - Your mongoose instance, a single model, or an array of models
- options.format - Output format:
  - 'llm-compact' - AI-optimized (primary use case)
  - 'json' - Raw JSON data
  - 'typescript' - Generate TypeScript interfaces
  - 'graphql' - Generate GraphQL schema
  - 'toon' - Token-Oriented Object Notation (TOON) for ultra token-efficient prompts
Examples:
// All registered models
extractSchemas(mongoose)
// Single model
extractSchemas(UserModel)
// Specific models
extractSchemas([UserModel, PostModel])
// With options
extractSchemas(mongoose, {
  format: 'llm-compact',
  include: ['validators', 'defaults'],
  exclude: ['timestamps']
})
TypeScript Support
Works with TypeScript projects once the dev dependencies listed below are installed. The tool detects a TypeScript project by the presence of tsconfig.json and registers the necessary loaders automatically.
Requirements for TypeScript:
# Install these dependencies in your project:
npm install --save-dev ts-node tsconfig-paths
# or
yarn add --dev ts-node tsconfig-paths
Usage:
// mongoose-extract.config.js
module.exports = {
  bootstrap: async () => {
    const mongoose = require('mongoose');
    // TypeScript models are compiled on the fly via ts-node
    require('./src/models/user.model.ts');
    require('./src/models/post.model.ts');
    // Path aliases from tsconfig.json (e.g. @/models/*) used inside these
    // files are resolved automatically via tsconfig-paths
    return mongoose;
  }
};
What's auto-detected:
- ✅ TypeScript compilation via ts-node
- ✅ Path aliases from tsconfig.json via tsconfig-paths
- ✅ Automatic setup when tsconfig.json exists
- ✅ Helpful error messages if dependencies are missing
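The detection described above could look roughly like the following. This is a sketch under assumptions, not the package's actual code: `detectTypeScriptSetup` is a hypothetical helper that checks for tsconfig.json with Node's `fs` module and uses `require.resolve` to see whether ts-node and tsconfig-paths can be found from the project root.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper: reports whether a directory looks like a TypeScript
// project and whether the loaders it would need can be resolved from it.
function detectTypeScriptSetup(projectRoot) {
  const hasTsconfig = fs.existsSync(path.join(projectRoot, 'tsconfig.json'));

  // require.resolve throws when the module cannot be found
  const canResolve = (name) => {
    try {
      require.resolve(name, { paths: [projectRoot] });
      return true;
    } catch (err) {
      return false;
    }
  };

  return {
    hasTsconfig,
    hasTsNode: canResolve('ts-node'),
    hasTsconfigPaths: canResolve('tsconfig-paths'),
  };
}

const setup = detectTypeScriptSetup(process.cwd());
if (setup.hasTsconfig && !setup.hasTsNode) {
  console.error('tsconfig.json found but ts-node is missing: npm install --save-dev ts-node');
}
```

Running a check like this in your own project before invoking the CLI is a quick way to confirm both dev dependencies are installed where the tool will look for them.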
Real-World Integration
With OpenAI API
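// Assumes the official openai package, v4 or later
const OpenAI = require('openai');
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment by default

const schemaContext = extractSchemas(mongoose, { format: 'llm-compact' });
const response = await openai.chat.completions.create({
  messages: [{
    role: 'system',
    content: `Database schema: ${schemaContext}`
  }, {
    role: 'user',
    content: 'Find all orders from last week'
  }],
  model: 'gpt-4'
});
With LangChain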
const context = extractSchemas(mongoose, { format: 'llm-compact' });
const chain = new ConversationChain({
  llm: new ChatOpenAI(),
  memory: new BufferMemory(),
  prompt: PromptTemplate.fromTemplate(`
    Schema: ${context}
    Human: {input}
    AI: I'll help you query this database.
  `)
});
Output Formats
While the primary use case is LLM integration, we also support:
- JSON: Clean data for processing
- TypeScript: Generate interfaces for your frontend
- GraphQL: Schema definitions
- TOON: Token-Oriented Object Notation for compressed, LLM-ready context
// Generate TypeScript types for your frontend
const fs = require('fs');
const types = extractSchemas(mongoose, { format: 'typescript' });
fs.writeFileSync('types/database.d.ts', types);
TOON format
TOON (Token-Oriented Object Notation) is a compact serialization designed for LLM inputs. It preserves full JSON fidelity while trimming repeated keys, so you can ship much larger schema contexts into models without blowing through context tokens. When you run the CLI or extractSchemas with { format: 'toon' }, the package uses @toon-format/toon under the hood and emits a .toon file (or string) that you can drop straight into prompts. TOON shines for uniform collections of models/fields; stick with JSON or llm-compact if you prefer a more free-form text layout.
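To see why trimming repeated keys helps, the sketch below compares `JSON.stringify` with a hand-rolled tabular layout that declares the keys once and then lists one row per record. It only illustrates the idea: `toTabular` is a hypothetical helper, not the TOON grammar and not the @toon-format/toon API, and character counts are only a rough proxy for token counts.

```javascript
// Illustration only: declares the keys once in a header, then one row per record.
// This is NOT the real TOON encoder and does not escape commas or newlines.
function toTabular(name, rows) {
  if (rows.length === 0) return `${name}[0]:`;
  const keys = Object.keys(rows[0]);
  const header = `${name}[${rows.length}]{${keys.join(',')}}:`;
  const lines = rows.map((row) => '  ' + keys.map((k) => String(row[k])).join(','));
  return [header, ...lines].join('\n');
}

// Uniform records, similar in spirit to a list of schema fields
const fields = [
  { name: 'username', type: 'String', required: true },
  { name: 'email', type: 'String', required: true },
  { name: 'createdAt', type: 'Date', required: false },
];

const asJson = JSON.stringify(fields);
const asTabular = toTabular('fields', fields);

console.log(asTabular);
// fields[3]{name,type,required}:
//   username,String,true
//   email,String,true
//   createdAt,Date,false
console.log(`JSON: ${asJson.length} chars, tabular: ${asTabular.length} chars`);
```

The saving grows with the number of records, since JSON repeats every key per record while the tabular layout pays for the keys once. That is also why the benefit is largest for uniform collections.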
Troubleshooting
TypeScript Issues
Error: "ts-node not found"
# Install ts-node in your project
npm install --save-dev ts-node
Error: "Cannot resolve module" or path alias issues
# Install tsconfig-paths for path alias support
npm install --save-dev tsconfig-paths
Error: "Bootstrap function failed"
- Check that all your model files exist and have valid syntax
- Ensure your tsconfig.json is properly configured
- Verify your model files export the Mongoose models correctly
General Issues
Error: "No models found"
- Make sure your bootstrap function actually loads/requires your model files
- Verify the models are registered with Mongoose before returning the mongoose instance
Generated schema looks incomplete
- Check that all your models are loaded in the bootstrap function
- Ensure models are properly exported from their files
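One way to catch both problems early is to check which models are registered just before the bootstrap function returns. The sketch below assumes only that the object exposes `modelNames()`, as a Mongoose instance does. `assertModelsLoaded` is a hypothetical helper, not part of this package, and `fakeMongoose` is a stand-in so the example runs without Mongoose installed.

```javascript
// Hypothetical guard to call at the end of a bootstrap function.
// Only assumes the object exposes modelNames(), as a Mongoose instance does.
function assertModelsLoaded(mongooseInstance) {
  const names = mongooseInstance.modelNames();
  if (names.length === 0) {
    throw new Error('No models registered: require your model files before returning mongoose.');
  }
  return names;
}

// Stand-in object so the sketch runs without Mongoose installed.
const fakeMongoose = { modelNames: () => ['User', 'Post'] };
console.log(assertModelsLoaded(fakeMongoose)); // [ 'User', 'Post' ]
```

Logging the returned names also makes an incomplete schema easy to diagnose: any model missing from the list was never required.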
Contributing
Found a bug? Want to add support for a new output format? PRs welcome.
Focus areas:
- Better AI prompt optimization
- New output formats
- Performance improvements
License
MIT
