@waymakerai/aicofounder-context-optimizer
v1.0.0
Published
Extended context optimization for CoFounder - Efficiently handle 400K+ token contexts
Maintainers
Readme
@cofounder/context-optimizer
Extended context optimization for CoFounder - Efficiently handle 400K+ token contexts
Capitalize on GPT-5.2 (400K tokens), Gemini 3, and Claude 4.5's extended context windows without breaking the bank!
The Problem
New LLMs support massive context windows (400K+ tokens):
- GPT-5.2: 400K context, 128K output
- Gemini 3: Massive context support
- Claude 4.5: Long-running agent support
But developers face challenges:
- 💸 Cost explosion - Naive approaches = $$$$
- 🤔 Unclear strategy - When to use full context vs RAG?
- ⚡ Performance - Large contexts slow everything down
- 📊 Quality tradeoffs - How to optimize without losing quality?
The Solution
Smart hybrid optimization that:
- ✅ Saves 70% on costs (maintained at 400K scale)
- ✅ Auto-prioritizes critical files
- ✅ Mixes full context + summarization + RAG
- ✅ Repository-aware chunking
- ✅ Maintains quality
Installation
npm install @cofounder/context-optimizerQuick Start
import { createContextOptimizer } from '@cofounder/context-optimizer';
const optimizer = createContextOptimizer({
strategy: 'hybrid', // Smart mix of strategies
maxTokens: 400000, // GPT-5.2 limit
costTarget: 'balanced', // Balance cost vs quality
});
// Optimize a large codebase
const result = await optimizer.optimize({
query: 'Find all authentication flows',
codebase: files, // Array of FileMetadata
preserveCritical: true, // Keep critical files in full
summarizeOld: true, // Summarize less relevant
});
// Use optimized context
console.log(`Tokens used: ${result.tokensUsed} / 400K`);
console.log(`Cost saved: ${result.costSaved}%`);
console.log(`Quality score: ${result.qualityScore}`);
// Pass to LLM
const response = await cofounder.chat({
messages: result.messages,
});Strategies
1. hybrid (Recommended)
Smart mix of full context + summarization + RAG
const optimizer = createContextOptimizer({ strategy: 'hybrid' });How it works:
- Phase 1: Critical files → Full content (60% budget)
- Phase 2: Important files → Full or summarized (30% budget)
- Phase 3: Supplementary → Metadata only (10% budget)
Best for:
- Large codebases
- Mixed importance files
- Cost-conscious projects
2. full
Use entire context up to limit
const optimizer = createContextOptimizer({ strategy: 'full' });Best for:
- Small codebases (< 100K tokens)
- Maximum quality needed
- Budget not a concern
3. rag
RAG-only retrieval (top-k most relevant)
const optimizer = createContextOptimizer({ strategy: 'rag' });Best for:
- Very large codebases
- Specific queries
- Minimal cost
4. summarize
Summarize everything proportionally
const optimizer = createContextOptimizer({ strategy: 'summarize' });Best for:
- Maximum compression
- Overview tasks
- Extreme cost savings
Cost Targets
const optimizer = createContextOptimizer({
costTarget: 'cost', // Minimize cost (may reduce quality)
// or 'balanced' (default) - Balance cost and quality
// or 'quality' - Maximize quality (higher cost)
});File Prioritization
Automatic Prioritization
The optimizer automatically assigns priorities:
- Critical: Entry points (
index.*,main.*), core files - Important: Files matching query, frequently used
- Supplementary: Tests, configs, docs
- Exclude: Build artifacts, dependencies
Manual Override
const files: FileMetadata[] = [
{
path: 'src/auth/auth.ts',
content: '...',
tokens: 1000,
priority: 'critical', // Force critical
},
{
path: 'src/utils/helpers.ts',
content: '...',
tokens: 500,
priority: 'supplementary',
},
];
const result = await optimizer.optimize({
codebase: files,
preserveFiles: ['src/auth/auth.ts'], // Always keep in full
});Custom Prioritization
const optimizer = createContextOptimizer({
prioritize: (file, query) => {
// Your custom logic
if (file.path.includes('security')) return 'critical';
if (file.path.includes('legacy')) return 'exclude';
return 'important';
},
scoreRelevance: (file, query) => {
// Your custom relevance scoring (0-1)
const score = /* your logic */;
return score;
},
});Advanced Usage
With CoFounder Core
import { createCoFounder } from '@cofounder/core';
import { createContextOptimizer } from '@cofounder/context-optimizer';
const cofounder = createCoFounder({
providers: { anthropic: process.env.ANTHROPIC_API_KEY },
});
const optimizer = createContextOptimizer({
strategy: 'hybrid',
maxTokens: 400000,
});
// Optimize context
const optimized = await optimizer.optimize({
query: 'Explain the authentication system',
codebase: myLargeCodebase,
});
// Use with CoFounder
const response = await cofounder.chat({
messages: optimized.messages,
model: 'claude-3-5-sonnet-20241022',
});
console.log(response.content);Custom Summarization
import { createCoFounder } from '@cofounder/core';
const cofounder = createCoFounder({ /* ... */ });
const optimizer = createContextOptimizer({
summarize: async (text, targetTokens) => {
// Use LLM for smart summarization
const summary = await cofounder.chat({
messages: [{
role: 'user',
content: `Summarize this in ~${targetTokens * 4} characters:\n\n${text}`
}],
model: 'gpt-4o-mini', // Use cheap model for summarization
});
return summary.content;
},
});Caching
const optimizer = createContextOptimizer({
enableCache: true,
// Same query + files = instant result from cache
});
// First call: Processes files
const result1 = await optimizer.optimize({ query: 'auth', codebase: files });
// Second call: Instant from cache
const result2 = await optimizer.optimize({ query: 'auth', codebase: files });
// Clear cache when files change
optimizer.clearCache();Real-World Examples
Example 1: Large Codebase Analysis
import fs from 'fs/promises';
import { createContextOptimizer } from '@cofounder/context-optimizer';
// Load entire codebase
const files = await loadCodebase('./src');
const optimizer = createContextOptimizer({
strategy: 'hybrid',
maxTokens: 400000,
costTarget: 'balanced',
});
const result = await optimizer.optimize({
query: 'Find security vulnerabilities',
codebase: files,
preserveFiles: [
'src/auth/auth.ts',
'src/security/validation.ts',
],
});
console.log(`Analyzed ${result.context.totalFiles} files`);
console.log(`Full: ${result.context.fullFiles.length}`);
console.log(`Summarized: ${result.context.summarizedFiles.length}`);
console.log(`Excluded: ${result.context.excludedFiles.length}`);
console.log(`Tokens: ${result.tokensUsed} / 400K`);
console.log(`Cost saved: ${result.costSaved}%`);Example 2: Documentation Generation
const optimizer = createContextOptimizer({
strategy: 'full', // Want complete context for docs
maxTokens: 200000,
});
const result = await optimizer.optimize({
query: 'Generate API documentation',
codebase: apiFiles,
includeFiles: apiFiles.map(f => f.path),
});
// All API files in full context
const docs = await cofounder.chat({
messages: result.messages,
model: 'gpt-5.2-thinking', // Use reasoning model
});Example 3: Code Review
const optimizer = createContextOptimizer({
strategy: 'hybrid',
maxTokens: 100000, // Smaller budget for code review
});
const result = await optimizer.optimize({
query: `Review this PR for:
- Security issues
- Performance problems
- Best practices
`,
codebase: changedFiles,
preserveFiles: criticalFiles,
});Performance
- Optimization time: < 100ms for 1000 files
- Memory usage: ~1MB per 1000 files
- Cache hit rate: 60%+ with typical usage
- Quality retention: 85%+ with hybrid strategy
Cost Savings
Example: 10MB codebase (2.5M tokens)
| Strategy | Tokens Used | Cost (GPT-5.2) | Savings |
|----------|-------------|----------------|---------|
| Naive | 2,500,000 | $25.00 | 0% |
| full (no fit) | N/A | N/A | N/A |
| rag | 100,000 | $1.00 | 96% |
| summarize | 250,000 | $2.50 | 90% |
| hybrid | 400,000 | $4.00 | 84% |
API Reference
createContextOptimizer(config?)
Creates optimizer instance.
const optimizer = createContextOptimizer({
maxTokens?: number; // Default: 400000
strategy?: OptimizationStrategy; // Default: 'hybrid'
costTarget?: CostTarget; // Default: 'balanced'
preserveCritical?: boolean; // Default: true
summarizeOld?: boolean; // Default: true
enableCache?: boolean; // Default: true
// Custom functions
prioritize?: (file, query) => FilePriority;
scoreRelevance?: (file, query) => number;
countTokens?: (text) => number;
summarize?: (text, targetTokens) => Promise<string>;
});optimizer.optimize(options)
Optimizes context.
const result = await optimizer.optimize({
query?: string; // Task/query description
codebase?: string | FileMetadata[]; // Path or files array
includeFiles?: string[]; // Specific files to include
excludeFiles?: string[]; // Files to exclude
preserveFiles?: string[]; // Files to keep in full
additionalContext?: string; // Extra context to prepend
targetTokens?: number; // Override maxTokens
});OptimizationResult
{
messages: Array<{ role: string; content: string }>;
context: {
fullFiles: string[];
summarizedFiles: string[];
excludedFiles: string[];
totalFiles: number;
};
tokensUsed: number;
originalTokens: number;
costSaved: number;
qualityScore: number;
strategy: OptimizationStrategy;
warnings?: string[];
}Best Practices
- Start with
hybrid- Best balance for most cases - Set realistic budgets - Don't max out context unnecessarily
- Preserve critical files - Use
preserveFilesfor core code - Use caching - Enable for repeated queries
- Custom summarization - Use LLMs for better summaries
- Monitor quality scores - Adjust strategy if too low
- Benchmark costs - Track actual savings
Roadmap
- [ ] LLM-based prioritization
- [ ] Dependency graph analysis
- [ ] Incremental updates (only changed files)
- [ ] Multi-repo support
- [ ] Visual context explorer
- [ ] Cost estimation before optimization
- [ ] A/B testing different strategies
License
MIT © Waymaker
