codefetch-sdk
v2.2.0
Core SDK for codefetch functionality
Codefetch SDK
Core SDK for codefetch functionality. Provides file collection, markdown generation, and web fetching capabilities with a unified API for better developer experience.
Installation
npm install codefetch-sdk@latest
Features
- 🎯 Unified fetch() API - Single method for local projects and remote Git repositories (GitHub/GitLab)
- 🚀 Zero-config defaults - Works out of the box with sensible defaults
- 📦 Optimized bundle - Small footprint for edge environments
- 🗄️ Built‑in caching - Smart in‑memory cache with pluggable adapters
- 🔧 Full TypeScript support - Complete type safety with improved inference
- 🌐 Enhanced web support - GitHub API integration and web crawling
- ⚡ Streaming support - Memory-efficient processing for large codebases
- 🎯 Simple configuration - Less boilerplate, more power
Quick Start
Basic Usage (Recommended)
By default fetch() returns markdown as a string. Use format: 'json'
if you need a structured result.
import { fetch, FetchResultImpl } from 'codefetch-sdk';
// Local codebase → markdown string
const markdown = await fetch({
source: './src',
extensions: ['.ts', '.tsx'],
maxTokens: 50_000,
});
console.log(markdown); // AI-ready markdown
// Local codebase → structured JSON result
const jsonResult = (await fetch({
source: './src',
format: 'json',
extensions: ['.ts', '.tsx'],
maxTokens: 50_000,
})) as FetchResultImpl;
console.log(jsonResult.metadata); // Token counts, file stats, etc.
GitHub Repository
// Public repository
const publicResult = await fetch({
  source: 'https://github.com/facebook/react',
  branch: 'main',
  extensions: ['.js', '.ts', '.md'],
});
// Private repository (with token)
const privateResult = await fetch({
  source: 'https://github.com/myorg/private-repo',
  githubToken: process.env.GITHUB_TOKEN,
  maxFiles: 100,
});
Web Content (Git Repositories)
Currently, remote fetching is supported for Git repositories hosted on GitHub and GitLab. Generic websites are not yet supported.
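To make the supported-source rule concrete, here is a minimal sketch of a pre-check that could run before calling fetch(). The `classifySource` helper and the `SourceKind` type are hypothetical and not part of codefetch-sdk. The sketch also assumes that only the public github.com and gitlab.com hosts count as supported, so self-hosted instances would need extra handling.

```typescript
// Hypothetical helper, not part of codefetch-sdk.
type SourceKind = 'local' | 'github' | 'gitlab' | 'unsupported';

function classifySource(source: string): SourceKind {
  // Anything that is not an http(s) URL is treated as a local path.
  if (!/^https?:\/\//i.test(source)) return 'local';

  let host: string;
  try {
    host = new URL(source).hostname.toLowerCase();
  } catch {
    return 'unsupported'; // Malformed URL
  }

  // Assumption: only the public hosts are recognised.
  if (host === 'github.com' || host === 'www.github.com') return 'github';
  if (host === 'gitlab.com' || host === 'www.gitlab.com') return 'gitlab';
  return 'unsupported'; // Generic websites are not supported yet
}
```

A caller could reject an `'unsupported'` source early with a clear message instead of waiting for the fetch to fail.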
// GitHub or GitLab-hosted documentation
const markdown = await fetch({
source: 'https://github.com/org/docs',
maxPages: 10,
maxDepth: 2,
});
Core API
fetch(options: FetchOptions): Promise<string | FetchResultImpl>
The unified API for local paths and Git repositories.
interface FetchOptions {
// Source (required)
source: string; // Local path, GitHub URL, or GitLab URL
// Filtering
extensions?: string[];
excludeFiles?: string[];
includeFiles?: string[];
excludeDirs?: string[];
includeDirs?: string[];
// Token management
maxTokens?: number;
tokenEncoder?: 'cl100k' | 'p50k' | 'o200k' | 'simple';
tokenLimiter?: 'sequential' | 'truncated';
// GitHub specific
githubToken?: string;
branch?: string;
// Web / Git repo crawling
maxPages?: number;
maxDepth?: number;
ignoreRobots?: boolean;
ignoreCors?: boolean;
// Output
format?: 'markdown' | 'json';
includeTree?: boolean | number;
disableLineNumbers?: boolean;
// Caching
noCache?: boolean;
cacheTTL?: number;
}
Response Format (format: 'json')
When format: 'json' is set, fetch() returns an instance of
FetchResultImpl:
import { FetchResultImpl } from 'codefetch-sdk';
class FetchResultImpl {
root: FileNode; // Tree of files and directories
metadata: FetchMetadata; // Aggregate statistics
getFileByPath(path: string): FileNode | null;
getAllFiles(): FileNode[];
toMarkdown(): string; // Generate markdown from the tree
}
For the default format: 'markdown', fetch() returns a plain
markdown string.
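Because the return type is a union, calling code usually has to narrow it before use. The sketch below shows one way to normalise either shape to a markdown string. `MarkdownSource` and `asMarkdown` are hypothetical names, and the interface is a local stand-in that only assumes the `toMarkdown()` method listed in the class outline above.

```typescript
// Local stand-in for the structured result (assumption: only toMarkdown() is needed).
interface MarkdownSource {
  toMarkdown(): string;
}

// Returns markdown regardless of which format fetch() was asked for.
function asMarkdown(result: string | MarkdownSource): string {
  return typeof result === 'string' ? result : result.toMarkdown();
}

// Usage with stand-in values in place of real fetch() results.
const fromString = asMarkdown('# src');
const fromObject = asMarkdown({ toMarkdown: () => '# src (from tree)' });
console.log(fromString, fromObject);
```

A `typeof` check narrows the union without a type assertion, so the compiler still flags any code path that forgets to handle the string case.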
Cloudflare Workers Support
For Cloudflare Workers, use the dedicated /worker entrypoint:
import { fetchFromWeb as fetch } from 'codefetch-sdk/worker';
The Worker build exposes a subset of the SDK that is safe for
edge environments (no fs, no process.env). See
packages/sdk/README-Worker.md for full details and examples.
Caching
Every fetch() call may use caching depending on the environment and
options. Re‑using the same source URL within a single Node.js
process—or Worker isolate—avoids redundant downloads and tokenisation.
// Disable cache
await fetch({ source: './src', noCache: true });
// Custom TTL (seconds)
await fetch({ source: repoUrl, cacheTTL: 3600 });
Need persistence on Cloudflare Workers? The /worker build exposes helpers like fetchFromWebCached, createCacheStorage, and withCache so you can integrate KV or the Cache API for cross‑isolate caching. See packages/sdk/README-Worker.md for details.
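For intuition, an in-memory cache with a TTL and a swappable storage layer can be sketched as follows. This is an illustration of the general idea only, not the SDK's implementation. `CacheAdapter`, `MemoryCache`, and `cached` are hypothetical names, and the real adapter interfaces may look different.

```typescript
// Hypothetical adapter shape; the SDK's real cache interfaces may differ.
interface CacheAdapter<V> {
  get(key: string): V | undefined;
  set(key: string, value: V, ttlSeconds: number): void;
}

class MemoryCache<V> implements CacheAdapter<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // Expired entries are dropped lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}

// Wraps an async loader so repeated calls with the same key reuse the stored value.
async function cached<V>(
  cache: CacheAdapter<V>,
  key: string,
  ttlSeconds: number,
  load: () => Promise<V>,
): Promise<V> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await load();
  cache.set(key, value, ttlSeconds);
  return value;
}
```

Keeping the adapter interface small means a persistent store could, in principle, be swapped in behind the same two methods, although a store such as KV is asynchronous and would need a promise-based variant of the interface.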
Advanced Usage
Custom Processing
// Get structured data instead of markdown
const result = (await fetch({
  source: './src',
  format: 'json',
  extensions: ['.ts'],
})) as FetchResultImpl;
// Process files individually (content may be absent on some nodes)
for (const file of result.getAllFiles()) {
  console.log(`${file.path}: ${file.content?.length ?? 0} chars`);
}
Template Variables
const result = await fetch({
source: './src',
templateVars: {
PROJECT_NAME: 'My Awesome App',
REVIEWER: 'AI Assistant',
},
});
Caching Control
// Disable caching for fresh data
const freshResult = await fetch({
  source: './src',
  noCache: true,
});
// Custom cache duration
const cachedResult = await fetch({
  source: 'https://github.com/owner/repo',
  cacheTTL: 3600, // 1 hour
});
Configuration
Environment Variables
# GitHub token for private repos
GITHUB_TOKEN=ghp_your_token_here
# Cache settings
CODEFETCH_CACHE_TTL=3600
Config Files
Create .codefetchrc.json:
{
"extensions": [".ts", ".tsx", ".js", ".jsx"],
"excludeDirs": ["node_modules", "dist", "coverage"],
"maxTokens": 100000,
"tokenEncoder": "cl100k"
}
Error Handling
try {
const result = await fetch({ source: './src' });
} catch (error) {
if (error.code === 'ENOENT') {
console.error('Directory not found');
} else if (error.code === 'RATE_LIMIT') {
console.error('GitHub API rate limit exceeded');
} else {
console.error('Unknown error:', error.message);
}
}
TypeScript Support
Full TypeScript support with improved type inference:
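import { fetch } from 'codefetch-sdk';
import type { FetchOptions, FetchResult, File } from 'codefetch-sdk';
const options: FetchOptions = {
  source: './src',
  format: 'json', // Request the structured result rather than a markdown string
  extensions: ['.ts'],
};
const result = (await fetch(options)) as FetchResult;
Performance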
- 50% faster file collection
- 35% smaller bundle size
- Memory efficient streaming for large codebases
- Smart caching with automatic invalidation
Examples
GitHub Repository Analyzer
import { fetch } from 'codefetch-sdk/worker';
// Bindings this Worker expects
interface Env {
  GITHUB_TOKEN: string;
}
export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const url = new URL(request.url);
    const repo = url.searchParams.get('repo');
    if (!repo) {
      return new Response('Missing repo parameter', { status: 400 });
    }
    const result = await fetch({
      source: `https://github.com/${repo}`,
      githubToken: env.GITHUB_TOKEN,
      format: 'json', // metadata and per-file access need the structured result
      maxFiles: 100,
      extensions: ['.ts', '.js', '.py'],
    });
    // Narrow the string | FetchResultImpl union before reading metadata
    if (typeof result === 'string') {
      return new Response('Expected a structured result', { status: 500 });
    }
    const analysis = {
      totalFiles: result.metadata.totalFiles,
      totalTokens: result.metadata.totalTokens,
      // Count files per extension
      languages: result.getAllFiles().reduce((acc, file) => {
        const ext = file.path.split('.').pop() ?? 'unknown';
        acc[ext] = (acc[ext] || 0) + 1;
        return acc;
      }, {} as Record<string, number>),
    };
    return Response.json(analysis);
  }
};
Local Codebase Documentation
import { fetch, FetchResultImpl } from 'codefetch-sdk';
import { writeFile } from 'fs/promises';
async function generateDocs() {
  // Request the structured result so that metadata is available
  const result = (await fetch({
    source: './src',
    format: 'json',
    extensions: ['.ts', '.tsx'],
    includeTree: 3,
    maxTokens: 100000,
  })) as FetchResultImpl;
  await writeFile('docs/codebase.md', result.toMarkdown());
  console.log(`Generated documentation for ${result.metadata.totalFiles} files`);
}
generateDocs();
Support
License
MIT
