@luciformresearch/codeparsers
v0.3.0
Unified code parsers for TypeScript, Python, HTML, CSS, Vue, Svelte, Markdown and more with tree-sitter WASM bindings. Works in Node.js and Browser.
Multi-language code parsing library with tree-sitter WASM bindings, parallel processing, and cross-file relationship resolution. Works in both Node.js and Browser environments.
License - Luciform Research Source License (LRSL) v1.1
© 2025 Luciform Research. All rights reserved except as granted below.
Free to use for:
- Research, education, personal exploration
- Freelance or small-scale projects (gross monthly revenue up to 100,000 EUR)
- Internal tools (if your company revenue is up to 100,000 EUR/month)
Commercial use above this threshold requires a separate agreement.
Contact for commercial licensing: [email protected]
Grace period: 60 days after crossing the revenue threshold
Full text: LICENSE
Note: This is a custom "source-available" license, NOT an OSI-approved open source license.
Features
- Multi-language support: TypeScript, JavaScript, Python, Rust, Go, C, C++, C#, HTML, CSS, SCSS, Vue, Svelte, Markdown
- Parallel processing: Worker threads with Piscina for true parallelism
- Cross-file relationships: Automatic resolution of imports, inheritance, function calls
- Virtual file support: Parse from memory with contentMap (no disk I/O)
- Tree-sitter based: Robust, production-ready parsing with WASM bindings
- Node.js + Browser: Works in both environments
- ESM-only: Modern ES modules for Node.js 18+
Supported Languages
Code Languages (with scope extraction & relationships)
| Language | Extensions | Features |
|----------|------------|----------|
| TypeScript | .ts, .tsx, .mts, .cts | Scopes, imports, inheritance, decorators |
| JavaScript | .js, .jsx, .mjs, .cjs | Scopes, imports, classes |
| Python | .py, .pyi | Scopes, imports, decorators, docstrings |
| Rust | .rs | Scopes, modules, traits, impl blocks |
| Go | .go | Scopes, packages, interfaces, structs |
| C | .c, .h | Functions, structs, includes |
| C++ | .cpp, .cc, .cxx, .hpp, .hxx | Classes, namespaces, templates |
| C# | .cs | Classes, interfaces, namespaces |
Non-Code Languages
| Type | Extensions | Features |
|------|------------|----------|
| Markdown | .md, .mdx | Sections, code blocks, links |
| HTML | .html, .htm | DOM tree, scripts, styles |
| CSS | .css | Selectors, properties, media queries |
| SCSS | .scss, .sass | Variables, mixins, nesting |
| Vue | .vue | SFC parsing, script/template/style |
| Svelte | .svelte | Components, scripts, styles |
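The two tables above amount to an extension-to-language mapping. As a rough illustration only, that mapping can be written out as a plain lookup. The helper name `languageForPath` and the language labels (`'cpp'`, `'csharp'`, and so on) are assumptions made for this sketch. It does not call the library, and the library's own `detectLanguageFromPath` may behave differently.

```typescript
// Hypothetical lookup transcribed from the extension tables above.
// It does not call the library and may not match its own detection logic.
const EXTENSION_TO_LANGUAGE: Record<string, string> = {
  // Code languages
  '.ts': 'typescript', '.tsx': 'typescript', '.mts': 'typescript', '.cts': 'typescript',
  '.js': 'javascript', '.jsx': 'javascript', '.mjs': 'javascript', '.cjs': 'javascript',
  '.py': 'python', '.pyi': 'python',
  '.rs': 'rust',
  '.go': 'go',
  '.c': 'c', '.h': 'c',
  '.cpp': 'cpp', '.cc': 'cpp', '.cxx': 'cpp', '.hpp': 'cpp', '.hxx': 'cpp',
  '.cs': 'csharp',
  // Non-code languages
  '.md': 'markdown', '.mdx': 'markdown',
  '.html': 'html', '.htm': 'html',
  '.css': 'css',
  '.scss': 'scss', '.sass': 'scss',
  '.vue': 'vue',
  '.svelte': 'svelte',
};

// Returns the label for a path's final extension, or undefined when the
// extension is missing or not listed in the tables.
function languageForPath(filePath: string): string | undefined {
  const lastSep = Math.max(filePath.lastIndexOf('/'), filePath.lastIndexOf('\\'));
  const base = filePath.slice(lastSep + 1);
  const dot = base.lastIndexOf('.');
  if (dot <= 0) return undefined; // no extension, or a dotfile such as ".gitignore"
  return EXTENSION_TO_LANGUAGE[base.slice(dot).toLowerCase()];
}
```

A lookup like this can be handy for deciding up front which files to hand to a code parser and which to a non-code parser.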
Installation
npm install @luciformresearch/codeparsers
Quick Start
ProjectParser (Recommended for Projects)
The high-level API for parsing entire projects with parallel workers and automatic relationship resolution:
import { ProjectParser } from '@luciformresearch/codeparsers';
const parser = new ProjectParser({ maxWorkers: 4 });
const result = await parser.parseProject({
  root: '/path/to/project',
  files: ['src/index.ts', 'src/utils.ts', 'src/types.ts'],
  resolveRelationships: true, // Enable cross-file relationship resolution
});
// Access parsed files
for (const [filePath, analysis] of result.files) {
  console.log(`${filePath}: ${analysis.scopes.length} scopes`);
}
// Access relationships
if (result.relationships) {
  for (const rel of result.relationships.relationships) {
    console.log(`${rel.fromName} --[${rel.type}]--> ${rel.toName}`);
    // e.g., "UserService --[CONSUMES]--> DatabaseClient"
    // e.g., "AdminUser --[INHERITS_FROM]--> BaseUser"
  }
}
// Cleanup
await parser.destroy();
Virtual File Parsing (No Disk I/O)
Parse files from memory without touching the filesystem:
import { ProjectParser } from '@luciformresearch/codeparsers';
const parser = new ProjectParser();
// Create content map with virtual files
const contentMap = new Map<string, string>();
contentMap.set('/virtual/src/index.ts', `
  import { helper } from './utils';
  export function main() { return helper(); }
`);
contentMap.set('/virtual/src/utils.ts', `
  export function helper() { return 42; }
`);
const result = await parser.parseProject({
  root: '/virtual',
  files: ['/virtual/src/index.ts', '/virtual/src/utils.ts'],
  contentMap, // Pass pre-read content
  resolveRelationships: true,
});
// Relationships are resolved even for virtual files!
// main() --[CONSUMES]--> helper()
await parser.destroy();
NonCodeProjectParser (Markdown, CSS, HTML, Vue, Svelte)
import { NonCodeProjectParser } from '@luciformresearch/codeparsers';
const parser = new NonCodeProjectParser({ maxWorkers: 4 });
const result = await parser.parseFiles({
  files: [
    { path: 'README.md', content: '# Title\n\nContent...' },
    { path: 'styles.css', content: '.container { display: flex; }' },
    { path: 'App.vue', content: '<template>...</template><script>...</script>' },
  ],
});
// Access results by type
console.log(result.markdownFiles.get('README.md'));
console.log(result.cssFiles.get('styles.css'));
console.log(result.vueFiles.get('App.vue'));
await parser.destroy();
Cross-File Relationship Resolution
The relationship resolver automatically detects:
| Relationship | Description | Example |
|--------------|-------------|---------|
| CONSUMES | Scope A uses Scope B | userService.getUser() → UserService.getUser |
| INHERITS_FROM | Class extends another | class Admin extends User |
| IMPLEMENTS | Class implements interface | class UserRepo implements IRepository |
| DECORATES | Decorator on a scope | @Injectable() class Service |
| PARENT_OF | Parent contains child | class User { getName() } |
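Once relationships are resolved, a common next step is to summarize them. The sketch below assumes each record carries at least `type`, `fromName`, and `toName` (the fields used in the Quick Start loop). `RelationshipLike`, `countByType`, and `targetsOf` are hypothetical names defined here for illustration, not exports of the library, and the sample data is made up.

```typescript
// Local stand-in for the library's relationship records (assumed shape).
type RelationshipType =
  | 'CONSUMES' | 'INHERITS_FROM' | 'IMPLEMENTS' | 'DECORATES' | 'PARENT_OF';

interface RelationshipLike {
  type: RelationshipType;
  fromName: string;
  toName: string;
}

// Count how many relationships of each type were found.
function countByType(rels: RelationshipLike[]): Map<RelationshipType, number> {
  const counts = new Map<RelationshipType, number>();
  for (const rel of rels) {
    counts.set(rel.type, (counts.get(rel.type) ?? 0) + 1);
  }
  return counts;
}

// List the names that `fromName` points to via a given relationship type.
function targetsOf(
  rels: RelationshipLike[],
  fromName: string,
  type: RelationshipType,
): string[] {
  return rels
    .filter((r) => r.fromName === fromName && r.type === type)
    .map((r) => r.toName);
}

// Illustrative sample data only.
const sample: RelationshipLike[] = [
  { type: 'CONSUMES', fromName: 'UserService', toName: 'DatabaseClient' },
  { type: 'INHERITS_FROM', fromName: 'AdminUser', toName: 'BaseUser' },
  { type: 'CONSUMES', fromName: 'UserService', toName: 'Logger' },
];

console.log(countByType(sample));
console.log(targetsOf(sample, 'UserService', 'CONSUMES'));
```

With real output, the array to pass in would be `result.relationships.relationships`, provided its records match the assumed shape.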
// Example output
{
  type: 'CONSUMES',
  fromFile: 'src/controllers/user.controller.ts',
  fromName: 'getUser',
  fromType: 'method',
  toFile: 'src/services/user.service.ts',
  toName: 'findById',
  toType: 'method',
  metadata: {
    viaImport: true,
    importPath: '../services/user.service'
  }
}
Low-Level Parsers
For fine-grained control, use individual parsers:
TypeScript/JavaScript
import { TypeScriptLanguageParser } from '@luciformresearch/codeparsers';
const parser = new TypeScriptLanguageParser();
await parser.initialize();
const result = await parser.parseFile('example.ts', `
  export class UserService {
    constructor(private db: Database) {}
    async getUser(id: string): Promise<User> {
      return this.db.findById(id);
    }
  }
`);
console.log(result.scopes);
// [
//   { type: 'class', name: 'UserService', ... },
//   { type: 'method', name: 'constructor', parent: 'UserService', ... },
//   { type: 'method', name: 'getUser', parent: 'UserService', ... }
// ]
Python
import { PythonLanguageParser } from '@luciformresearch/codeparsers';
const parser = new PythonLanguageParser();
await parser.initialize();
const result = await parser.parseFile('script.py', `
@dataclass
class Person:
    """Represents a person."""
    name: str
    age: int

    def greet(self) -> str:
        return f"Hello, I'm {self.name}"
`);
// Includes docstrings and decorators
console.log(result.scopes[0].docstring); // "Represents a person."
console.log(result.scopes[0].decorators); // ["dataclass"]
Rust
import { RustLanguageParser } from '@luciformresearch/codeparsers';
const parser = new RustLanguageParser();
await parser.initialize();
const result = await parser.parseFile('lib.rs', `
pub struct Config {
    pub name: String,
}
impl Config {
    pub fn new(name: &str) -> Self {
        Self { name: name.to_string() }
    }
}
pub trait Configurable {
    fn configure(&self, config: &Config);
}
`);
API Reference
ProjectParser
interface ProjectParserOptions {
  maxWorkers?: number; // Default: CPU count - 1
  verbose?: boolean;
}
interface ParseProjectOptions {
  root: string; // Project root
  files: string[]; // Files to parse
  contentMap?: Map<string, string>; // Virtual file contents
  resolveRelationships?: boolean; // Default: true
  tsConfigPath?: string; // For TS import resolution
}
interface ProjectAnalysis {
  files: Map<string, ScopeFileAnalysis>;
  relationships?: RelationshipResolutionResult;
  stats: {
    totalFiles: number;
    successfulFiles: number;
    failedFiles: number;
    totalScopes: number;
    parseTimeMs: number;
    relationshipTimeMs?: number;
  };
  errors: Array<{ file: string; error: string }>;
}
NonCodeProjectParser
interface NonCodeProjectParserOptions {
  maxWorkers?: number;
  verbose?: boolean;
}
interface ParseFileInput {
  path: string;
  content: string;
  type?: 'markdown' | 'css' | 'scss' | 'html' | 'vue' | 'svelte';
}
interface NonCodeParseAnalysis {
  markdownFiles: Map<string, MarkdownParseResult>;
  cssFiles: Map<string, CSSParseResult>;
  scssFiles: Map<string, SCSSParseResult>;
  htmlFiles: Map<string, HTMLParseResult>;
  vueFiles: Map<string, VueSFCParseResult>;
  svelteFiles: Map<string, SvelteParseResult>;
  genericFiles: Map<string, GenericFileAnalysis>;
  stats: { ... };
  errors: Array<{ file: string; error: string }>;
}
Utility Functions
// Code files
import {
  getSupportedCodeExtensions, // ['.ts', '.tsx', '.py', '.rs', ...]
  isCodeParserSupported, // Check if file is supported
  detectLanguageFromPath, // Get language from path
} from '@luciformresearch/codeparsers';
// Non-code files
import {
  getSupportedNonCodeExtensions, // ['.md', '.css', '.vue', ...]
  isNonCodeParserSupported,
  detectNonCodeParserType,
} from '@luciformresearch/codeparsers';
Browser Setup
For browser usage, WASM files must be served from a URL:
import { WasmLoader } from '@luciformresearch/codeparsers/wasm';
// Configure WASM location
WasmLoader.configure({
  wasmBaseUrl: '/assets/wasm'
});
// Copy WASM files to your public directory:
// node_modules/@luciformresearch/codeparsers/dist/esm/wasm/grammars/*.wasm
// node_modules/web-tree-sitter/tree-sitter.wasm
Performance
With parallel workers, parsing is significantly faster:
| Project Size | Parallel (4 workers) |
|--------------|---------------------|
| 100 files | ~1s |
| 500 files | ~5s |
| 1000 files | ~10s |
Measured on TypeScript files with relationship resolution
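The figures above work out to roughly 100 files per second. To compare your own runs against that, a throughput number can be derived from the `stats` fields listed in the API Reference. This is a minimal sketch: `ProjectStatsLike` is a local copy of those fields, `filesPerSecond` is a hypothetical helper, and it assumes `parseTimeMs` and `relationshipTimeMs` are reported separately (the README does not say whether one includes the other).

```typescript
// Local copy of the `stats` fields from ProjectAnalysis (see API Reference).
interface ProjectStatsLike {
  totalFiles: number;
  successfulFiles: number;
  failedFiles: number;
  totalScopes: number;
  parseTimeMs: number;
  relationshipTimeMs?: number;
}

// Successfully parsed files per second of total processing time.
// Assumption: parseTimeMs does NOT already include relationshipTimeMs.
function filesPerSecond(stats: ProjectStatsLike): number {
  const totalMs = stats.parseTimeMs + (stats.relationshipTimeMs ?? 0);
  return totalMs > 0 ? (stats.successfulFiles * 1000) / totalMs : 0;
}

// Illustrative numbers only, not a measurement.
const exampleStats: ProjectStatsLike = {
  totalFiles: 100,
  successfulFiles: 100,
  failedFiles: 0,
  totalScopes: 1200,
  parseTimeMs: 800,
  relationshipTimeMs: 200,
};

console.log(filesPerSecond(exampleStats)); // 100
```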
License
LRSL v1.1 - See LICENSE file for details.
Related Projects
- RagForge Core - RAG toolkit using this parser
- tree-sitter - The underlying parsing technology
