@bernierllc/markdown-parser
v1.2.0
High-performance markdown parser with CommonMark, GFM, and plugin support
High-performance markdown parser with CommonMark, GitHub-Flavored Markdown (GFM), and plugin support. Converts markdown text to Abstract Syntax Tree (AST) format for further processing.
Installation
npm install @bernierllc/markdown-parser
Features
- CommonMark Support - Full CommonMark v0.31.2 specification compliance
- GitHub-Flavored Markdown - Task lists, tables, strikethrough, autolinks
- Plugin System - Extend parser with custom transformations
- Caching - Optional in-memory caching with TTL and size limits
- Performance - Fast parsing with position tracking
- TypeScript - Strict typing with comprehensive type definitions
- Zero External Dependencies - Depends only on other @bernierllc packages
Usage
Basic Parsing
import { MarkdownParser } from '@bernierllc/markdown-parser';
const parser = new MarkdownParser();
const result = await parser.parse('# Hello World\n\nThis is **bold** text.');
if (result.success) {
  console.log('Parsed AST:', result.ast);
  // metadata is optional on ParseResult, so read it with optional chaining
  console.log('Node count:', result.metadata?.nodeCount);
  console.log('Parse time:', result.metadata?.parseTime, 'ms');
}
With Caching
const parser = new MarkdownParser({
  cache: {
    enabled: true,
    maxSize: 1000, // Maximum 1000 cached documents
    ttl: 3600, // Cache for 1 hour (TTL is in seconds)
  },
});
// First parse - not cached
const result1 = await parser.parse('# Document');
console.log('Cached:', result1.metadata?.cached); // false
// Second parse - served from the cache
const result2 = await parser.parse('# Document');
console.log('Cached:', result2.metadata?.cached); // true
// Usually true, but very small timings can tie
console.log('Faster:', result2.metadata!.parseTime < result1.metadata!.parseTime);
With Custom Plugins
import { MarkdownParser, MarkdownPlugin, transformNodes } from '@bernierllc/markdown-parser';
// Create a plugin that converts all text to uppercase
const uppercasePlugin: MarkdownPlugin = {
  name: 'uppercase',
  version: '1.0.0',
  transform: (ast) => {
    return transformNodes(ast, (node) => {
      if (node.type === 'text' && node.value) {
        return { ...node, value: node.value.toUpperCase() };
      }
      return node;
    });
  },
};
const parser = new MarkdownParser({
  plugins: [uppercasePlugin],
});
const result = await parser.parse('# hello world');
// Result: heading with text "HELLO WORLD"
GFM Features
const parser = new MarkdownParser({ gfm: true });
// Task lists
const tasks = await parser.parse(`
- [ ] Unchecked task
- [x] Checked task
`);
// Tables
const table = await parser.parse(`
| Header 1 | Header 2 |
| -------- | -------- |
| Cell 1 | Cell 2 |
`);
// Strikethrough
const strikethrough = await parser.parse('This is ~~deleted~~ text');
// Autolinks
const autolink = await parser.parse('Visit https://example.com for more');
With Logger
import { MarkdownParser } from '@bernierllc/markdown-parser';
import { Logger } from '@bernierllc/logger';
const logger = new Logger({ level: 'debug' });
const parser = new MarkdownParser({ logger });
await parser.parse('# Test');
// Logs: "Markdown parsed successfully" with metadata
API Reference
MarkdownParser
Main parser class for converting markdown to AST.
Constructor
new MarkdownParser(config?: MarkdownParserConfig)
Config Options:
- gfm?: boolean - Enable GitHub-Flavored Markdown (default: true)
- commonmark?: boolean - Enable CommonMark strict mode (default: true)
- plugins?: MarkdownPlugin[] - Custom plugins to apply (default: [])
- cache?: CacheConfig - Caching configuration
  - enabled: boolean - Enable caching
  - maxSize?: number - Maximum cache entries (default: 1000)
  - ttl?: number - Time to live in seconds (default: 3600)
- logger?: Logger - Logger instance for debug output
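The defaults listed above can be pictured as a small merge step. The following is only a sketch: `ParserConfigSketch` and `resolveConfig` are hypothetical names, not part of the package, and treating caching as disabled when no `cache` block is passed is an assumption.

```typescript
// Hypothetical shapes that mirror the documented config options.
interface CacheConfigSketch {
  enabled: boolean;
  maxSize?: number; // maximum cache entries
  ttl?: number; // time to live in seconds
}

interface ParserConfigSketch {
  gfm?: boolean;
  commonmark?: boolean;
  plugins?: unknown[];
  cache?: CacheConfigSketch;
}

// Hypothetical helper: fills in the documented defaults for omitted options.
function resolveConfig(config: ParserConfigSketch = {}) {
  return {
    gfm: config.gfm ?? true,
    commonmark: config.commonmark ?? true,
    plugins: config.plugins ?? [],
    cache: {
      // Assumption: caching stays off unless explicitly enabled.
      enabled: config.cache?.enabled ?? false,
      maxSize: config.cache?.maxSize ?? 1000,
      ttl: config.cache?.ttl ?? 3600,
    },
  };
}

const resolved = resolveConfig({ cache: { enabled: true, ttl: 60 } });
console.log(resolved.gfm, resolved.cache.maxSize, resolved.cache.ttl); // true 1000 60
```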
Methods
parse(markdown: string): Promise<ParseResult>
Parse markdown string to AST.
Returns:
{
  success: boolean;
  ast?: ASTNode;
  error?: string;
  metadata?: {
    parseTime: number; // Milliseconds
    nodeCount: number; // Total nodes in AST
    cached: boolean; // Whether result was cached
  };
}
clearCache(): void
Clear all cached parse results.
getCacheStats(): { size: number; enabled: boolean }
Get current cache statistics.
pruneCache(): number
Remove expired cache entries. Returns number of entries removed.
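To make the cache methods concrete, here is a minimal sketch of an in-memory cache with a TTL and a size limit. `TtlCacheSketch` is a hypothetical class written for illustration. The package's real eviction policy is not documented here, so dropping the oldest inserted entry when full is an assumption.

```typescript
interface CacheEntrySketch<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

// Hypothetical cache; not the package's actual implementation.
class TtlCacheSketch<V> {
  private entries = new Map<string, CacheEntrySketch<V>>();
  private maxSize: number;
  private ttlSeconds: number;
  private now: () => number;

  // The clock is injectable so expiry can be exercised without waiting.
  constructor(maxSize = 1000, ttlSeconds = 3600, now: () => number = () => Date.now()) {
    this.maxSize = maxSize;
    this.ttlSeconds = ttlSeconds;
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired entries count as misses
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V): void {
    // Assumption: when full, evict the oldest inserted entry (Map keeps insertion order).
    if (!this.entries.has(key) && this.entries.size >= this.maxSize) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlSeconds * 1000 });
  }

  // Same contract as the documented pruneCache(): returns the number of entries removed.
  prune(): number {
    let removed = 0;
    this.entries.forEach((entry, key) => {
      if (entry.expiresAt <= this.now()) {
        this.entries.delete(key);
        removed++;
      }
    });
    return removed;
  }

  clear(): void {
    this.entries.clear();
  }

  stats(): { size: number } {
    return { size: this.entries.size };
  }
}

let fakeNow = 0;
const cache = new TtlCacheSketch<string>(2, 10, () => fakeNow);
cache.set('a', 'A');
cache.set('b', 'B');
cache.set('c', 'C'); // cache is full, so 'a' is evicted
fakeNow = 10_001; // just past the 10-second TTL
console.log(cache.prune(), cache.stats().size); // 2 0
```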
AST Node Types
The parser generates nodes with the following structure:
interface ASTNode {
  type: NodeType;
  value?: string;
  children?: ASTNode[];
  attributes?: Record<string, unknown>;
  position?: Position;
}
Supported Node Types:
- Block: root, heading, paragraph, blockquote, list, listItem, codeBlock, table, tableRow, tableCell, horizontalRule
- Inline: text, emphasis, strong, code, link, image, lineBreak
- GFM: strikethrough, taskList, taskListItem, autolink
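As an illustration of how these node types nest, the tree below is built by hand from the documented `ASTNode` shape. It is a sketch, not captured parser output: the `level` attribute on the heading is an assumption, and `href` follows the usage in the link example later in this README.

```typescript
// Subset of the documented node types, enough for this illustration.
type NodeTypeSketch = 'root' | 'heading' | 'paragraph' | 'text' | 'link';

interface ASTNodeSketch {
  type: NodeTypeSketch;
  value?: string;
  children?: ASTNodeSketch[];
  attributes?: Record<string, unknown>;
}

// Hand-built tree for "# Hello\n\nSee [docs](https://example.com)."
const tree: ASTNodeSketch = {
  type: 'root',
  children: [
    {
      type: 'heading',
      attributes: { level: 1 }, // assumption: heading depth is stored in attributes
      children: [{ type: 'text', value: 'Hello' }],
    },
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'See ' },
        {
          type: 'link',
          attributes: { href: 'https://example.com' },
          children: [{ type: 'text', value: 'docs' }],
        },
        { type: 'text', value: '.' },
      ],
    },
  ],
};

// Concatenates the value of every text node, depth-first.
function textOf(node: ASTNodeSketch): string {
  const own = node.type === 'text' ? (node.value ?? '') : '';
  return own + (node.children ?? []).map(textOf).join('');
}

console.log(textOf(tree)); // "HelloSee docs."
```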
Plugin Interface
interface MarkdownPlugin {
  name: string;
  version: string;
  transform: (ast: ASTNode) => ASTNode;
}
Utility Functions
countNodes(node: ASTNode): number
Count total nodes in AST tree.
visitNodes(node: ASTNode, visitor: (node: ASTNode) => void): void
Visit each node in AST tree with callback.
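A rough sketch of how a visitor and a node counter with these signatures could work. The `Sketch` names are hypothetical, `type` is simplified to `string`, and pre-order traversal (parent before children) is an assumption about the real helpers.

```typescript
interface ASTNode {
  type: string; // simplified; the package uses a NodeType union
  value?: string;
  children?: ASTNode[];
}

// Calls the visitor on the node, then on each descendant (pre-order).
function visitNodesSketch(node: ASTNode, visitor: (node: ASTNode) => void): void {
  visitor(node);
  for (const child of node.children ?? []) {
    visitNodesSketch(child, visitor);
  }
}

// Counting is just a visit that increments a counter.
function countNodesSketch(node: ASTNode): number {
  let count = 0;
  visitNodesSketch(node, () => {
    count++;
  });
  return count;
}

const sample: ASTNode = {
  type: 'root',
  children: [
    { type: 'heading', children: [{ type: 'text', value: 'Title' }] },
    {
      type: 'paragraph',
      children: [
        { type: 'text', value: 'Some ' },
        { type: 'strong', children: [{ type: 'text', value: 'bold' }] },
      ],
    },
  ],
};

const visited: string[] = [];
visitNodesSketch(sample, (n) => visited.push(n.type));
console.log(visited.join(' > ')); // root > heading > text > paragraph > text > strong > text
console.log(countNodesSketch(sample)); // 7
```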
transformNodes(node: ASTNode, transformer: (node: ASTNode) => ASTNode): ASTNode
Transform AST nodes recursively.
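One way to read "recursively" is sketched below: the transformer runs on a node first, then on the children of whatever it returned. `transformNodesSketch` is a hypothetical stand-in, and that ordering is an assumption about the real `transformNodes`.

```typescript
interface ASTNode {
  type: string; // simplified; the package uses a NodeType union
  value?: string;
  children?: ASTNode[];
}

// Applies the transformer top-down and returns a new tree.
function transformNodesSketch(
  node: ASTNode,
  transformer: (node: ASTNode) => ASTNode,
): ASTNode {
  const transformed = transformer(node);
  if (!transformed.children) return transformed;
  return {
    ...transformed,
    children: transformed.children.map((child) => transformNodesSketch(child, transformer)),
  };
}

const heading: ASTNode = {
  type: 'heading',
  children: [{ type: 'text', value: 'hello world' }],
};

// Same transformation as the uppercase plugin in the usage section.
const shouted = transformNodesSketch(heading, (node) =>
  node.type === 'text' && node.value ? { ...node, value: node.value.toUpperCase() } : node,
);

console.log(shouted.children?.[0].value); // "HELLO WORLD"
console.log(heading.children?.[0].value); // "hello world" (the input tree is not mutated)
```

Returning fresh objects from the transformer, as the uppercase example does, keeps the original tree intact, which matters if parse results are also held in a cache.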
findNodes(node: ASTNode, predicate: (node: ASTNode) => boolean): ASTNode[]
Find all nodes matching predicate.
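A minimal sketch of a predicate search with this signature. `findNodesSketch` is a hypothetical name, and returning matches in document order is an assumption.

```typescript
interface ASTNode {
  type: string; // simplified; the package uses a NodeType union
  value?: string;
  children?: ASTNode[];
  attributes?: Record<string, unknown>;
}

// Collects the node itself (if it matches) followed by matches among its descendants.
function findNodesSketch(node: ASTNode, predicate: (node: ASTNode) => boolean): ASTNode[] {
  const matches: ASTNode[] = predicate(node) ? [node] : [];
  for (const child of node.children ?? []) {
    matches.push(...findNodesSketch(child, predicate));
  }
  return matches;
}

const doc: ASTNode = {
  type: 'root',
  children: [
    {
      type: 'paragraph',
      children: [
        { type: 'link', attributes: { href: 'https://google.com' } },
        { type: 'text', value: ' and ' },
        { type: 'link', attributes: { href: 'https://github.com' } },
      ],
    },
  ],
};

const hrefs = findNodesSketch(doc, (n) => n.type === 'link').map((n) => n.attributes?.href);
console.log(hrefs.join(', ')); // https://google.com, https://github.com
```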
cloneNode(node: ASTNode): ASTNode
Create deep copy of AST node.
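The sketch below shows one way such a copy could be made by hand. `cloneNodeSketch` is hypothetical, and it copies `attributes` only one level deep, so nested attribute values would still be shared. Whether the real `cloneNode` goes deeper is not stated here.

```typescript
interface ASTNode {
  type: string; // simplified; the package uses a NodeType union
  value?: string;
  children?: ASTNode[];
  attributes?: Record<string, unknown>;
}

// Copies the node, its attributes object, and every descendant.
function cloneNodeSketch(node: ASTNode): ASTNode {
  const copy: ASTNode = { ...node };
  if (node.attributes) copy.attributes = { ...node.attributes };
  if (node.children) copy.children = node.children.map((child) => cloneNodeSketch(child));
  return copy;
}

const original: ASTNode = {
  type: 'link',
  attributes: { href: 'https://example.com' },
  children: [{ type: 'text', value: 'docs' }],
};

const copy = cloneNodeSketch(original);
copy.children![0].value = 'changed';
copy.attributes!.href = 'https://example.org';

console.log(original.children?.[0].value); // "docs"
console.log(original.attributes?.href); // "https://example.com"
```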
Examples
Example 1: Parsing Headers
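// The examples in this section assume these imports
import { MarkdownParser, findNodes } from '@bernierllc/markdown-parser';
const parser = new MarkdownParser();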
const result = await parser.parse(`
# Heading 1
## Heading 2
### Heading 3
`);
// Find all headings
const headings = findNodes(result.ast!, (node) => node.type === 'heading');
console.log('Found', headings.length, 'headings');
Example 2: Extract All Links
const parser = new MarkdownParser();
const result = await parser.parse('Check [Google](https://google.com) and [GitHub](https://github.com)');
const links = findNodes(result.ast!, (node) => node.type === 'link');
links.forEach((link) => {
  // attributes is optional on ASTNode, so read it with optional chaining
  console.log('Link:', link.attributes?.href);
});
Example 3: Count Code Blocks
const parser = new MarkdownParser();
const result = await parser.parse(`
\`\`\`javascript
const x = 1;
\`\`\`
\`\`\`python
print("hello")
\`\`\`
`);
const codeBlocks = findNodes(result.ast!, (node) => node.type === 'codeBlock');
console.log('Found', codeBlocks.length, 'code blocks');
Example 4: Performance Monitoring
const parser = new MarkdownParser({
  cache: { enabled: true },
});
const markdown = '# Heading\n\n' + 'Paragraph text. '.repeat(1000);
// First parse
const result1 = await parser.parse(markdown);
console.log('First parse:', result1.metadata?.parseTime, 'ms');
// Cached parse
const result2 = await parser.parse(markdown);
console.log('Cached parse:', result2.metadata?.parseTime, 'ms');
// Guard against division by zero when the cached parse reports 0 ms
const cachedTime = Math.max(result2.metadata!.parseTime, 0.001);
console.log('Speedup:', result1.metadata!.parseTime / cachedTime, 'x');
Integration Status
Logger Integration
Status: Integrated
Justification: This package uses @bernierllc/logger for debug output during markdown parsing operations. Logging includes parse timing, cache hits/misses, and error details to help with debugging and performance monitoring.
Pattern: Direct integration - logger is a required dependency for this package.
NeverHub Integration
Status: Not applicable
Justification: This is a core utility package that performs markdown parsing. It does not participate in service discovery, event publishing, or service mesh operations. Markdown parsing is a stateless utility operation that doesn't require service registration or discovery.
Pattern: Core utility - no service mesh integration needed.
Docs-Suite Integration
Status: Ready
Format: TypeDoc-compatible JSDoc comments are included throughout the source code. All public APIs are documented with examples and type information.
Performance
- Large documents: Parses 1000 headings in <1 second
- Caching: 10-100x speedup for repeated parsing
- Memory efficient: Configurable cache size limits
- Non-blocking: Async API for integration with event loops
Related Packages
- @bernierllc/markdown-renderer - Renders AST to HTML
- @bernierllc/logger - Structured logging
- @bernierllc/crypto-utils - Hash generation for cache keys
License
Copyright (c) 2025 Bernier LLC
This file is licensed to the client under a limited-use license. The client may use and modify this code only within the scope of the project it was delivered for. Redistribution or use in other products or commercial offerings is not permitted without written consent from Bernier LLC.
