@snapp-notes/markdown-parser
v0.1.2
Simple Markdown Parser that returns an Abstract Syntax Tree (AST) with location information.
Installation
npm install @snapp-notes/markdown-parser
Features
- 📝 Parse markdown into a structured AST
- 📍 Location tracking for every node
- 🎯 Support for common markdown elements:
  - Headers (H1-H6)
  - Code blocks with language specification
  - Bold text (** and __)
  - Italic text (* and _)
  - Inline links
  - List items
  - Plain text
- 🚀 Built with PEG.js/Peggy for reliable parsing
- 📦 ES Module support
- 💪 TypeScript definitions included
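As an illustration of the location tracking listed above, the sketch below uses a node's loc to recover the exact source text the node came from. It is a minimal sketch, not part of the package: the type names SourcePosition and SourceRange and the helper sliceByLoc are hypothetical, the sample loc is written by hand, and the end offset is assumed to be exclusive (the convention Peggy uses for its own location data).

```typescript
// Shapes mirror the Location and Position interfaces documented in this README.
// They are renamed here (hypothetical names) to avoid clashing with DOM globals.
interface SourcePosition {
  offset: number; // character offset from the start of the input
  line: number;   // 1-based line number
  column: number; // 1-based column number
}

interface SourceRange {
  start: SourcePosition;
  end: SourcePosition;
}

// Return the part of the original markdown that a node's loc covers.
// Assumption: end.offset is exclusive.
function sliceByLoc(markdownSource: string, loc: SourceRange): string {
  return markdownSource.slice(loc.start.offset, loc.end.offset);
}

// Hand-written loc for the header in the Basic Example input (an assumption,
// not captured parser output).
const sampleSource = '# Hello World\nThis is **bold** text.';
const headerLoc: SourceRange = {
  start: { offset: 0, line: 1, column: 1 },
  end: { offset: 13, line: 1, column: 14 },
};

console.log(sliceByLoc(sampleSource, headerLoc)); // prints: # Hello World
```

With the real package, the loc would come from a node returned by parse() rather than being typed in by hand.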
Usage
Basic Example
import { parse } from '@snapp-notes/markdown-parser';
const markdown = '# Hello World\nThis is **bold** text.';
const ast = parse(markdown);
console.log(ast);
Output:
[
  {
    type: 'header',
    content: '# Hello World',
    level: 1,
    loc: { start: { offset: 0, line: 1, column: 1 }, end: { ... } }
  },
  {
    type: 'text',
    content: '\n',
    loc: { ... }
  },
  {
    type: 'text',
    content: 'This is '
  },
  {
    type: 'bold',
    content: '**bold**',
    loc: { ... }
  },
  {
    type: 'text',
    content: ' text.'
  }
]
Parsing Headers
import { parse } from '@snapp-notes/markdown-parser';
const ast = parse('# H1\n## H2\n### H3');
// Each header node contains:
// - type: 'header'
// - content: full header text including # symbols
// - level: number (1-6)
// - loc: location information
Parsing Code Blocks
import { parse } from '@snapp-notes/markdown-parser';
const markdown = `\`\`\`javascript
const greeting = "Hello";
console.log(greeting);
\`\`\``;
const ast = parse(markdown);
// Code node contains:
// - type: 'code'
// - content: code content (includes leading newline)
// - language: 'javascript' (or empty string if not specified)
// - loc: location information
Parsing Inline Formatting
import { parse } from '@snapp-notes/markdown-parser';
// Bold text
parse('**bold text**'); // or '__bold text__'
// Italic text
parse('*italic text*'); // or '_italic text_'
// Mixed formatting
const ast = parse('This is **bold** and *italic* text');
Parsing Links
import { parse } from '@snapp-notes/markdown-parser';
const ast = parse('[Google](https://google.com)');
// Link node contains:
// - type: 'link'
// - text: 'Google'
// - url: 'https://google.com'
// - content: '[Google](https://google.com)'
// - loc: location information
Parsing Lists
import { parse } from '@snapp-notes/markdown-parser';
const markdown = `* Item 1
* Item 2
* Item 3`;
const ast = parse(markdown);
// List nodes contain:
// - type: 'list'
// - content: '* Item text'
// - loc: location information
Complex Document
import { parse } from '@snapp-notes/markdown-parser';
const markdown = `# My Document
This is a paragraph with **bold** and *italic* text.
Visit [my website](https://example.com) for more info.
\`\`\`python
def hello():
    print("Hello, World!")
\`\`\`
* Feature 1
* Feature 2
`;
const ast = parse(markdown);
// The AST will contain a mix of different node types
ast.forEach(node => {
  console.log(`${node.type}: ${node.content?.substring(0, 30)}...`);
});
API
parse(input: string, options?: { startRule?: string }): MarkdownNode[]
Parses a markdown string and returns an array of AST nodes.
Parameters:
- input (string): The markdown text to parse
- options (optional): Parser options
  - startRule (optional): The grammar rule to start parsing from (default: 'start')
Returns: An array of MarkdownNode objects
Throws: SyntaxError if the input cannot be parsed
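Because parse throws on invalid input, callers that handle untrusted markdown may prefer a wrapper that turns the exception into a value. The following is a minimal sketch under stated assumptions: tryParse, ParseFn, ParseOutcome, and fakeParse are hypothetical names, and fakeParse is only a stand-in so the snippet runs without the package installed.

```typescript
// Minimal stand-in for the node type; only the fields used in this sketch.
interface ParsedNode {
  type: string;
  content?: string;
}

// Any function with the same basic shape as parse(input).
type ParseFn = (input: string) => ParsedNode[];

interface ParseOutcome {
  nodes: ParsedNode[];  // empty when parsing failed
  error: string | null; // error message, or null on success
}

// Call the parser and convert a thrown error into a returned value.
function tryParse(parseFn: ParseFn, input: string): ParseOutcome {
  try {
    return { nodes: parseFn(input), error: null };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { nodes: [], error: message };
  }
}

// Hypothetical stand-in parser: rejects NUL characters, otherwise returns
// the whole input as a single text node. It is NOT the real grammar.
function fakeParse(input: string): ParsedNode[] {
  if (input.indexOf('\u0000') !== -1) {
    throw new SyntaxError('unexpected NUL character');
  }
  return [{ type: 'text', content: input }];
}

console.log(tryParse(fakeParse, '# Title').error);          // prints: null
console.log(tryParse(fakeParse, 'bad\u0000input').error);   // prints: unexpected NUL character
```

With the real package, the imported parse function would be passed in place of fakeParse.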
Node Types
TextNode
interface TextNode {
  type: 'text' | 'bold' | 'italic' | 'list';
  content: string;
  loc: Location;
}
Used for plain text, bold text, italic text, and list items.
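Since content keeps the original markup (for example '**bold**' or '* Item 1' in the examples above), consumers often need the inner text without the markers. A minimal sketch, assuming content always has exactly the delimiter shapes shown in this README; InlineNode and stripMarkers are hypothetical names, and loc is omitted for brevity:

```typescript
// Reduced version of TextNode (loc omitted for brevity).
interface InlineNode {
  type: 'text' | 'bold' | 'italic' | 'list';
  content: string;
}

// Remove the markdown markers from a text-like node's content.
// Assumes two-character bold delimiters, one-character italic delimiters,
// and a leading '* ' on list items, as in the examples in this README.
function stripMarkers(node: InlineNode): string {
  switch (node.type) {
    case 'bold':
      return node.content.slice(2, -2); // '**bold**' or '__bold__'
    case 'italic':
      return node.content.slice(1, -1); // '*italic*' or '_italic_'
    case 'list':
      return node.content.replace(/^\*\s+/, ''); // '* Item 1'
    default:
      return node.content; // plain text is returned unchanged
  }
}

console.log(stripMarkers({ type: 'bold', content: '**bold**' })); // prints: bold
console.log(stripMarkers({ type: 'list', content: '* Item 1' })); // prints: Item 1
```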
HeaderNode
interface HeaderNode {
  type: 'header';
  content: string;
  level: number; // 1-6
  loc: Location;
}
CodeNode
interface CodeNode {
  type: 'code';
  content: string;
  language?: string;
  loc: Location;
}
Note: The content includes a leading newline character.
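Because of that leading newline, code content usually needs a small cleanup step before it is displayed or passed to a highlighter. A minimal sketch: FencedCodeNode, codeBody, and languageOrDefault are hypothetical names, loc is omitted for brevity, and the sample node is written by hand (whether the real content also ends with a newline is an assumption here).

```typescript
// Reduced version of CodeNode (loc omitted for brevity).
interface FencedCodeNode {
  type: 'code';
  content: string;
  language?: string;
}

// Drop exactly one leading line break (LF or CRLF); keep all other whitespace.
function codeBody(node: FencedCodeNode): string {
  return node.content.replace(/^\r?\n/, '');
}

// Treat a missing or empty language as the given fallback label.
function languageOrDefault(node: FencedCodeNode, fallback: string): string {
  return node.language ? node.language : fallback;
}

// Hand-written node shaped like the Parsing Code Blocks example.
const sampleCodeNode: FencedCodeNode = {
  type: 'code',
  content: '\nconst greeting = "Hello";\nconsole.log(greeting);\n',
  language: 'javascript',
};

console.log(languageOrDefault(sampleCodeNode, 'plaintext')); // prints: javascript
console.log(codeBody(sampleCodeNode));
```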
LinkNode
interface LinkNode {
  type: 'link';
  text: string;
  url: string;
  content: string;
  loc: Location;
}
Location
interface Location {
  start: Position;
  end: Position;
}
interface Position {
  offset: number; // Character offset from start
  line: number;   // Line number (1-based)
  column: number; // Column number (1-based)
}
Supported Markdown Syntax
| Element | Syntax | Example |
|---------|--------|---------|
| Header | # to ###### | # Title |
| Bold | **text** or __text__ | **bold** |
| Italic | *text* or _text_ | *italic* |
| Link | [text](url) | [Google](https://google.com) |
| Code Block | ```lang\ncode\n``` | ```js\ncode\n``` |
| List Item | * item | * Item 1 |
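Each row of the table corresponds to one node type, so a consumer can handle the whole AST with a single switch on type. The sketch below renders nodes to HTML as one possible use; it is not part of the package. SketchNode, escapeHtml, renderNode, and renderNodes are hypothetical names, loc is omitted, and the sample nodes are copied by hand from the Basic Example output.

```typescript
// Reduced node shapes based on the interfaces in this README (loc omitted).
type SketchNode =
  | { type: 'text' | 'bold' | 'italic' | 'list'; content: string }
  | { type: 'header'; content: string; level: number }
  | { type: 'code'; content: string; language?: string }
  | { type: 'link'; content: string; text: string; url: string };

// Escape characters that are special in HTML text and attribute values.
function escapeHtml(raw: string): string {
  return raw
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Render one node. Marker stripping assumes the content shapes shown above.
function renderNode(node: SketchNode): string {
  switch (node.type) {
    case 'header': {
      const title = node.content.replace(/^#{1,6}\s*/, '');
      return `<h${node.level}>${escapeHtml(title)}</h${node.level}>`;
    }
    case 'bold':
      return `<strong>${escapeHtml(node.content.slice(2, -2))}</strong>`;
    case 'italic':
      return `<em>${escapeHtml(node.content.slice(1, -1))}</em>`;
    case 'link':
      return `<a href="${escapeHtml(node.url)}">${escapeHtml(node.text)}</a>`;
    case 'code':
      return `<pre><code>${escapeHtml(node.content.replace(/^\r?\n/, ''))}</code></pre>`;
    case 'list':
      return `<li>${escapeHtml(node.content.replace(/^\*\s+/, ''))}</li>`;
    default:
      return escapeHtml(node.content); // plain text
  }
}

function renderNodes(nodes: SketchNode[]): string {
  return nodes.map(renderNode).join('');
}

// Nodes copied by hand from the Basic Example output.
const basicExampleNodes: SketchNode[] = [
  { type: 'header', content: '# Hello World', level: 1 },
  { type: 'text', content: '\n' },
  { type: 'text', content: 'This is ' },
  { type: 'bold', content: '**bold**' },
  { type: 'text', content: ' text.' },
];

console.log(renderNodes(basicExampleNodes));
```

The sketch emits each list item as a bare li element; grouping consecutive list nodes into a ul is left out to keep it short.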
Limitations
- Nested formatting (e.g., bold within italic) is not fully supported
- Only unordered lists with * are supported
- No support for:
  - Blockquotes
  - Tables
  - Images
  - Horizontal rules
  - Strikethrough
  - Task lists
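One possible workaround for the list limitation is to rewrite other bullet markers to * before parsing. This is a rough sketch, not a feature of the package: normalizeBullets is a hypothetical helper, and it works line by line without understanding markdown, so it would also rewrite matching lines inside fenced code blocks and spaced rules such as "- - -".

```typescript
// Rewrite a leading '-' or '+' bullet marker to '*'. A line such as '---' is
// left alone because the marker must be followed by a space or tab.
function normalizeBullets(markdownText: string): string {
  return markdownText.replace(/^([ \t]*)[-+]([ \t]+)/gm, '$1*$2');
}

const mixedBullets = '- Item 1\n+ Item 2\n* Item 3\n---';
console.log(normalizeBullets(mixedBullets));
// prints:
// * Item 1
// * Item 2
// * Item 3
// ---
```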
Development
Build
Generate the parser from the grammar file:
npm run build
Testing
Run the test suite:
npm test
Watch mode for development:
npm run test:watch
Grammar
The parser is built using Peggy (formerly PEG.js). The grammar file is located at src/grammar.peggy.
To modify the parser, edit the grammar file and rebuild:
npm run build
Contributing
Contributions are welcome! Please ensure all tests pass before submitting a pull request.
npm run build
npm test
License
Copyright (c) 2025 Jakub T. Jankiewicz
Released under the MIT license.
