# tree-sitter-markdown-text

v0.2.1

Markdown grammar for tree-sitter, with a textlint-style AST shape

Markdown grammar for tree-sitter, shaped so that its AST lines up with the textlint TxtNode model.

Parses `.md` (and `.markdown`, `.mdown`, `.mkd`, `.mkdn`) files into a concrete syntax tree covering the full CommonMark block structure plus common extensions (GFM pipe tables, task lists, GFM alerts, YAML/TOML front matter, Pandoc math and directive blocks, footnotes, MDX JSX). Inline content is surfaced as structured children of the `inline` wrapper: classified tokens (`word_token`, `numeric_token`, `identifier_like_token`, `path_like_token`) and punctuation-class nodes (`terminator`, `separator`, `bracket`, `operator_like`), plus inline structural nodes (`emphasis`, `strong`, `strikethrough`, `link`, `image`, `autolink`, `inline_code`, `html_inline`, `math_inline`, `mdx_jsx_inline`, `footnote_reference`).

## Features

### Block nodes

  • Document structure: `document`, nested `section` wrappers around ATX headings, `paragraph`, and `blank_line` (as a first-class node).
  • Headings: ATX (`#` to `######`) and setext (`===`/`---`), with the heading level exposed as a `level` field on both `atx_heading` and `setext_heading`.
  • Code blocks: indented code blocks and fenced code blocks (backtick and tilde), with `info_string`/`language` children for the GFM language tag.
  • Math blocks: Pandoc/GitLab/KaTeX display math (`$$…$$`) as a dedicated `math_block` with `math_block_delimiter`/`math_block_content` children.
  • Lists: unordered (`+`/`-`/`*`) and ordered (`1.`/`1)`) list markers. GFM task list items are promoted to `task_list_item` (distinct from `list_item`), with `task_list_marker_checked`/`task_list_marker_unchecked` markers.
  • Block quotes and callouts: nested quotes and lazy continuations. A block quote whose first paragraph begins with `[!NOTE]` / `[!TIP]` / `[!IMPORTANT]` / `[!WARNING]` / `[!CAUTION]` (or any uppercase-only label) is surfaced as `callout` with a `callout_type` field.
  • Thematic breaks: `---`, `***`, `___`.
  • HTML blocks: all 7 CommonMark HTML block types; block-level HTML comments are aliased to `html_comment_block` for easy metric extraction.
  • MDX JSX blocks: shallow `mdx_jsx_block` for lines that start with an MDX-style JSX element (`<Component ...>`, `<Component/>`, `</Component>`). Component-style mixed-case names disambiguate from all-caps HTML blocks such as `<DIV>`.
  • Pipe tables: `pipe_table` with `pipe_table_header`, `pipe_table_delimiter_row`, `pipe_table_row`, `pipe_table_cell`, `pipe_table_align_left`/`pipe_table_align_right`.
  • Link reference definitions: `link_reference_definition` with `link_label`/`link_destination`/`link_title` children.
  • Footnote definitions: `footnote_definition` (`[^id]: …`) with a `footnote_label` child.
  • Directive blocks: generic container directives (`:::name … :::`, per remark-directive / MyST / Pandoc fenced divs) as `directive_block` with `directive_block_delimiter`/`directive_name`/`directive_block_content` children.
  • Image blocks: a paragraph consisting of a single block-level image (`![alt](dest)` on its own line) is surfaced as `image_block` with `link_label`/`link_destination` children.
  • Front matter: YAML (`---` fenced) as `minus_metadata`, TOML (`+++` fenced) as `plus_metadata`.
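The callout rule in the list above can be approximated in a few lines. This is only a sketch of the described behaviour, not the grammar's scanner: the helper name `callout_type` and the reading of "uppercase-only label" as one or more ASCII capitals are assumptions made for illustration.

```python
import re
from typing import Optional

# Assumption: an "uppercase-only label" is one or more ASCII capital letters.
_CALLOUT = re.compile(r"\[!([A-Z]+)\]")


def callout_type(block_quote: str) -> Optional[str]:
    """Return the callout label of a block quote, or None for a plain quote."""
    for line in block_quote.splitlines():
        # Drop one block quote marker (up to 3 spaces, ">", optional space).
        content = re.sub(r"^ {0,3}> ?", "", line)
        if not content.strip():
            continue  # skip empty quoted lines before the first paragraph
        match = _CALLOUT.match(content.lstrip())
        return match.group(1) if match else None
    return None


print(callout_type("> [!WARNING]\n> Check the fence length."))
print(callout_type("> An ordinary quote."))
```

A quote whose first paragraph does not start with a bracketed uppercase label stays a plain block quote in this sketch, which mirrors the distinction between `callout` and an ordinary quote described above.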

### Inline nodes (children of the `inline` wrapper)

  • Classified text tokens: `text_span` wraps runs of classified tokens, namely `word_token` (Unicode alphabetic), `numeric_token` (integers, decimals, versions), `identifier_like_token` (camelCase / PascalCase / snake_case), and `path_like_token` (paths with `/` separators or dotted identifiers).

  • Punctuation classes: every punctuation lexeme is classified as `terminator` (`.`, `?`, `!`), `separator` (`,`, `;`, `:`), `bracket` (`(`, `)`, `[`, `]`, `{`, `}`, `<`, `>`), or `operator_like` (`::`, `->`, `=>`, `=`, `+`, `-`, `*`, `/`, `|`, `&`, and other punctuation).

  • Emphasis / strong / strikethrough: `emphasis` (`*…*` or `_…_`), `strong` (`**…**` or `__…__`), `strikethrough` (`~~…~~`), each with a `_delimiter`/`_content`/`_delimiter` sub-tree.

  • Code spans: `inline_code` with matched backtick-run delimiters (1 or 2 backticks).

  • Links and images: `link` (inline, full-reference, collapsed-reference, shortcut-reference forms) and `image` (`![alt](dest)` or `![alt][ref]`). Both expose `link_label`/`link_destination`/`link_title` children.

  • Autolinks: `autolink` with `uri` or `email` children for angle-bracketed URIs (`<https://…>`) and angle-bracketed email addresses.

  • Raw HTML inline: `html_inline` with `html_open_tag`/`html_close_tag`/`html_comment`/`html_cdata`/`html_declaration`/`html_processing_instruction` children.

  • MDX JSX inline: shallow `mdx_jsx_inline` with `mdx_jsx_open_tag`/`mdx_jsx_close_tag`/`mdx_jsx_expression` children.

  • Inline math: `math_inline` (`$…$`) with `math_inline_delimiter`/`math_inline_content` children. Disambiguated from `math_block` (`$$…$$`).

  • Footnote references: `footnote_reference` (`[^id]` inside prose) with a `footnote_reference_label` child.

  • Injections query: ships a `queries/injections.scm` that sets up language injection for fenced code blocks (keyed on the info string), HTML blocks, and front matter.
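To make the four token classes listed above concrete, here is a rough classifier written with regular expressions. The patterns and their ordering are hypothetical, chosen to match the one-line descriptions in the list; the grammar's real lexer rules may draw the boundaries differently.

```python
import re
from typing import Optional

# Hypothetical patterns, tried in order; the first full match wins.
_RULES = [
    # integers, decimals, dotted versions: 42, 3.14, 0.2.1
    ("numeric_token", re.compile(r"\d+(?:\.\d+)*")),
    # anything containing "/", or identifiers joined by dots
    ("path_like_token", re.compile(
        r"[\w.-]*/[\w./-]*|[A-Za-z_]\w*(?:\.[A-Za-z_]\w*)+")),
    # camelCase, PascalCase (two or more capitalised parts), snake_case
    ("identifier_like_token", re.compile(
        r"[a-z]+(?:[A-Z][a-z0-9]*)+"
        r"|(?:[A-Z][a-z0-9]+){2,}"
        r"|[A-Za-z0-9]+(?:_[A-Za-z0-9]+)+")),
    # runs of Unicode letters only (no digits, no underscore)
    ("word_token", re.compile(r"[^\W\d_]+")),
]


def classify(token: str) -> Optional[str]:
    """Return the assumed token class for one whitespace-delimited token."""
    for name, pattern in _RULES:
        if pattern.fullmatch(token):
            return name
    return None  # punctuation and anything else is out of scope here


for tok in ["hello", "0.2.1", "camelCase", "docs/textlint-mapping.md"]:
    print(tok, classify(tok))
```

Ordering matters in this sketch: `path_like_token` is tested before `identifier_like_token` so that a dotted name such as `tree_sitter.Parser` is treated as path-like rather than as snake_case.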

## Example

````markdown
# Heading

A paragraph with inline content.

- one
- two

```go
func main() {}
```
````

Parsed tree (abbreviated):

```
(document
  (section
    (atx_heading
      level: (atx_h1_marker)
      heading_content: (inline))
    (blank_line)
    (paragraph (inline))
    (blank_line)
    (list
      (list_item (list_marker_minus) (paragraph (inline)))
      (list_item (list_marker_minus) (paragraph (inline))))
    (blank_line)
    (fenced_code_block
      (fenced_code_block_delimiter)
      (info_string (language))
      (code_fence_content)
      (fenced_code_block_delimiter))))
```
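An S-expression dump like the parsed tree above is handy for quick checks without any bindings. The sketch below pulls node type names out of such a string with a regular expression; the helper name `node_types` is made up for this example, and the input is a shortened copy of the tree shown above.

```python
import re
from collections import Counter


def node_types(sexp: str) -> list:
    """Return node type names in document order from an S-expression string."""
    # A node prints as "(" followed directly by its type name. Field names
    # such as "level:" appear before the parenthesis, so they are skipped.
    return re.findall(r"\(([A-Za-z_][A-Za-z0-9_]*)", sexp)


SEXP = (
    "(document (section (atx_heading level: (atx_h1_marker) "
    "heading_content: (inline)) (blank_line) (paragraph (inline))))"
)

counts = Counter(node_types(SEXP))
print(node_types(SEXP)[:3])
print(counts["inline"])
```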


## Relationship to textlint

The grammar is structurally close to the textlint AST. Every block-level `TxtNode` type has a direct counterpart here; inline `TxtNode` types (`Str`, `Emphasis`, `Strong`, `Link`, `Image`, `Code`, `Html`, `Delete`, `FootnoteReference`) also have direct counterparts as children of the `inline` wrapper. Names stay snake_case per the tree-sitter convention; consumers map names themselves. See [docs/textlint-mapping.md](docs/textlint-mapping.md) for the full table.
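Since consumers map names themselves, a lookup table is usually enough. The pairings below are assumptions made for illustration, lining up the inline node names from this README with the `TxtNode` types named in the paragraph above; the authoritative table is the one in docs/textlint-mapping.md.

```python
from typing import Optional

# Assumed pairings, for illustration only. Check docs/textlint-mapping.md
# before relying on any of them.
INLINE_TO_TXTNODE = {
    "text_span": "Str",
    "emphasis": "Emphasis",
    "strong": "Strong",
    "link": "Link",
    "image": "Image",
    "inline_code": "Code",
    "html_inline": "Html",
    "strikethrough": "Delete",
    "footnote_reference": "FootnoteReference",
}


def to_txtnode_type(node_type: str) -> Optional[str]:
    """Map a snake_case node name to a textlint type, or None if unmapped."""
    return INLINE_TO_TXTNODE.get(node_type)


print(to_txtnode_type("strikethrough"))
print(to_txtnode_type("math_inline"))
```

Returning `None` for unmapped names keeps the sketch honest about extension nodes such as `math_inline`, which the paragraph above does not list among the inline `TxtNode` counterparts.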

## Installation

### npm

```sh
npm install tree-sitter-markdown-text
```

### Cargo

```sh
cargo add tree-sitter-markdown-text
```

### PyPI

```sh
pip install tree-sitter-markdown-text
```

### Go

```go
import tree_sitter_markdown_text "github.com/ophidiarium/tree-sitter-markdown-text/bindings/go"
```

The root package also exports the bundled queries via `go:embed`:

```go
import markdown "github.com/ophidiarium/tree-sitter-markdown-text"

lang := markdown.GetLanguage()
query, _ := markdown.GetHighlightsQuery()
```

## Usage

### Node.js

```js
import Parser from "tree-sitter";
import Markdown from "tree-sitter-markdown-text";

const parser = new Parser();
parser.setLanguage(Markdown);

const tree = parser.parse("# hello\n");
console.log(tree.rootNode.toString());
```

### Rust

```rust
let mut parser = tree_sitter::Parser::new();
let language = tree_sitter_markdown_text::LANGUAGE;
parser.set_language(&language.into()).unwrap();

let tree = parser.parse("# hello\n", None).unwrap();
println!("{}", tree.root_node().to_sexp());
```

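### Python

```python
from tree_sitter import Language, Parser
import tree_sitter_markdown_text

parser = Parser(Language(tree_sitter_markdown_text.language()))
tree = parser.parse(b"# hello\n")
# str() yields the S-expression; Node.sexp() is deprecated in recent
# py-tree-sitter releases.
print(str(tree.root_node))
```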

## License

MIT