
marklit v0.2.1 (Published)

Modern markdown parser in TypeScript

Downloads: 56

Readme

Marklit: a modern markdown parser in TypeScript


WARNING: Ready for use, with exceptions (HTML parsing rules are missing)

This project originated as a deeply re-engineered fork of marked, with conceptual differences.

Design goals

  • Deep customizability
  • Compile-time configuration
  • Compact code size

Key features

  • The parsing result is an abstract document tree, which allows advanced intermediate processing and non-string rendering
  • Extensible architecture, which allows adding new parsing rules and document tree element types
  • Strictly typed design, which lets you use the full power of TypeScript to avoid runtime errors
  • Progressive parser core implementation, which gives the maximum possible parsing speed

HTML support

HTML is not supported at the moment, but it will be added in the future.

Usage tips

Basic setup

You need to do several things to get a type-safe markdown parser and renderer.

Define types

First, you need to define some types:

  • Meta-data type
  • Block token types map
  • Inline token types map
  • Context mapping type

See example below:

import {
  MetaData,
  InlineTokenMap,
  BlockTokenMap,
  ContextMap,
} from 'marklit';

interface InlineToken extends InlineTokenMap<InlineToken> { }

interface BlockToken extends BlockTokenMap<BlockToken, InlineToken> { }

interface Context extends ContextMap<BlockToken, InlineToken, MetaData> { }

Init parser

Next, you can initialize the parser:

import {
  BlockNormal,
  InlineNormal,
  init,
  parse
} from 'marklit';

// initialize parser using normal parsing rules
const parser = init<Context>(...BlockNormal, ...InlineNormal);

// parse markdown to get abstract document tree
const adt = parse(parser, "markdown source string");

...and the renderer:

import {
  BlockHtml,
  InlineHtml,
  initRenderHtml,
  render
} from 'marklit';

// initialize renderer using basic HTML render rules
const renderer = initRenderHtml<Context>(...BlockHtml, ...InlineHtml);

// render abstract document tree to get HTML string
const html = render(renderer, adt);

All together

The example below shows the complete configuration:

import {
  MetaData,
  InlineTokenMap,
  BlockTokenMap,
  ContextMap,
  
  BlockNormal,
  InlineNormal,
  init,
  parse,
  
  BlockHtml,
  InlineHtml,
  initRenderHtml,
  render
} from 'marklit';

interface InlineToken extends InlineTokenMap<InlineToken> { }

interface BlockToken extends BlockTokenMap<BlockToken, InlineToken> { }

interface Context extends ContextMap<BlockToken, InlineToken, MetaData> { }

// initialize parser using normal parsing rules
const parser = init<Context>(...BlockNormal, ...InlineNormal);

// initialize renderer using basic HTML render rules
const renderer = initRenderHtml<Context>(...BlockHtml, ...InlineHtml);

// parse markdown to get abstract document tree
const adt = parse(parser, "markdown source string");

// render abstract document tree to get HTML string
const html = render(renderer, adt);

GitHub-flavored markdown

The next example shows a configuration which uses GFM rules instead of the normal ones:

import {
  MetaData,
  InlineTokenMap,
  BlockTokenMap,
  ContextMap,
  
  BlockGfm,
  InlineGfm,
  init,
  parse,
  
  BlockTablesHtml,
  InlineGfmHtml,
  initRenderHtml,
  render
} from 'marklit';

interface InlineToken extends InlineTokenMap<InlineToken> { }

interface BlockToken extends BlockTokenMap<BlockToken, InlineToken> { }

interface Context extends ContextMap<BlockToken, InlineToken, MetaData> { }

// initialize parser using GFM parsing rules
const parser = init<Context>(...BlockGfm, ...InlineGfm);

// initialize renderer using GFM HTML render rules (with tables)
const renderer = initRenderHtml<Context>(...BlockTablesHtml, ...InlineGfmHtml);

// parse markdown to get abstract document tree
const adt = parse(parser, "markdown source string");

// render abstract document tree to get HTML string
const html = render(renderer, adt);

Using extensions

The programming design of marklit allows you to modify the behavior of any rule in order to extend the parser and renderer.

An extension includes rules and rule modifiers, which allow deep customization.

Writing rulesets

You can override existing rules in rulesets like BlockNormal, InlineNormal, and BlockGfm by appending modified rules, or you can create your own rulesets using existing or new rules.

The topics below show how to customize behavior using extensions:

GFM breaks

You can extend the normal text rule to GFM, or to GFM with breaks:

import {
  TextSpan,
  gfmText,
  gfmBreaks,
} from 'marklit';

const GfmTextSpan = gfmText(TextSpan);
const GfmBreaksTextSpan = gfmBreaks(gfmText(TextSpan));

Or simply use existing GFM text rules:

import {
  GfmTextSpan,
  GfmBreaksTextSpan
} from 'marklit';

SmartyPants

You can add smartypants support to any text rule like this:

import {
  BlockNormal,
  InlineNormal,
  
  BlockGfmTables,
  InlineGfm,

  TextSpan,
  GfmTextSpan,
  smartypantsText,
  
  init
} from 'marklit';

const SmartypantsTextSpan = smartypantsText(TextSpan);
const SmartypantsGfmTextSpan = smartypantsText(GfmTextSpan);

// The custom SmartypantsTextSpan rule overrides the default TextSpan rule from the InlineNormal ruleset
const parser = init<Context>(...BlockNormal, ...InlineNormal, ...SmartypantsTextSpan);

// The custom SmartypantsGfmTextSpan rule overrides the default GfmTextSpan rule from the InlineGfm ruleset
const gfmParser = init<Context>(...BlockGfmTables, ...InlineGfm, ...SmartypantsGfmTextSpan);

Math extension

The math extension includes two rules:

  1. Inline math enclosed in $ signs, like inline code (MathSpan)
  2. Block math enclosed in series of $ signs, like fenced code blocks (MathBlock)

You can use one of these rules or both together.

import {
  NoMeta,
  InlineTokenMap,
  BlockTokenMap,
  ContextMap,
  
  BlockMath,
  InlineMath,
  
  BlockNormal,
  InlineNormal,
  
  MathBlock,
  MathSpan,
  
  init,
  parse,
  
  BlockHtml,
  InlineHtml,
  MathBlockHtml,
  MathSpanHtml,
  
  initRenderHtml,
  render
} from 'marklit';

// inline token with math
interface InlineToken extends InlineTokenMap<InlineToken>, InlineMath { }

// block token with math
interface BlockToken extends BlockTokenMap<BlockToken, InlineToken>, BlockMath { }

// parsing context with math
interface Context extends ContextMap<BlockToken, InlineToken, NoMeta> { }

// append math rules to normal rules
const parser = init<Context>(...BlockNormal, MathBlock, ...InlineNormal, MathSpan);

// append math rules to normal rules
const renderer = initRenderHtml<Context>(...BlockHtml, MathBlockHtml, ...InlineHtml, MathSpanHtml);

const adt = parse(parser, `
Inline formula $E = mc^2$

Block equation:

$$$.dot
graph {
    a -- b;
    b -- c;
    a -- c;
}
$$$
`);

const html = render(renderer, adt);

Abbreviations

The abbrevs extension consists of three parts:

  1. Block rule (AbbrevBlock)
  2. Text rule modifier (abbrevText)
  3. Inline rule (Abbrev)

Usually you need the first two rules to get automatic abbreviations. The third rule adds extra forced abbreviations to the inline context.

import {
  NoMeta,
  InlineTokenMap,
  BlockTokenMap,
  ContextMap,
  
  InlineAbbrev,
  
  BlockNormal,
  InlineNormal,
  AbbrevBlock,
  Abbrev,
  TextSpan,
  abbrevText,
  
  init,
  parse,
  
  BlockHtml,
  InlineHtml,
  AbbrevHtml,
  
  initRenderHtml,
  render
} from 'marklit';

// inline token with abbrev
interface InlineToken extends InlineTokenMap<InlineToken>, InlineAbbrev { }

// normal block token
interface BlockToken extends BlockTokenMap<BlockToken, InlineToken> { }

// parsing context with abbrev
interface Context extends ContextMap<BlockToken, InlineToken, NoMeta> { }

// append abbrev rules to normal rules
const parser = init<Context>(...BlockNormal, AbbrevBlock, ...InlineNormal, Abbrev, abbrevText(TextSpan));

// append abbrev rules to normal rules
const renderer = initRenderHtml<Context>(...BlockHtml, ...InlineHtml, AbbrevHtml);

const adt = parse(parser, `The HTML specification
is maintained by the W3C.

*[HTML]: Hyper Text Markup Language
*[W3C]:  World Wide Web Consortium
`);

const html = render(renderer, adt);

Footnotes

The footnotes extension includes two rules:

  1. Inline footnote reference rule (Footnote)
  2. Block footnotes block rule (FootnotesBlock)

You need to use both rules to get working footnotes:

import {
  NoMeta,
  InlineTokenMap,
  BlockTokenMap,
  ContextMap,
  
  InlineFootnote,
  BlockFootnotes,
  
  BlockNormal,
  InlineNormal,
  FootnotesBlock,
  Footnote,
  
  init,
  parse,
  
  BlockHtml,
  InlineHtml,
  FootnoteHtml,
  FootnotesBlockHtml,
  
  initRenderHtml,
  render
} from 'marklit';

// inline token with footnote refs
interface InlineToken extends InlineTokenMap<InlineToken>, InlineFootnote { }

// block token with footnotes list
interface BlockToken extends BlockTokenMap<BlockToken, InlineToken>, BlockFootnotes { }

// parsing context with footnotes
interface Context extends ContextMap<BlockToken, InlineToken, NoMeta> { }

// append footnote rules to normal rules
const parser = init<Context>(...BlockNormal, FootnotesBlock, ...InlineNormal, Footnote);

// append footnote rules to normal rules
const renderer = initRenderHtml<Context>(...BlockHtml, FootnotesBlockHtml, ...InlineHtml, FootnoteHtml);

const adt = parse(parser, `Footnotes[^1] have a label[^@#$%] and the footnote's content.

[^1]: This is a footnote content.
[^@#$%]: A footnote on the label: "@#$%".`);

const html = render(renderer, adt);

Inline footnotes

TODO:

Table of contents

TODO:

Basic ideas

Abstract document tree

Traditionally, markdown parsers generate HTML as the result. This is simple, but not so useful in more advanced use cases. For example, when you need intermediate processing or direct rendering to a DOM tree, an ADT is much more convenient.

The marklit ADT is a JSON tree of block and inline elements called tokens. Each token is a simple object with a $ field as the tag, an optional _ field with a list of sub-tokens, and optionally several other token-type-specific fields called properties.

TODO: Document tree examples.
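As an illustration, a plausible tree for a small document might look like this (the tag names below are hypothetical placeholders for this sketch, not marklit's actual token tags):

```typescript
// Hypothetical ADT for the markdown "Hello **world**".
// The structure follows the description above ($ = tag, _ = sub-tokens);
// the tag values themselves are made up for illustration.
const adt = {
  $: 'Document',
  _: [
    {
      $: 'Paragraph',
      _: [
        // plain text runs are stored as strings in the sub-token list
        { $: 'Text', _: ['Hello '] },
        // nested inline tokens carry their own sub-token lists
        { $: 'Strong', _: [{ $: 'Text', _: ['world'] }] },
      ],
    },
  ],
};
```

Note how every node uses the same two-field convention, which is what makes uniform tree walking and non-string rendering straightforward.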

Extensibility

The architecture of marked does not allow you to add new rules. You may only modify the regexps of existing rules and write your own string-based renderer.

The marked-ast project partially solves the renderer problem, but still doesn't allow adding new rules.

The simple-markdown parser from Khan Academy has good extensibility, but it is not as fast as marked.

Because parsing speed is one of the important goals of this project, it required a solution which gives extensibility without sacrificing speed.

Type-safety

As conceived, the abstract document tree must itself be strictly typed. But because TypeScript doesn't yet support circular type referencing, token type inference cannot be implemented now. So you need a little bit of handwork with the types here.

Speedup parsing

The marked parser iterates over the matching regexps for each rule until the first match occurs. This is not as fast as it could be, because the JS engine performs multiple matches with multiple regexps.

Instead, marklit constructs a single regexp from all the rules, so matching for all rules happens at once. This technique moves workload from the JS side to the built-in RegExp engine.
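The technique can be sketched in plain TypeScript. The rule set and helper below are illustrative only, not marklit's actual internals:

```typescript
// Illustrative sketch of the single-regexp matching technique.
// Each "rule" is a name plus a pattern containing exactly one capture group;
// both the rules and the tokenize helper are hypothetical.
type Rule = { name: string; pattern: string };

const rules: Rule[] = [
  { name: 'code', pattern: '`([^`]+)`' },
  { name: 'strong', pattern: '\\*\\*([^*]+)\\*\\*' },
  { name: 'em', pattern: '\\*([^*]+)\\*' },
];

// Wrap each rule's pattern in its own capture group and join with '|',
// so a single exec() call tests every rule at the current position.
const combined = new RegExp(rules.map(r => `(${r.pattern})`).join('|'), 'g');

function tokenize(src: string): { rule: string; text: string }[] {
  const tokens: { rule: string; text: string }[] = [];
  let m: RegExpExecArray | null;
  while ((m = combined.exec(src)) !== null) {
    // Each rule contributes exactly 2 groups: the outer wrapper group
    // and the one group inside its pattern. The defined wrapper group
    // identifies which rule won.
    for (let i = 0; i < rules.length; i++) {
      if (m[i * 2 + 1] !== undefined) {
        tokens.push({ rule: rules[i].name, text: m[i * 2 + 2] });
        break;
      }
    }
  }
  return tokens;
}
```

The win is that alternation between rules is resolved inside the RegExp engine's single scan, instead of in a JS loop that re-executes one regexp per rule at every position.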

Benchmarking

Because the operation flow of marklit includes the ADT stage, it differs too much from other md-to-html parsers, so benchmarking would not give comparable results.