@atlaskit/editor-bitbucket-transformer

v9.6.4

Published

a month ago

Editor Bitbucket transformer

atlassianartifactteam

@atlaskit/editor-bitbucket-transformer

Editor Bitbucket transformer for converting between ADF, Markdown, and HTML formats.

Description

This package provides transformation utilities specifically designed for Bitbucket's editor integration. It handles the pipeline of converting between ADF (Atlassian Document Format), Markdown, and HTML, with a particular focus on caption escaping and on preserving attributes through each transformation step.

Key Features

Transformers

Serializer: Main serialization utilities for ADF to Markdown conversion
Table Serializer: Specialized table transformation handling
Utility Functions: Common transformation utilities

Caption Escaping Pipeline

HTML Attribute Escaping: Safely escapes captions for Markdown storage
Unified Escaping Strategy: Handles both HTML meta characters and Markdown punctuation
Round-trip Safety: Ensures data integrity through the ADF → Markdown → HTML → ADF pipeline

Examples

The package includes comprehensive examples in the examples/ directory:

Basic transformer example
Bitbucket HTML handling
Bitbucket Markdown processing
Helper utilities and styling

Team

Editor: Collaboration

Caption escaping and the Markdown/HTML/ADF pipeline

This package serializes ADF to Markdown for storage, and later reconstructs ADF from HTML that is rendered by the backend using python-markdown. Image captions are stored in Markdown using python-markdown’s attr_list syntax on the image:

![](http://path/to/image.jpg){: data-layout='center' data-caption='...'}
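As a rough illustration of the storage format above, the sketch below assembles such a line from its parts. The helper name `toBitbucketImageMarkdown` and the `ImageAttrs` shape are hypothetical and are not part of this package's API; the caption is assumed to have been escaped already.

```typescript
// Hypothetical helper: the name and signature are assumptions for illustration only.
interface ImageAttrs {
  src: string;
  layout: string;
  // Caption text that has ALREADY been escaped for use inside an attr_list value.
  escapedCaption: string;
}

function toBitbucketImageMarkdown({ src, layout, escapedCaption }: ImageAttrs): string {
  // The attr_list block `{: key='value' ... }` is placed directly after the image.
  return `![](${src}){: data-layout='${layout}' data-caption='${escapedCaption}'}`;
}

const line = toBitbucketImageMarkdown({
  src: "http://path/to/image.jpg",
  layout: "center",
  escapedCaption: "A plain caption",
});
```

The helper only concatenates; all of the safety comes from what is passed in as `escapedCaption`.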

Why escaping captions is tricky

python-markdown parses Markdown first, then applies attr_list to bind attributes to elements. If caption text inside data-caption contains Markdown markers (e.g. **, __, `, ~~), those markers can be interpreted during Markdown parsing.
When Markdown markers are interpreted within the attribute text, the attribute list can become malformed or be dropped entirely, so the attributes are never set on the element (e.g. data-caption is lost).

Unified attribute escaping

To ensure captions survive the Markdown → HTML transformation safely, we use a unified escaping strategy:

escapeHtmlAttribute(text): Escapes both HTML attribute meta characters (&, <, >, ", ') and Markdown/attr_list-sensitive punctuation by converting them into numeric entities:
- * → &#42;, _ → &#95;, ` → &#96;, ~ → &#126;
- | → &#124;, { → &#123;, } → &#125;
- [ → &#91;, ] → &#93;, ( → &#40;, ) → &#41;, ! → &#33;
This prevents python-markdown from interpreting the caption content as Markdown or as part of an attr_list, preserving the attribute safely.
unescapeHtmlAttribute(text): Decodes the same superset of entities when reading back from HTML into ADF. Decoding is order-sensitive: &amp; (the encoded &) is decoded last, so that a freshly decoded & cannot join the text that follows it and be unescaped a second time. After unescaping, we sanitize and parse a limited subset of Markdown formatting for display (**, _, ~~, and `) into safe HTML tags we control.
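The following is a minimal sketch of the escaping idea, not the package's actual implementation. The names `escapeSketch` and `unescapeSketch` are hypothetical, and the entity table is an assumption (named entities for most HTML meta characters, decimal numeric entities for the punctuation). It is meant to show why `&amp;` has to be decoded last.

```typescript
// Hypothetical entity table: an assumption for illustration, not the package's source.
const ENTITIES: Array<[string, string]> = [
  ["<", "&lt;"], [">", "&gt;"], ['"', "&quot;"], ["'", "&#39;"],
  ["*", "&#42;"], ["_", "&#95;"], ["`", "&#96;"], ["~", "&#126;"],
  ["|", "&#124;"], ["{", "&#123;"], ["}", "&#125;"],
  ["[", "&#91;"], ["]", "&#93;"], ["(", "&#40;"], [")", "&#41;"], ["!", "&#33;"],
  ["&", "&amp;"], // kept last so that decoding handles it last
];

const ENCODE = new Map<string, string>(ENTITIES);

function escapeSketch(text: string): string {
  // Single pass, so the "&" of an entity produced here is never re-encoded.
  return text.replace(/[&<>"'*_`~|{}\[\]()!]/g, (ch) => ENCODE.get(ch) ?? ch);
}

function unescapeSketch(text: string): string {
  // Sequential decoding in table order. Because "&amp;" comes last,
  // "&amp;lt;" decodes to the literal text "&lt;" and not to "<".
  return ENTITIES.reduce((acc, [ch, entity]) => acc.split(entity).join(ch), text);
}

const stored = escapeSketch("**bold** & {x}");
const restored = unescapeSketch(stored);
```

If `&amp;` were decoded first instead, a caption whose literal text is `&lt;` would be stored as `&amp;lt;` and come back as `<`, which is the double-unescape problem described above.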

End-to-end flow

ADF → Markdown (bbc-frontbucket - using this package):
- Captions are serialized into data-caption using escapeHtmlAttribute to encode HTML meta characters and Markdown punctuation.
Markdown → HTML (bbc-core):
- python-markdown takes the Markdown, renders it to HTML, and applies attr_list.
- Because punctuation was entity-encoded, the attributes remain intact even if the caption contains Markdown markers or {...}-like text.
HTML → ADF (frontend):
- We read data-caption from the HTML and call unescapeHtmlAttribute, which decodes both HTML and punctuation entities.
- We then sanitize and parse a safe subset of Markdown to generate ADF caption nodes (e.g., <strong>, <em>, <s>, <code>).

Migration

escapeAttrListValue has been removed. Use escapeHtmlAttribute and unescapeHtmlAttribute.

Security considerations

We always encode < and > in attributes and, when parsing back from HTML, we sanitize caption content before any Markdown formatting is converted to HTML.
Our Markdown parsing is intentionally minimal and maps directly to a small set of safe tags.
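To illustrate the "sanitize first, then format" ordering, here is a simplified sketch under stated assumptions. The function names `sanitizeSketch` and `formatCaptionSketch` are hypothetical, the regular expressions are deliberately naive (no nesting, and code spans are not protected from later rules), and none of this is taken from the package's source.

```typescript
// Hypothetical sketch: names and rules are illustrative, not the package's API.

// Step 1: neutralize any HTML in the decoded caption before adding our own tags.
function sanitizeSketch(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Step 2: map a small, fixed set of Markdown markers to tags we control.
function formatCaptionSketch(decodedCaption: string): string {
  return sanitizeSketch(decodedCaption)
    .replace(/`([^`]+)`/g, "<code>$1</code>")
    .replace(/\*\*([^*]+)\*\*/g, "<strong>$1</strong>")
    .replace(/~~([^~]+)~~/g, "<s>$1</s>")
    .replace(/_([^_]+)_/g, "<em>$1</em>");
}

const safeHtml = formatCaptionSketch("**bold** <script>");
```

Because sanitizing runs before any formatting, the only tags that can appear in the result are the ones the formatter itself inserts; markup typed into the caption stays inert text.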
