remark-harden-urls
v0.0.1
🛡️ remark plugin to sanitize and harden all URLs in Markdown — supports pattern-based allow/block lists and safe defaults.
remark-harden-urls 🔗🔒
🔒 A remark plugin to enforce strict security policies on URLs within Markdown links, images, and definitions using the powerful harden-urls library.
It removes, normalizes, or blocks unsafe protocols, malformed URLs, and known tracking parameters—without altering valid Markdown content or deleting entire nodes silently.
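To illustrate why protocol checks are best done on a parsed URL rather than on the raw string, here is a minimal sketch using the standard WHATWG `URL` class. The names `getProtocol`, `isProtocolAllowed`, and the placeholder base URL are assumptions made for this example; they are not part of remark-harden-urls or harden-urls.

```javascript
// Illustrative sketch only; not the harden-urls implementation.
// Assumed base so that relative URLs and #anchors can be parsed.
const BASE = "https://placeholder.invalid/";

// Returns the normalized scheme (e.g. "https:") or null for malformed URLs.
function getProtocol(rawUrl) {
  try {
    // The URL parser trims surrounding whitespace and lowercases the scheme,
    // so " JaVaScRiPt:..." is still reported as "javascript:".
    return new URL(rawUrl, BASE).protocol;
  } catch {
    return null;
  }
}

function isProtocolAllowed(rawUrl, allowedProtocols) {
  const protocol = getProtocol(rawUrl);
  return protocol !== null && allowedProtocols.has(protocol);
}

const allowed = new Set(["https:", "mailto:"]);
console.log(isProtocolAllowed(" JaVaScRiPt:alert(1)", allowed));
console.log(isProtocolAllowed("mailto:a@example.com", allowed));
```

A plain `startsWith("javascript:")` test would miss the obfuscated first input, which is why the sketch compares the parsed scheme instead.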
✨ Features & Philosophy
- Comprehensive URL Hardening: Protects all primary Markdown URL sources: inline links (`[text](url)`), inline images (`![alt](url)`), and reference/footnote definitions (`[id]: url`).
- Superior URL Cleaning: Goes beyond basic protocol checks by stripping tracking parameters (`utm_`, `fbclid`), applying Unicode normalization (NFKC), and cleaning zero-width/control characters.
- Granular Control: Configure different policies for `link`, `image`, and `definition` elements with easy-to-use presets and shorthands.
- Safer Defaults: By default, it uses a placeholder (removes `url` and adds `[blocked]` text/alt) instead of silent removal, ensuring no unexpected loss of content.
- Ecosystem Compatible: Designed to work seamlessly within the UnifiedJS ecosystem alongside `remark-parse`, `remark-gfm`, and `remark-rehype`.
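The cleaning steps listed above (NFKC normalization, removal of zero-width and control characters, stripping of tracking parameters) can be sketched with built-in JavaScript APIs. This is a rough approximation under stated assumptions: `cleanUrl` and `TRACKING_PREFIXES` are hypothetical names, and the real harden-urls logic may differ in order and detail.

```javascript
// Illustrative sketch only; not the harden-urls implementation.
// Assumed prefix list, taken from the examples named in the feature list.
const TRACKING_PREFIXES = ["utm_", "fbclid"];

function cleanUrl(rawUrl) {
  const normalized = rawUrl
    .normalize("NFKC") // fold compatibility characters
    // Drop ASCII control characters and common zero-width characters.
    .replace(/[\u0000-\u001F\u007F\u200B-\u200D\u2060\uFEFF]/g, "");

  const url = new URL(normalized); // throws on malformed absolute URLs

  // Copy the keys first, because deleting while iterating is unsafe.
  for (const key of [...url.searchParams.keys()]) {
    if (TRACKING_PREFIXES.some((prefix) => key.startsWith(prefix))) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}

console.log(cleanUrl("https://example.com/a?utm_source=x&id=1&fbclid=abc"));
```

Parameters that do not match a tracking prefix (here `id`) are left in place.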
📦 Installation
We recommend using utilities such as `toRegexps`, `domainsToRegexps`, etc. from the `harden-urls` package for creating your custom configurations.

pnpm add remark-harden-urls harden-urls

or

npm install remark-harden-urls harden-urls

🚀 Usage & Best Practices
For maximum safety when dealing with untrusted Markdown, your plugin chain should prioritize deep URL cleaning at the remark (Markdown AST) stage.
1. Basic Usage with Presets
Use the built-in presets for quick, robust security policies. Note how it processes inline links, images, and definitions:
import { remark } from "remark";
import remarkParse from "remark-parse"; // Add remark-gfm as well for GitHub Flavored Markdown
import { remarkHardenUrls } from "remark-harden-urls";
import { presets } from "harden-urls/utils"; // Get pre-defined policies

const markdown = `
Check out [my secure site][site] and ![image](http://malicious.com/tracker.jpg?utm_source=spam)

[site]: javascript:alert('XSS!')
`;

remark()
  .use(remarkParse) // 1. Parse Markdown into MDast
  .use(remarkHardenUrls, {
    link: presets.balanced, // e.g., allow https:, mailto: for inline links
    image: presets.strict, // e.g., only allow https: for inline images
    definition: { prune: true, allowedProtocols: new Set(["https:"]) }, // Very strict for definitions
  })
  .processSync(markdown);
// Output will have the 'javascript:alert('XSS!')' definition pruned,
// and the 'http://malicious.com/tracker.jpg?utm_source=spam' image URL stripped of params and potentially blocked/replaced.

2. Shorthand Configuration
To quickly define a simple allowlist, you can pass an array of strings or RegExp objects.
| Option Shorthand | Description |
| :------------------------------------ | :-------------------------------------------------------------------------------- |
| link: false | Skip sanitizing inline Markdown links completely. |
| link: ['#', 'mailto:', 'tel:'] | Only allows inline link urls starting with #, mailto:, or tel:. |
| image: ['*.mysite.com', 'assets/*'] | Only allows inline images from your subdomains or relative paths under assets/. |
| definition: { prune: true } | Removes any definition with an unsafe URL. |
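As a rough mental model of how the three shorthand forms in the table could be matched, consider the sketch below. It is an assumption-laden illustration: `matchesShorthand` and `isAllowed` are hypothetical helpers, only the pattern shapes shown in the table are handled, and harden-urls may compile shorthands differently (for example via `toRegexps`).

```javascript
// Illustrative sketch only; not how harden-urls necessarily matches patterns.
function matchesShorthand(url, pattern) {
  if (pattern instanceof RegExp) return pattern.test(url);

  if (pattern.startsWith("*.")) {
    // Domain wildcard: compare against the parsed hostname, never the raw
    // string, so "https://evil.com/?x=.mysite.com" cannot slip through.
    let hostname;
    try {
      hostname = new URL(url).hostname;
    } catch {
      return false; // relative or malformed URLs have no hostname
    }
    return hostname.endsWith(pattern.slice(1));
  }

  // Trailing "*" (e.g. "assets/*") and plain strings are prefix matches.
  const prefix = pattern.endsWith("*") ? pattern.slice(0, -1) : pattern;
  return url.startsWith(prefix);
}

function isAllowed(url, patterns) {
  return patterns.some((pattern) => matchesShorthand(url, pattern));
}

console.log(isAllowed("mailto:a@example.com", ["#", "mailto:", "tel:"]));
console.log(isAllowed("https://cdn.mysite.com/logo.png", ["*.mysite.com", "assets/*"]));
```

Matching wildcards against the parsed hostname is a deliberate choice in this sketch: substring or naive regex matching on the whole URL is a common source of allowlist bypasses.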
remarkHardenUrls({
  // Only allow external links to example.com and github.com
  link: ["https://*.example.com", "https://github.com"],
  // Disable image hardening (if handled elsewhere or Markdown images are not used)
  image: false,
  // Make definitions very strict: only internal #anchors
  definition: { allowedProtocols: new Set(["#"]) },
});

⚙️ Configuration
The plugin supports a top-level shared configuration that can be individually overridden for link, image, and definition elements.
| Option | Type | Default | Description |
| :----------------- | :----------------------------- | :------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| prune | boolean | false | Global: If true, removes the element entirely on block. If false, uses a placeholder (safer default). |
| link, image | Config \| Shorthand \| false | {} | Per-element policy. Overrides all shared options (prune, allowedProtocols, etc.) for inline links/images. |
| definition | Config \| Shorthand \| false | {} | Per-element policy for Markdown definitions. Overrides shared options. Note: Definitions do not render [blocked] text, only their url is modified or the node is pruned. |
| onUnsafeUrl | (url, node, type) => void | null | Hook to log/audit when a URL is blocked or altered. type can be link, image, or definition. |
| emptyUrlValue | string | "#" for links/definitions, "" for images | The value to replace an unsafe URL with when prune is false. |
| allowedProtocols | Set<string> | Default | Allowed schemes (https:, mailto:, etc.). |
| stripParams | string[] | Default | Query parameters (or prefixes like utm_) to remove from the URL. |
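The `onUnsafeUrl` hook from the table receives `(url, node, type)`. A minimal audit collector might look like the sketch below. The `auditLog` array and the hand-written node are assumptions for illustration, and whether the plugin passes the original or the already-cleaned URL is not specified here.

```javascript
// Illustrative sketch: collect blocked or altered URLs for later review.
const auditLog = [];

function onUnsafeUrl(url, node, type) {
  auditLog.push({
    type, // "link", "image", or "definition"
    url,
    // mdast nodes usually carry position info; fall back to null if absent.
    line: node.position?.start?.line ?? null,
  });
}

// Simulated invocation with a hand-written mdast-like node (not plugin output).
onUnsafeUrl(
  "javascript:alert(1)",
  { type: "link", url: "javascript:alert(1)", position: { start: { line: 3, column: 1 } } },
  "link",
);
console.log(auditLog);

// With the plugin, the hook would be passed as an option:
// .use(remarkHardenUrls, { onUnsafeUrl })
```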
Granular Pruning: Use the prune property on link, image, or definition to override the global setting. For example, link: { prune: true } will delete unsafe inline links, while image: { prune: false } might keep image placeholders.
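One way to picture how shared options and per-element overrides combine is the following sketch. It is a guess at the merge semantics described above, not the plugin's source: `resolvePolicy` is a hypothetical function, and the `allow` key used for array shorthands is an invented name.

```javascript
// Illustrative sketch only; the plugin's actual merge logic may differ.
function resolvePolicy(options, type) {
  const { link, image, definition, ...shared } = options;
  const override = { link, image, definition }[type];

  if (override === false) return null; // hardening disabled for this element
  if (Array.isArray(override)) {
    // Array shorthand: treated here as an allowlist on top of shared options.
    return { ...shared, allow: override };
  }
  return { ...shared, ...(override ?? {}) }; // per-element values win
}

const options = {
  prune: false, // shared default: keep placeholders
  link: { prune: true }, // delete unsafe inline links
  image: false, // skip images entirely
};

console.log(resolvePolicy(options, "link"));
console.log(resolvePolicy(options, "image"));
console.log(resolvePolicy(options, "definition"));
```

In this model, `definition` inherits the shared `prune: false` because it has no override of its own.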
Gotcha: Handling Placeholders
When a URL is blocked and not pruned (prune: false):
- `link` (inline): The `url` attribute is replaced by `emptyUrlValue` (default `#`), and the text `[blocked]` is appended to the link's content.
- `image` (inline): The `url` attribute is replaced by `emptyUrlValue` (default `""`), and `[blocked]` is appended to the `alt` attribute.
- `definition`: The `url` attribute is replaced by `emptyUrlValue` (default `#`). As definitions are metadata, no `[blocked]` text is added to the definition itself; the visual effect will be a broken or non-functional reference link/image.
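The placeholder behaviour described above can be approximated on plain mdast-like objects as follows. This is a sketch under assumptions: `applyPlaceholder` is a hypothetical helper, and the exact spacing of the appended `[blocked]` text is a guess.

```javascript
// Illustrative sketch only; mirrors the documented defaults, not plugin code.
function applyPlaceholder(node, emptyUrlValue) {
  switch (node.type) {
    case "link":
      node.url = emptyUrlValue ?? "#";
      // Append a text child so the reader sees that the link was blocked.
      node.children.push({ type: "text", value: " [blocked]" });
      break;
    case "image":
      node.url = emptyUrlValue ?? "";
      node.alt = `${node.alt ?? ""} [blocked]`.trim();
      break;
    case "definition":
      // Definitions are metadata: only the URL changes, no visible text.
      node.url = emptyUrlValue ?? "#";
      break;
  }
  return node;
}

const link = {
  type: "link",
  url: "javascript:alert(1)",
  children: [{ type: "text", value: "click me" }],
};
console.log(applyPlaceholder(link));
```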
⚖️ Integration with remark-rehype and rehype Plugins
remark-harden-urls focuses only on the Markdown AST. If your Markdown contains embedded HTML (e.g., <a href="..."> or <img src="...">), remark-harden-urls will not process these.
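To make the limitation concrete, the sketch below walks a small hand-written mdast-like tree and collects only the node types the plugin is described as handling. The tree and the `collectUrlNodes` helper are assumptions for illustration; note how the `html` node, which carries its URL inside a raw string, is never visited as a URL-bearing node.

```javascript
// Illustrative sketch only; a simplified stand-in for a tree visitor.
const HARDENED_TYPES = new Set(["link", "image", "definition"]);

function collectUrlNodes(node, found = []) {
  if (HARDENED_TYPES.has(node.type)) found.push(node);
  for (const child of node.children ?? []) collectUrlNodes(child, found);
  return found;
}

const tree = {
  type: "root",
  children: [
    {
      type: "paragraph",
      children: [
        { type: "link", url: "https://example.com", children: [{ type: "text", value: "ok" }] },
      ],
    },
    // Embedded HTML is an opaque string at the mdast stage.
    { type: "html", value: '<a href="javascript:alert(1)">click</a>' },
    { type: "definition", identifier: "site", url: "https://example.com" },
  ],
};

console.log(collectUrlNodes(tree).map((node) => node.type));
```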
For complete end-to-end security when dealing with Markdown containing embedded HTML, you must:
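- `remark-parse` (bundled with `remark()`) already represents embedded HTML as generic `html` nodes; add `remark-gfm` if you also need GitHub Flavored Markdown features.
- Use `remark-rehype` to convert the MDast to a HAST (HTML AST). By default it drops `html` nodes; to keep embedded HTML, pass `allowDangerousHtml: true` and parse the raw HTML with `rehype-raw`.
- Then, use `rehype-harden-urls` and `rehype-sanitize` on the HAST.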
Recommended secure chain for Markdown with embedded HTML:
import { remark } from "remark";
import remarkGfm from "remark-gfm"; // For GFM features (tables, autolinks, etc.)
import { remarkHardenUrls } from "remark-harden-urls";
import remarkRehype from "remark-rehype";
import rehypeRaw from "rehype-raw"; // Parses embedded HTML into HAST elements
import rehypeHardenUrls from "rehype-harden-urls"; // For embedded HTML links/images
import rehypeSanitize from "rehype-sanitize"; // For final HTML tag/attribute validation
import rehypeStringify from "rehype-stringify";

remark()
  .use(remarkGfm) // 1. Enable GFM; embedded HTML is already parsed as 'html' nodes
  .use(remarkHardenUrls) // 2. Harden URLs in Markdown links/images/definitions
  .use(remarkRehype, { allowDangerousHtml: true }) // 3. Convert MDast to HAST, keeping embedded HTML as raw nodes
  .use(rehypeRaw) // 4. Parse the raw HTML into HAST elements
  .use(rehypeHardenUrls) // 5. Harden URLs in embedded HTML <a>, <img>
  .use(rehypeSanitize) // 6. Final HTML structure/attribute sanitization
  .use(rehypeStringify) // 7. Convert HAST to HTML string
  .processSync(/* markdown */);

🤝 Contribution & Support
We enthusiastically welcome contributions from the community!
Whether you are reporting a bug, suggesting a new feature, or submitting a pull request, your help makes this a safer tool for everyone. Please check the GitHub Issues for open tasks.
💖 Adopt and Support: If this package helps secure your application, consider giving us a star on GitHub! You can also sponsor our work to help fund continued development and maintenance.
🪷 License
This project is licensed under the MIT License.
MIT © Mayank Chaudhari
