rehype-harden-urls
v0.1.1
🛡️ Rehype plugin to sanitize and harden all URLs in Markdown — supports pattern-based allow/block lists and safe defaults.
rehype-harden-urls 🔗🔒
🔒 A rehype plugin to enforce strict security policies on URLs within <a> and <img> elements using the powerful harden-urls library.
It removes, normalizes, or blocks unsafe protocols, malformed URLs, and known tracking parameters—without altering valid content or deleting entire nodes silently.
✨ Features & Philosophy
- Superior URL Hardening: Goes beyond basic protocol checks by stripping tracking parameters (utm_, fbclid), applying Unicode normalization (NFKC), and cleaning zero-width/control characters.
- Granular Control: Configure different policies for link and image elements with easy-to-use presets and shorthands.
- Safer Defaults: By default, it uses a placeholder (removes href/src and adds [blocked] text) instead of silent removal, ensuring no unexpected loss of content.
- Security Best Practice: Automatically adds rel="noopener noreferrer" to external links to prevent tab-napping attacks.
- Ecosystem Compatible: Designed to work alongside rehype-raw and rehype-sanitize.
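To make the first feature concrete, here is a minimal sketch of the kind of clean-up it describes, written against the standard URL API only. It is not the harden-urls implementation: the function name cleanUrl, the prefix list, and the exact set of invisible characters are assumptions chosen for illustration.

```typescript
// Hypothetical illustration; not the harden-urls implementation.
const TRACKING_PREFIXES = ["utm_", "fbclid"]; // assumed list, taken from the examples above

// Zero-width characters (U+200B..U+200D, U+2060, U+FEFF) and ASCII control characters.
const INVISIBLE = /[\u200B-\u200D\u2060\uFEFF\u0000-\u001F\u007F]/g;

function cleanUrl(input: string): string | null {
  // Normalize compatibility forms (e.g. fullwidth letters), then drop invisible characters.
  const normalized = input.normalize("NFKC").replace(INVISIBLE, "").trim();

  let url: URL;
  try {
    url = new URL(normalized);
  } catch {
    return null; // malformed (or relative) input: treated as blocked in this sketch
  }

  // Collect the keys first so the parameters are not mutated while iterating.
  const keys: string[] = [];
  url.searchParams.forEach((_value, key) => keys.push(key));
  for (const key of keys) {
    if (TRACKING_PREFIXES.some((prefix) => key.startsWith(prefix))) {
      url.searchParams.delete(key);
    }
  }
  return url.toString();
}
```

In this sketch a relative URL is rejected because `new URL` needs an absolute input; a real policy would likely handle relative paths separately.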
📦 Installation
We recommend using utilities such as toRegexps, domainsToRegexps, etc. from the harden-urls package for creating your custom configurations.
pnpm add rehype-harden-urls harden-urls
or
npm install rehype-harden-urls harden-urls
🚀 Usage & Best Practices
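Ensure rehype-harden-urls runs after plugins such as rehype-raw that can add extra link or image nodes.
For maximum safety when dealing with user-provided HTML (for example, when using rehype-raw), always add a general sanitization plugin such as rehype-sanitize.
Note that when processing Markdown, remark-rehype does not parse embedded HTML into element nodes: by default it is dropped, and with allowDangerousHtml: true it is passed through as raw nodes. This plugin therefore only sees such links and images after rehype-raw has parsed them.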
1. Basic Usage with Presets
Use the built-in presets for quick, robust security policies:
import { rehype } from "rehype";
import rehypeRaw from "rehype-raw";
import rehypeSanitize from "rehype-sanitize";
import { rehypeHardenUrls } from "rehype-harden-urls";
import { presets } from "harden-urls/utils"; // Get pre-defined policies

rehype()
  .use(rehypeRaw) // 1. Parse raw HTML
  .use(rehypeHardenUrls, {
    // 2. Harden link and image URLs
    link: presets.balanced, // e.g., allow https:, mailto:
    image: presets.strict, // e.g., only allow https: for images
  })
  .use(rehypeSanitize) // 3. Ensure no unauthorized tags/attributes remain
  .process(/* html */);
2. Shorthand Configuration (New!)
To quickly define a simple allowlist, you can pass an array of strings or RegExp objects.
| Option Shorthand | Description |
| :------------------------------------ | :------------------------------------------------------------------------- |
| link: false | Skip sanitizing <a href> completely. |
| link: ['#', 'mailto:', 'tel:'] | Only allows href attributes starting with #, mailto:, or tel:. |
| image: ['*.mysite.com', 'assets/*'] | Only allows images from your subdomains or relative paths under assets/. |
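One way to read the shorthand patterns in the table above is as prefix matches in which `*` stands for any run of characters. The sketch below implements that reading. The helpers patternToRegExp and isAllowed are hypothetical, and the toRegexps utility shipped with harden-urls may use different semantics, so treat this as an illustration of the idea rather than a description of the library.

```typescript
// Hypothetical helpers; the real toRegexps utility in harden-urls may behave differently.
function patternToRegExp(pattern: string | RegExp): RegExp {
  if (pattern instanceof RegExp) return pattern;
  // Escape regex metacharacters except "*", then turn each "*" into ".*".
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&")
    .replace(/\*/g, ".*");
  return new RegExp("^" + escaped); // anchored at the start only: a prefix match
}

function isAllowed(url: string, allowlist: Array<string | RegExp>): boolean {
  return allowlist.some((pattern) => patternToRegExp(pattern).test(url));
}

// Usage under the assumed semantics:
const linkAllowlist = ["#", "mailto:", "tel:"];
isAllowed("mailto:team@example.com", linkAllowlist); // true
isAllowed("javascript:alert(1)", linkAllowlist); // false
```

Under these assumed semantics a pattern such as `https://*example.com` would also match `https://evilexample.com`, so it is worth testing custom patterns against hostile inputs before relying on them.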
rehypeHardenUrls({
  // Only allow external links to example.com, its subdomains, and github.com
  link: ["https://*example.com", "https://github.com"],
  // Disable image hardening (if handled elsewhere)
  image: false,
});
⚙️ Configuration
The plugin supports a top-level shared configuration that can be individually overridden for link and image elements.
| Option | Type | Default | Description |
| ------------------ | ------------------------------ | --------- | ----------------------------------------------------------------------------------------------------- |
| prune | boolean | false | Global: If true, removes the element entirely on block. If false, uses a placeholder (safer default). |
| link, image | Config \| Shorthand \| false | {} | Per-element policy. Overrides all shared options (prune, allowedProtocols, etc.). |
| onUnsafeUrl | (url, node, type) => void | null | Hook to log/audit when a URL is blocked or altered. |
| allowedProtocols | Set<string> | Default | Allowed schemes (https:, mailto:, etc.). |
| stripParams | string[] | Default | Query parameters (or prefixes like utm_) to remove from the URL. |
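As an illustration of the onUnsafeUrl hook listed in the table, the sketch below collects blocked or altered URLs into an in-memory audit log. Only the (url, node, type) argument order comes from the table; the argument types, the "link" and "image" values for type, and the AuditEntry shape are assumptions.

```typescript
// Assumed shapes for illustration; the plugin's actual argument types may differ.
type UrlKind = "link" | "image";

interface AuditEntry {
  url: string;
  kind: UrlKind;
  tagName: string;
}

const auditLog: AuditEntry[] = [];

// Matches the (url, node, type) => void signature from the options table.
function onUnsafeUrl(url: string, node: { tagName: string }, type: UrlKind): void {
  auditLog.push({ url, kind: type, tagName: node.tagName });
}

// It would then be passed as an option, e.g. { onUnsafeUrl }.
```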
Granular Pruning: Use the prune property on link or image to override the global setting. For example, link: { prune: true } will delete unsafe links while the global setting might keep image placeholders.
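The override behaviour described here can be pictured as a simple merge in which per-element values win over shared ones. The sketch below is one possible model, not the plugin's internals: the Policy and ElementOption types, the allow field, and the resolvePolicy function are all hypothetical names.

```typescript
// Hypothetical types and merge logic; the plugin's internals may differ.
interface Policy {
  prune: boolean;
  allow: Array<string | RegExp>; // assumed field holding the allowlist
}

type ElementOption = Partial<Policy> | Array<string | RegExp> | false | undefined;

function resolvePolicy(shared: Policy, option: ElementOption): Policy | null {
  if (option === false) return null; // hardening disabled for this element
  if (option === undefined) return shared; // no override: use the shared options
  if (Array.isArray(option)) return { ...shared, allow: option }; // shorthand allowlist
  return { ...shared, ...option }; // per-element values override shared ones
}

// Usage: unsafe links are pruned while images keep the shared placeholder behaviour.
const shared: Policy = { prune: false, allow: [] };
const linkPolicy = resolvePolicy(shared, { prune: true });
const imagePolicy = resolvePolicy(shared, undefined);
```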
Gotcha: Handling Placeholders
When a URL is blocked and not pruned (prune: false):
- <a>: The href attribute is deleted, and the text [blocked] is appended to the link's content.
- <img>: The src attribute is deleted, and [blocked] is appended to the alt attribute.
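The placeholder behaviour above can be sketched on a hast-like tree as follows. The node interfaces are declared locally so the example is self-contained, the function name applyPlaceholder is hypothetical, and the leading space before [blocked] is an assumption, since the exact spacing is not specified here.

```typescript
// Minimal hast-like shapes, declared locally so the sketch is self-contained.
interface TextNode {
  type: "text";
  value: string;
}

interface ElementNode {
  type: "element";
  tagName: string;
  properties: Record<string, unknown>;
  children: Array<ElementNode | TextNode>;
}

// Hypothetical helper mirroring the described placeholder behaviour.
function applyPlaceholder(node: ElementNode): void {
  if (node.tagName === "a") {
    delete node.properties.href; // the link is no longer navigable
    node.children.push({ type: "text", value: " [blocked]" }); // spacing is an assumption
  } else if (node.tagName === "img") {
    delete node.properties.src; // the image is no longer fetched
    const alt = typeof node.properties.alt === "string" ? node.properties.alt : "";
    node.properties.alt = (alt + " [blocked]").trim();
  }
}
```

Because the element itself stays in the tree, downstream styles or scripts that select on a[href] or img[src] will no longer match blocked nodes.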
⚖️ Comparison — rehype-harden-urls vs rehype-sanitize
| Capability | rehype-harden-urls (Focus: URL Data) | rehype-sanitize (Focus: HTML Structure) |
| :-------------------------------------------- | :----------------------------------- | :------------------------------------------- |
| Sanitizes link/image URLs (href, src) | ✅ Aggressively | ✅ |
| Strips trackers (e.g., utm_, fbclid) | ✅ | ❌ |
| URL normalization (Unicode NFKC) | ✅ | ❌ |
| Adds rel="noopener noreferrer" | ✅ Automatically | ⚠️ Optional via policy |
| Sanitizes full HTML tags/attributes | ❌ | ✅ |
| Ideal Use | Securely hardening URL content | Preventing XSS via tag/attribute structure |
Best Practice: Always use them together for comprehensive security:
.use(rehypeRaw)
.use(rehypeHardenUrls) // Harden the content inside the attributes
.use(rehypeSanitize) // Strip any remaining dangerous elements/attributes
🤝 Contributing
We welcome contributions! Whether it's reporting a bug, suggesting a new feature, or submitting a pull request, your help makes this package better and safer for everyone.
Please check the GitHub Issues for open tasks.
🧰 Related
- harden-urls: core URL sanitizer
- rehype-sanitize: full HTML sanitization
- rehype-raw: safely parse embedded HTML (use rehype-sanitize whenever you use rehype-raw)
We express gratitude to the Unified, Rehype, and Security research communities.
🪷 License
This project is licensed under the MIT License.
MIT © Mayank Chaudhari
“Do complex things, the simple way.”
