
harden-urls

v1.0.0

🛡️ URL hardening and sanitization utilities with safe defaults, pattern-based allow/block lists, and cleanup for links and mailto URIs.

harden-urls 🛡️ Core URL Sanitizer Utilities

The robust, protocol-aware, and dependency-free URL sanitizer for secure markdown and user-generated web content.


🎯 Why harden-urls? The Security Gap

Most URL sanitization techniques, such as basic URL prefix checks, are insufficient against sophisticated attacks, especially when rendering user-submitted or AI-generated content or markdown. AI output can itself be compromised, for example through prompt poisoning.

Libraries like Vercel Labs' markdown-sanitizers are a great start, but they often miss deeper vulnerabilities:

  • Protocol Evasion: Using encoded control characters (such as %0A) or zero-width characters (such as \u200B) to hide malicious protocols like javascript: or data:.
  • Homoglyph & Unicode Spoofing: Using visually similar Unicode characters to bypass domain allow-lists.
  • Tracking & Phishing: Failure to clean up unsafe or excessive tracking parameters (e.g., in mailto: links).
  • Prefix Boundary Confusion: Simple prefix-matching libraries may trust a URL because it starts with an allowed prefix, yet fail to notice that the prefix is only the beginning of a longer, attacker-controlled hostname:

    // If the allowed prefix is 'https://allowed-prefix.com'
    // ⚠️ Prefix-based libraries often allow this malicious URL:
    https://allowed-prefix.com.evil.com/bypass.js

    Reason: These libraries only check the starting string and skip the critical step of parsing the hostname to confirm the actual origin.
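The prefix problem can be reproduced with a few lines of TypeScript. This is a minimal sketch and not code from harden-urls: `naivePrefixCheck`, `hostnameCheck`, and `ALLOWED_HOSTS` are hypothetical names, and the only API relied on is the standard WHATWG `URL` class.

```typescript
// Illustrative only: none of these names come from harden-urls.
const ALLOWED_PREFIX = "https://allowed-prefix.com";
const ALLOWED_HOSTS = new Set(["allowed-prefix.com"]);

// Trusts any string that merely starts with the allowed prefix.
function naivePrefixCheck(url: string): boolean {
  return url.startsWith(ALLOWED_PREFIX);
}

// Parses the URL first, then compares the real hostname exactly.
function hostnameCheck(url: string): boolean {
  try {
    const parsed = new URL(url);
    return parsed.protocol === "https:" && ALLOWED_HOSTS.has(parsed.hostname);
  } catch {
    // Unparseable input is treated as unsafe.
    return false;
  }
}

const malicious = "https://allowed-prefix.com.evil.com/bypass.js";
console.log(naivePrefixCheck(malicious)); // the prefix check accepts it
console.log(hostnameCheck(malicious)); // the hostname check rejects it
```

The same comparison also catches userinfo tricks such as `https://allowed-prefix.com@evil.com/`, because the parsed hostname there is `evil.com`.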

👉 harden-urls bridges this gap by offering a multi-layered defense as the foundation for secure URL processing.


✨ Features at a Glance

| Feature | Description | Security Posture |
| :--- | :--- | :--- |
| Protocol-First | Only protocols in the safeProtocols list (e.g., https:, mailto:) are allowed by default. | Sane Defaults |
| Pattern-Based Filtering | Allows granular control via allowedPatterns (domains/paths) and blockedPatterns. | Explicit Opt-In |
| Query Param Cleaning | Strips common tracking/malicious parameters (utm_, body, subject, etc.) automatically. | Defense-in-Depth |
| Unicode Hardening (NFKC) | Normalizes input and strips control/zero-width characters to mitigate obfuscation attacks. | Obfuscation Defense |
| Minimal & Typed | Zero dependencies, highly performant, and built 100% in TypeScript. | Reliability |
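To make the Unicode hardening row concrete, here is a rough sketch of NFKC normalization followed by stripping control and zero-width characters. It is an assumption about what such a step could look like, not the library's implementation: `stripObfuscation` is a hypothetical helper, the character ranges are one reasonable choice, and percent-encoded sequences such as %0A are not decoded here.

```typescript
// Hypothetical helper, not part of the harden-urls API.
// 1. NFKC folds compatibility forms (e.g. fullwidth letters) into their plain equivalents.
// 2. The replace removes C0/C1 control characters and common zero-width characters.
function stripObfuscation(input: string): string {
  return input
    .normalize("NFKC")
    .replace(/[\u0000-\u001F\u007F-\u009F\u200B-\u200D\u2060\uFEFF]/g, "");
}

// A zero-width space hidden inside the scheme is removed.
console.log(stripObfuscation("java\u200Bscript:alert(1)"));

// Fullwidth letters are folded to ASCII before any protocol check runs.
console.log(stripObfuscation("\uFF4A\uFF41\uFF56\uFF41script:alert(1)"));
```

Running the protocol check only after a step like this means an obfuscated `javascript:` scheme is compared in its plain form.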


🚀 Installation & Setup

pnpm add harden-urls # Recommended

or

npm install harden-urls

or

yarn add harden-urls

Basic Usage

Use createUrlSanitizer to pre-configure your security rules for optimal performance.

import { createUrlSanitizer, toRegexps } from "harden-urls";

// Helper to convert simple strings/patterns into robust regexes
const trustedDomains = toRegexps([
  "*.mycorp.com", // Allow subdomains
  "partner-api.io", // Exact domain match
]);

const sanitizer = createUrlSanitizer({
  // Only allow HTTPS and Mailto protocols
  allowedProtocols: ["https:", "mailto:"],

  // Allow URLs matching our specific domain patterns
  allowedPatterns: trustedDomains,

  // Automatically remove common tracking params from all URLs
  stripParams: ["utm_", "fbclid", "gclid"],

  // OPTIONAL: Configure for specific edge cases
  allowSchemaRelative: true, // Allow //example.com (defaults to false)
  // blockPathTraversal defaults to true, blocking /../
});

// Example 1: Clean and safe
sanitizer("https://sub.mycorp.com/docs?utm_source=email");
// → "https://sub.mycorp.com/docs" (tracking param stripped)

// Example 2: Blocked by protocol
sanitizer("ftp://insecure.net/file");
// → null (ftp: is not in allowedProtocols)

// Example 3: Blocked by pattern
sanitizer("https://evil-tracking.com/path");
// → null (does not match allowedPatterns)
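For readers curious how wildcard strings such as "*.mycorp.com" can become anchored regular expressions, the following is a minimal sketch. It is not the implementation of toRegexps: `wildcardHostToRegExp` is a hypothetical helper, and it assumes the pattern is matched against a hostname rather than a full URL.

```typescript
// Hypothetical helper, not the toRegexps implementation.
// "*" matches one or more dot-separated hostname labels; everything else is literal.
function wildcardHostToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .split("*")
    // Escape regex metacharacters in the literal parts (the "." in particular).
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("[a-z0-9-]+(?:\\.[a-z0-9-]+)*");
  // Anchored at both ends and created without the /g flag, so test() stays stateless.
  return new RegExp(`^${escaped}$`, "i");
}

const subdomains = wildcardHostToRegExp("*.mycorp.com");
console.log(subdomains.test("sub.mycorp.com")); // a real subdomain
console.log(subdomains.test("sub.mycorp.com.evil.com")); // a look-alike host
```

Anchoring with `^` and `$` is what prevents the look-alike host from matching; escaping the dots prevents `partner-apixio` from matching a `partner-api.io` pattern.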

⚙️ API Reference

sanitizeUrl(url: string, options?: SanitizeOptions): string | null

The core function. Cleans the input URL and returns the sanitized string, or null if the URL is blocked by any rule.

isSafeUrl(url: string, options?: SanitizeOptions): boolean

A boolean check. Returns true if sanitizeUrl would return a non-null string.

createUrlSanitizer(options?: SanitizeOptions): (url: string) => string | null

The preferred method. Returns a pre-configured sanitizer function for performance and cleaner code.

SanitizeOptions (Key Security Flags)

| Option | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| allowedProtocols | string[] | ['https:', 'http:', ...] | Protocols permitted. Primary security control. |
| allowedPatterns | RegExp[] | [] | Only URLs matching these patterns are allowed (if provided). |
| blockedPatterns | RegExp[] | [] | URLs matching these are blocked immediately. |
| stripParams | string[] | ['utm_', 'fbclid', ...] | Query parameter names/prefixes to strip. |
| allowSchemaRelative | boolean | false | Allows //example.com. Requires explicit opt-in. |
| blockPathTraversal | boolean | true | Prevents relative paths (/path) containing .. segments. |
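As an illustration of what stripParams-style cleanup involves, here is a small sketch built on the standard `URL` and `URLSearchParams` APIs. It is not the library's code: `stripTrackingParams` is a hypothetical helper, and it assumes every entry is treated as a name prefix, so "utm_" removes utm_source and "fbclid" removes fbclid.

```typescript
// Hypothetical helper, not part of the harden-urls API.
// Assumption: each entry in `prefixes` is matched as a prefix of the parameter name.
function stripTrackingParams(url: string, prefixes: string[]): string {
  const parsed = new URL(url);
  // Copy the keys first so deleting during iteration cannot skip entries.
  for (const key of Array.from(parsed.searchParams.keys())) {
    if (prefixes.some((prefix) => key.startsWith(prefix))) {
      parsed.searchParams.delete(key);
    }
  }
  return parsed.toString();
}

console.log(
  stripTrackingParams(
    "https://sub.mycorp.com/docs?utm_source=email&page=2&fbclid=abc",
    ["utm_", "fbclid", "gclid"]
  )
);
```

Parameters that match none of the prefixes, such as `page` above, are left in place.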


🔒 Best Practices and Gotchas

1. Always Use allowedProtocols

The protocol allow-list is your most important defense. If you need to allow data: URLs, ensure your pattern list tightly restricts the media type (e.g., data:image/png).
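One way to restrict data: URLs to a single media type is an anchored pattern like the sketch below. The names `PNG_DATA_URL` and `isPngDataUrl` are hypothetical, and the sketch assumes base64-encoded PNG payloads only; whether harden-urls tests its patterns against the full URL string in exactly this way is not stated here, so treat it as a standalone check.

```typescript
// Hypothetical pattern: accepts only base64-encoded PNG data URLs.
// Anchored with ^ and $, and created without /g so test() stays stateless.
const PNG_DATA_URL = /^data:image\/png;base64,[A-Za-z0-9+\/]+={0,2}$/;

function isPngDataUrl(url: string): boolean {
  return PNG_DATA_URL.test(url);
}

console.log(isPngDataUrl("data:image/png;base64,iVBORw0KGgo=")); // PNG payload
console.log(isPngDataUrl("data:text/html;base64,PHNjcmlwdD4=")); // HTML payload
```

Naming the media type explicitly keeps scriptable types such as text/html or image/svg+xml out, even though they share the data: scheme.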

2. Check Global Flags (/g)

Gotcha: If you construct a RegExp for patterns by hand, do not use the global (/g) flag. With /g, test() keeps its position in lastIndex between calls, so a pattern that matched on one call can fail on the next and a security check can be skipped by accident. Use helpers like toRegexps to avoid this.
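The stateful behavior is easy to reproduce in plain TypeScript. The sketch below uses only standard RegExp semantics; `withoutStatefulFlags` is a hypothetical defensive helper and not part of harden-urls.

```typescript
const url = "https://evil.com/path";

// A block pattern mistakenly created with the global flag.
const statefulPattern = /evil\.com/g;
const firstCall = statefulPattern.test(url); // true: lastIndex now points past the match
const secondCall = statefulPattern.test(url); // false: the search resumed after the match

// The same pattern without /g gives the same answer on every call.
const statelessPattern = /evil\.com/;
const thirdCall = statelessPattern.test(url); // true
const fourthCall = statelessPattern.test(url); // true

// Hypothetical helper: rebuild a pattern without the stateful g and y flags.
function withoutStatefulFlags(re: RegExp): RegExp {
  return new RegExp(re.source, re.flags.replace(/[gy]/g, ""));
}

console.log(firstCall, secondCall, thirdCall, fourthCall);
```

Passing user-supplied patterns through a helper like this before storing them is one way to make the check independent of call order.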

3. harden-urls is Not an HTML Sanitizer

Crucial: This library only sanitizes the URL attribute value. It does not protect against arbitrary HTML or script tags within the content itself.

  • If rendering markdown, you must use a comprehensive HTML sanitizer like rehype-sanitize alongside this library if you allow any kind of raw HTML or components.

🤝 Contribution

We welcome contributions! If you find a security vulnerability, have a feature request, or want to fix a bug, please:

  1. Open an Issue: Discuss the change you wish to make. Security bugs should be reported privately first, if possible.
  2. Submit a Pull Request: Ensure your code passes all tests and is formatted correctly. New features require tests!

This project is maintained by a small team and relies on community feedback and contributions. Thank you!


Comparison Table

| Library | Safe Defaults | Pattern-Based | Param Cleanup | Protocol Config | Unicode Hardening |
| :--- | :--- | :--- | :--- | :--- | :--- |
| harden-urls | ✅ Comprehensive | ✅ Yes | ✅ Yes | ✅ Granular | ✅ NFKC |
| vercel-labs/markdown-sanitizers | ⚠️ Partial (URL prefix only) | ❌ | ❌ | ⚠️ Limited | ❌ |

⚠️ Important: Neither harden-urls nor Vercel’s harden-react-markdown protects against arbitrary HTML — you must use rehype-sanitize if using rehype-raw.


Related Packages


Acknowledgments

Grateful to:

They laid the groundwork — this package aims to bridge the gap between safety and flexibility.


License

This library is licensed under the MIT open-source license.