url-sanitizer

v2.0.0

URL Sanitizer

URL sanitizer for Node.js, browsers and web sites. Sanitize not only regular URLs, but also data URLs and blob URLs. It also has the ability to parse URLs and verify URIs.

Install

npm i url-sanitizer

For browsers and web sites, standalone ESM builds are available in the dist/ directory.

  • node_modules/url-sanitizer/dist/url-sanitizer.min.js
  • node_modules/url-sanitizer/dist/url-sanitizer-wo-dompurify.min.js

Or, download them from Releases.

NOTE: url-sanitizer-wo-dompurify.min.js is built without DOMPurify. If you use it, make sure DOMPurify is exposed globally, e.g. window.DOMPurify.

Usage

import urlSanitizer, {
  isURI, isURISync, parseURL, parseURLSync, sanitizeURL, sanitizeURLSync
} from 'url-sanitizer';

sanitizeURL(url, opt)

Sanitize the given URL.

  • blob, data and file schemes must be explicitly allowed.
  • Given a blob URL, returns a sanitized data URL.

Parameters

  • url string URL input.
  • opt object? Options.
    • opt.allow Array<string>? Array of allowed schemes, e.g. ['data'].
    • opt.deny Array<string>? Array of denied schemes, e.g. ['web+foo'].
    • opt.only Array<string>? Array of specific schemes to allow, e.g. ['git', 'https']. only takes precedence over allow and deny.

Returns Promise<string?> Sanitized URL, nullable.

// Sanitize tags and quotes
const res1 = await sanitizeURL('https://example.com/?<script>alert(1)</script>');
// => 'https://example.com/'

const res1_2 = await sanitizeURL('https://example.com/" onclick="alert(1)"');
// => 'https://example.com/'


// Can parse and sanitize data URL
const res2 = await sanitizeURL('data:text/html,<div><script>alert(1);</script></div><p onclick="alert(2)"></p>', {
  allow: ['data']
});
// => 'data:text/html,%3Cdiv%3E%3C/div%3E%3Cp%3E%3C/p%3E'

console.log(decodeURIComponent(res2));
// => 'data:text/html,<div></div><p></p>'


// Also can parse and sanitize base64 encoded data
const base64data3 = btoa('<div><script>alert(1);</script></div>');
const res3 = await sanitizeURL(`data:text/html;base64,${base64data3}`, {
  allow: ['data']
});
// => 'data:text/html,%3Cdiv%3E%3C/div%3E'

console.log(decodeURIComponent(res3));
// => 'data:text/html,<div></div>'

const base64data3_2 = btoa('<div><img src="javascript:alert(1)"></div>');
const res3_2 = await sanitizeURL(`data:text/html;base64,${base64data3_2}`, {
  allow: ['data']
});
// => 'data:text/html,%3Cdiv%3E%3Cimg%3E%3C/div%3E'

console.log(decodeURIComponent(res3_2));
// => 'data:text/html,<div><img></div>'


// Can parse and sanitize blob URL
const blob4 = new Blob(['<svg><g onload="alert(1)"/></svg>'], {
  type: 'image/svg+xml'
});
const url4 = URL.createObjectURL(blob4);
const res4 = await sanitizeURL(url4, {
  allow: ['blob']
});
// => 'data:image/svg+xml,%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E'

console.log(decodeURIComponent(res4));
// => 'data:image/svg+xml,<svg><g></g></svg>'


// Deny if the scheme matches the `deny` list
const res5 = await sanitizeURL('web+foo://example.com', {
  deny: ['web+foo']
});
// => null


// Allow only if the scheme matches the `only` list
const res6 = await sanitizeURL('http://example.com', {
  only: ['data', 'git', 'https']
});
// => null

const res6_2 = await sanitizeURL('https://example.com/"onmouseover="alert(1)"', {
  only: ['data', 'git', 'https']
});
// => 'https://example.com/'


// `only` also allows combination of the schemes in the list
const res7 = await sanitizeURL('git+https://example.com/foo.git?<script>alert(1)</script>', {
  only: ['data', 'git', 'https']
});
// => 'git+https://example.com/foo.git'
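
The interplay of allow, deny and only described above can be pictured with a small standalone sketch. It does not use url-sanitizer: isSchemeAllowed and RESTRICTED are hypothetical names, and the rules are inferred from the parameter descriptions and examples in this section, not from the library's source. The sketch only decides whether a scheme passes the filters; it does not check whether the scheme is registered and it sanitizes nothing.

```javascript
// Hypothetical helper, NOT part of url-sanitizer.
// Schemes that this section says must be explicitly allowed.
const RESTRICTED = new Set(['blob', 'data', 'file']);

function isSchemeAllowed(url, { allow = [], deny = [], only = [] } = {}) {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(url);
  if (!match) {
    return false;
  }
  const scheme = match[1].toLowerCase();
  // `only` takes precedence over `allow` and `deny`:
  // every "+"-separated part of the scheme must be listed.
  if (only.length > 0) {
    return scheme.split('+').every((part) => only.includes(part));
  }
  if (deny.includes(scheme)) {
    return false;
  }
  if (RESTRICTED.has(scheme)) {
    return allow.includes(scheme);
  }
  return true;
}

const only = ['data', 'git', 'https'];
console.log(isSchemeAllowed('http://example.com', { only }));
// => false
console.log(isSchemeAllowed('git+https://example.com/foo.git', { only }));
// => true
console.log(isSchemeAllowed('data:text/html,<p></p>'));
// => false
console.log(isSchemeAllowed('data:text/html,<p></p>', { allow: ['data'] }));
// => true
```

Treating a combined scheme such as git+https as the list of its parts is one way to reproduce the documented behaviour of only; the library may implement it differently.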

sanitizeURLSync(url, opt)

Synchronous version of sanitizeURL().

  • data and file schemes must be explicitly allowed.
  • blob scheme is not supported, returns null. Use async sanitizeURL() for blob.
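
The README does not say why the synchronous variant cannot handle blob URLs. One plausible explanation, offered here as an assumption, is that reading a Blob's contents is inherently asynchronous. The sketch below uses only the standard Blob API to turn a Blob into an unsanitized data URL; blobToDataURL is a hypothetical helper, and its percent-encoding is not guaranteed to match what sanitizeURL() produces.

```javascript
// Hypothetical helper, NOT part of url-sanitizer: converts a Blob into an
// unsanitized data URL. Blob#text() returns a Promise, so the helper
// has to be async.
async function blobToDataURL(blob) {
  const text = await blob.text();
  const mime = blob.type || 'text/plain';
  return `data:${mime},${encodeURIComponent(text)}`;
}

const sampleBlob = new Blob(['<svg><g></g></svg>'], { type: 'image/svg+xml' });
blobToDataURL(sampleBlob).then((dataURL) => {
  console.log(dataURL.startsWith('data:image/svg+xml,'));
  // => true
});
```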

parseURL(url)

Parse the given URL.

  • Blob URLs are simply parsed and not yet sanitized.

Parameters

  • url string URL input.

Returns Promise<ParsedURL> Result.

ParsedURL

Object with additional properties based on URL API.

Type: object

Properties

  • input string URL input.
  • valid boolean Is valid URI.
  • data object? Parsed result of data URL, nullable.
    • data.mime string? MIME type.
    • data.base64 boolean? Is base64 encoded.
    • data.data string? Data part of the data URL.
  • href string? Sanitized URL input.
  • origin string? Scheme, domain and port of the sanitized URL.
  • protocol string? Protocol scheme of the sanitized URL.
  • username string? Username specified before the domain name.
  • password string? Password specified before the domain name.
  • host string? Domain and port of the sanitized URL.
  • hostname string? Domain of the sanitized URL.
  • port string? Port number of the sanitized URL.
  • pathname string? Path of the sanitized URL.
  • search string? Query string of the sanitized URL.
  • hash string? Fragment identifier of the sanitized URL.

const res1 = await parseURL('javascript:alert(1)');
/* => {
        input: 'javascript:alert(1)',
        valid: false
      } */

const res2 = await parseURL('https://www.example.com/?foo=bar#baz');
/* => {
        input: 'https://www.example.com/?foo=bar#baz',
        valid: true,
        data: null,
        href: 'https://www.example.com/?foo=bar#baz',
        origin: 'https://www.example.com',
        protocol: 'https:',
        hostname: 'www.example.com',
        pathname: '/',
        search: '?foo=bar',
        hash: '#baz',
        ...
      } */

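// base64 encoded SVG '<svg><g onclick="alert(1)"/></svg>'
const res3 = await parseURL('data:image/svg+xml;base64,PHN2Zz48ZyBvbmNsaWNrPSJhbGVydCgxKSIvPjwvc3ZnPg==');
/* => {
        input: 'data:image/svg+xml;base64,PHN2Zz48ZyBvbmNsaWNrPSJhbGVydCgxKSIvPjwvc3ZnPg==',
        valid: true,
        data: {
          mime: 'image/svg+xml',
          base64: false,
          data: '%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E'
        },
        href: 'data:image/svg+xml,%3Csvg%3E%3Cg%3E%3C/g%3E%3C/svg%3E',
        ...
      } */

// base64 encoded PNG
const res4 = await parseURL('data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==');
/* => {
        input: 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==',
        valid: true,
        data: {
          mime: 'image/png',
          base64: true,
          data: 'iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg=='
        },
        href: 'data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==',
        origin: 'null',
        protocol: 'data:',
        pathname: 'image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUAAAAFCAYAAACNbyblAAAAHElEQVQI12P4//8/w38GIAXDIBKE0DHxgljNBAAO9TXL0Y4OHwAAAABJRU5ErkJggg==',
        ...
      } */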

// Note that blob URLs are parsed but not yet sanitized
const blob5 = new Blob(['<svg><g onload="alert(1)"/></svg>'], {
  type: 'image/svg+xml'
});
const url5 = URL.createObjectURL(blob5);
const res5 = await parseURL(url5);
/* => {
        input: 'blob:nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da',
        valid: true,
        data: null,
        href: 'blob:nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da',
        origin: 'null',
        protocol: 'blob:',
        pathname: 'nodedata:82ecc5a4-aea8-48d7-a407-64e2ef0913da',
        ...
      } */
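
Since ParsedURL is described as an object based on the URL API, most of its properties can be previewed with the built-in URL class alone. The sketch below is a rough approximation under that assumption: describeURL is a hypothetical helper, it performs no sanitization, and it accepts anything the URL class accepts (including javascript: URLs, which parseURL() reports as invalid). The data URL handling is simplified as well and ignores payloads containing ? or #.

```javascript
// Hypothetical helper, NOT part of url-sanitizer: mirrors the shape of
// ParsedURL using the built-in URL class, without any sanitization.
const URL_KEYS = [
  'href', 'origin', 'protocol', 'username', 'password',
  'host', 'hostname', 'port', 'pathname', 'search', 'hash'
];

function describeURL(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    return { input, valid: false };
  }
  const result = { input, valid: true, data: null };
  for (const key of URL_KEYS) {
    result[key] = url[key];
  }
  if (url.protocol === 'data:') {
    // A data URL path looks like "mime[;base64],payload".
    const comma = url.pathname.indexOf(',');
    const header = comma === -1 ? url.pathname : url.pathname.slice(0, comma);
    const [mime, ...params] = header.split(';');
    result.data = {
      mime,
      base64: params.includes('base64'),
      data: comma === -1 ? '' : url.pathname.slice(comma + 1)
    };
  }
  return result;
}

const parsed = describeURL('https://www.example.com/?foo=bar#baz');
console.log(parsed.origin, parsed.pathname, parsed.search, parsed.hash);
// => https://www.example.com / ?foo=bar #baz

console.log(describeURL('data:image/png;base64,AAAA').data);
// => { mime: 'image/png', base64: true, data: 'AAAA' }
```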

parseURLSync(url)

Synchronous version of parseURL().

isURI(uri)

Verify if the given URI is valid and registered.

Parameters

  • uri string URI input.

Returns Promise<boolean> Result.

  • Always true for web+* and ext+* schemes, except web+javascript, web+vbscript, ext+javascript, ext+vbscript.
  • false for javascript and vbscript schemes.

const res1 = await isURI('https://example.com/foo');
// => true

const res2 = await isURI('javascript:alert(1)');
// => false

const res3 = await isURI('mailto:foo@example.com');
// => true

const res4 = await isURI('foo:bar');
// => false

const res5 = await isURI('web+foo:bar');
// => true

const res6 = await isURI('web+javascript:alert(1)');
// => false
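
The rules listed above (registered schemes pass, web+* and ext+* always pass, and anything ending in javascript or vbscript fails) can be expressed as a short standalone check. This sketch assumes a tiny stand-in for the registered list; looksLikeValidURI, REGISTERED and BLOCKED are hypothetical names, and the real isURI() may apply further validation.

```javascript
// Hypothetical sketch, NOT the url-sanitizer implementation.
// Tiny stand-in for the list of registered schemes.
const REGISTERED = new Set(['https', 'mailto']);
// javascript / vbscript, with or without a web+ or ext+ prefix.
const BLOCKED = /^(?:(?:ext|web)\+)?(?:java|vb)script$/;
// Custom-handler style schemes such as web+foo or ext+foo.
const PREFIXED = /^(?:ext|web)\+[a-z]+$/;

function looksLikeValidURI(uri) {
  const match = /^([a-z][a-z0-9+.-]*):/i.exec(uri);
  if (!match) {
    return false;
  }
  const scheme = match[1].toLowerCase();
  if (BLOCKED.test(scheme)) {
    return false;
  }
  if (PREFIXED.test(scheme)) {
    return true;
  }
  return REGISTERED.has(scheme);
}

console.log(looksLikeValidURI('web+foo:bar'));
// => true
console.log(looksLikeValidURI('web+javascript:alert(1)'));
// => false
console.log(looksLikeValidURI('foo:bar'));
// => false
```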

isURISync(uri)

Synchronous version of isURI().


urlSanitizer

Instance of the sanitizer.

urlSanitizer.get()

Get a list of registered URI schemes.

Returns Array<string> Array of registered URI schemes.

  • Includes schemes registered at iana.org by default.
    • Historical schemes omitted.
    • moz-extension scheme added.
  • Also includes custom schemes added via urlSanitizer.add().

const schemes = urlSanitizer.get();
// => ['aaa', 'aaas', 'about', 'acap', 'acct', ...]

urlSanitizer.has(scheme)

Check if the given scheme is registered.

Parameters

  • scheme string Scheme.

Returns boolean Result.

const res1 = urlSanitizer.has('https');
// => true

const res2 = urlSanitizer.has('foo');
// => false

urlSanitizer.add(scheme)

Add a scheme to the list of registered URI schemes.

  • javascript and vbscript schemes cannot be registered; attempting to add them throws.

Parameters

  • scheme string Scheme.

Returns Array<string> Array of registered URI schemes.

console.log(urlSanitizer.has('foo'));
// => false

const res = urlSanitizer.add('foo');
// => ['aaa', 'aaas', 'about', 'acap', ... 'foo', ...]

console.log(urlSanitizer.has('foo'));
// => true

urlSanitizer.remove(scheme)

Remove a scheme from the list of registered URI schemes.

Parameters

  • scheme string Scheme.

Returns boolean Result.

  • true if the scheme is successfully removed, false otherwise.

console.log(urlSanitizer.has('aaa'));
// => true

const res1 = urlSanitizer.remove('aaa');
// => true

console.log(urlSanitizer.has('aaa'));
// => false

const res2 = urlSanitizer.remove('foo');
// => false

urlSanitizer.reset()

Reset sanitizer.

Returns void
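
The get(), has(), add(), remove() and reset() methods together behave like a small registry of schemes. The sketch below models that contract with a Set. It is not the library's implementation: SchemeRegistry and DEFAULT_SCHEMES are hypothetical names, the default list is a four-item stand-in, and both the sorted output of get() and the idea that reset() restores the default list are assumptions.

```javascript
// Hypothetical model of the registry contract, NOT url-sanitizer's code.
const DEFAULT_SCHEMES = ['aaa', 'aaas', 'about', 'https']; // stand-in list

class SchemeRegistry {
  #schemes = new Set(DEFAULT_SCHEMES);

  // Returns the registered schemes (sorted here by assumption).
  get() {
    return [...this.#schemes].sort();
  }

  has(scheme) {
    return this.#schemes.has(scheme);
  }

  // Throws for javascript / vbscript, otherwise registers the scheme.
  add(scheme) {
    if (typeof scheme !== 'string') {
      throw new TypeError(`Expected a string but got ${typeof scheme}.`);
    }
    if (/^(?:java|vb)script$/i.test(scheme)) {
      throw new Error(`Refusing to register unsafe scheme '${scheme}'.`);
    }
    this.#schemes.add(scheme);
    return this.get();
  }

  // true if the scheme was registered and is now removed, false otherwise.
  remove(scheme) {
    return this.#schemes.delete(scheme);
  }

  // Assumption: resetting restores the default list.
  reset() {
    this.#schemes = new Set(DEFAULT_SCHEMES);
  }
}

const registry = new SchemeRegistry();
console.log(registry.add('foo'));
// => ['aaa', 'aaas', 'about', 'foo', 'https']
console.log(registry.remove('aaa'), registry.remove('bar'));
// => true false
registry.reset();
console.log(registry.has('aaa'), registry.has('foo'));
// => true false
```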


Acknowledgments

The following resources have been of great help in the development of the URL Sanitizer.


Copyright (c) 2023 asamuzaK (Kazz)