npm package discovery and stats viewer.

Discover Tips

  • General search

    [free text search, go nuts!]

  • Package details

    pkg:[package-name]

  • User packages

    @[username]

Sponsor

Optimize Toolset

I’ve always been into building performant and accessible sites, but lately I’ve been taking it extremely seriously. So much so that I’ve been building a tool to help me optimize and monitor the sites that I build to make sure that I’m making an attempt to offer the best experience to those who visit them. If you’re into performant, accessible and SEO friendly sites, you might like it too! You can check it out at Optimize Toolset.

About

Hi, 👋, I’m Ryan Hefner  and I built this site for me, and you! The goal of this site was to provide an easy way for me to check the stats on my npm packages, both for prioritizing issues and updates, and to give me a little kick in the pants to keep up on stuff.

As I was building it, I realized that I was actually using the tool to build the tool, and figured I might as well put this out there and hopefully others will find it to be a fast and useful way to search and browse npm packages as I have.

If you’re interested in other things I’m working on, follow me on Twitter or check out the open source projects I’ve been publishing on GitHub.

I am also working on a Twitter bot for this site to tweet the most popular, newest, random packages from npm. Please follow that account now and it will start sending out packages soon–ish.

Open Software & Tools

This site wouldn’t be possible without the immense generosity and tireless efforts from the people who make contributions to the world and share their work via open source initiatives. Thank you 🙏

© 2024 – Pkg Stats / Ryan Hefner

ukkonen

v2.1.0

Published

Ukkonens approximate string matching algorithm for finding edit distance similar to Levenshtein

Downloads

39,501

Readme

Ukkonen - Approximate String Matching

npm version Build Status

This project implements the Approximate String Matching algorithm by Esko Ukkonen extended with ideas from An Extension of Ukkonen's Enhanced Dynamic Programming ASM Algorith by Hal Berghel and David Roach.

Ukkonen's algorithm is very competitive with the Levenshtein distance and for longer strings it is much more performant than Levenshtein distance.

In addition to being a competitive alternative to Levenshtein distance, Ukkonen's algorithm also allows you to provide a threshold for the distance which increases the performance even more for texts that are longer than the threshold.

Above you can see the different of using Levenshtein distance and Ukkonen's algorithm for matching sub-trees when diffing HTML.

Install

npm install --save ukkonen

Usage

You can find the distance between the strings Ukkonen and Levenshtein the following way:

var ukkonen = require("ukkonen");

assert.equal(ukkonen("Ukkonen", "Levenshtein"), 8);

If you want to limit the distance by a given threshold:

var ukkonen = require("ukkonen");

assert.equal(ukkonen("Ukkonen", "Levenshtein", 6), 6);
assert.equal(ukkonen("Ukkonen", "Levenshtein", 10), 8);

Platform support

The library is ES6 and will work with any JavaScript bundler in the browser as well as Node versions with ES6 support.

Benchmark

I have benchmarked the library against the fastest Levenshtein distance implementation on NPM.

Running benchmarks with 1000 iterations

# ukkonen: Edit distance one word (14 examples)
ok ~18 ms (0 s + 17993165 ns)

# leven: Edit distance one word (14 examples)
ok ~13 ms (0 s + 13155407 ns)

# ukkonen: Edit distance on sentence with small differences
ok ~1.66 ms (0 s + 1656841 ns)

# leven: Edit distance on sentence with small differences
ok ~7.23 ms (0 s + 7233814 ns)

# ukkonen: Edit distance on paragraphs with small differences
ok ~5.37 ms (0 s + 5367561 ns)

# leven: Edit distance on paragraphs with small differences
ok ~416 ms (0 s + 416468504 ns)

# ukkonen: Edit distance on longer texts with small differences
ok ~10 ms (0 s + 10305586 ns)

# leven: Edit distance on longer texts with small differences
ok ~1.7 s (1 s + 703731130 ns)

# ukkonen: Edit distance on longer texts with many differences
ok ~3.28 s (3 s + 280166305 ns)

# leven: Edit distance on longer texts with many differences
ok ~2.52 s (2 s + 519432479 ns)

# ukkonen: Edit distance on longer texts with small differences and a threshold of 20
ok ~9.69 ms (0 s + 9691021 ns)

# leven: Edit distance on longer texts with small differences and a threshold of 20
ok ~1.61 s (1 s + 610079082 ns)

# ukkonen: Edit distance on longer texts with many differences and a threshold of 40
ok ~15 ms (0 s + 15225792 ns)

# leven: Edit distance on longer texts with many differences and a threshold of 40
ok ~2.54 s (2 s + 539519721 ns)

Acknowledgements

Obviously the authors of the papers describing the algorithm Esko Ukkonen, Hal Berghel and David Roach.

I stole a lot of ideas from Sindre Sorhus's leven library and I also used it to test my implementation against.

License

MIT © Sune Simonsen