simtext

v0.1.7

Published

A lightweight, rule-based text similarity calculator that selects the most appropriate comparison algorithm based on input string lengths.

Downloads

106

Readme

SimText - Lightweight Text Similarity Calculator

SimText is a minimal, lightweight text similarity calculator designed for efficiency and ease of use. It provides a streamlined way to measure how alike two strings are.

Features

  • 🪶 Lightweight: Crafted with performance in mind, SimText ensures fast calculations without bogging down your applications.

  • 🔍 Multiple Algorithms:

    • Levenshtein Distance: Ideal for single, short words, offering a precise measure of character-level differences.

    • Jaccard Similarity: Computes similarity between sets of words, making it great for longer texts.

    • N-gram Similarity: Versatile and adaptable, it breaks down text into overlapping chunks for a nuanced similarity measure.

  • 🎯 Contextual Selection: Based on the length and nature of your text inputs, SimText intelligently chooses the most suitable algorithm to offer you the best similarity results.

Installation


npm install simtext --save

Usage

This section describes the exported functions for measuring the similarity between two strings: Levenshtein similarity, Jaccard similarity, n-gram similarity, and a general text comparison function.

1. levenshteinSimilarity(a: string, b: string): number

Compares two strings and returns a similarity score based on the Levenshtein distance.

  • Parameters:
    • a: First string.
    • b: Second string.
  • Return: Similarity score between 0 and 1. A score of 1 means the strings are identical.

import {levenshteinSimilarity} from 'simtext';

const score = levenshteinSimilarity("apples", "apple");
console.log(score);  // 0.8333333333333334
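
To make the score above concrete, here is a minimal sketch of one common way to turn a Levenshtein distance into a similarity: divide the distance by the longer string's length and subtract from 1. This is an assumption about the normalization, not SimText's actual source, and the names `levenshteinDistance` and `levenshteinSimilaritySketch` are hypothetical. It is consistent with the example above: one edit over six characters gives 1 - 1/6.

```typescript
// Hypothetical helper: classic dynamic-programming edit distance
// (insertions, deletions, substitutions each cost 1).
function levenshteinDistance(a: string, b: string): number {
  // prev[j] = distance between the first i-1 chars of a and the first j chars of b.
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i]; // turning i chars into the empty string takes i deletions
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1,        // delete from a
        curr[j - 1] + 1,    // insert into a
        prev[j - 1] + cost  // substitute (or keep a matching char)
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Assumed normalization: 1 - distance / length of the longer string.
function levenshteinSimilaritySketch(a: string, b: string): number {
  const maxLen = Math.max(a.length, b.length);
  if (maxLen === 0) return 1; // two empty strings are treated as identical
  return 1 - levenshteinDistance(a, b) / maxLen;
}

console.log(levenshteinSimilaritySketch("apples", "apple")); // 0.8333333333333334
```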

2. jaccardSimilarity(str1: string, str2: string): number

Calculates the Jaccard similarity between two strings, comparing the unique words in each string.

  • Parameters:
    • str1: First string.
    • str2: Second string.
  • Return: Similarity score between 0 and 1.

import {jaccardSimilarity} from 'simtext';

const score = jaccardSimilarity("apple pie", "apple crumble pie");
console.log(score);  // 0.6666666666666666
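
The score above follows from the usual Jaccard definition: the number of shared unique words divided by the number of distinct words across both strings. The sketch below illustrates that definition. It is not SimText's source, and the name `jaccardSimilaritySketch`, the whitespace tokenization, and the lowercasing are all assumptions.

```typescript
// Hypothetical sketch: Jaccard similarity over sets of unique words.
function jaccardSimilaritySketch(str1: string, str2: string): number {
  // Assumption: words are separated by whitespace and compared case-insensitively.
  const words = (s: string): Set<string> =>
    new Set(s.toLowerCase().split(/\s+/).filter((w) => w.length > 0));

  const a = words(str1);
  const b = words(str2);
  if (a.size === 0 && b.size === 0) return 1; // two empty inputs are treated as identical

  let shared = 0;
  a.forEach((w) => {
    if (b.has(w)) shared++;
  });

  // |A ∩ B| / |A ∪ B|, where |A ∪ B| = |A| + |B| - |A ∩ B|
  return shared / (a.size + b.size - shared);
}

// {apple, pie} vs {apple, crumble, pie}: 2 shared words out of 3 distinct words.
console.log(jaccardSimilaritySketch("apple pie", "apple crumble pie")); // 0.6666666666666666
```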

3. ngramSimilarity(str1: string, str2: string, n?: number): number

Computes the n-gram similarity between two strings. Each string is split into overlapping sequences of n consecutive characters (n-grams), and the resulting sequences are compared.

  • Parameters:
    • str1: First string.
    • str2: Second string.
    • n: (Optional) Number of characters for the n-gram. Default is 2.
  • Return: Similarity score between 0 and 1.

import {ngramSimilarity} from 'simtext';

const score = ngramSimilarity("Roses are red, violets are blue", "Roses are red and the sky is blue", 2);
console.log(score);  // 0.4166666666666667
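
To show what splitting into n-grams looks like, the sketch below extracts overlapping character n-grams and compares the two sets with a Jaccard-style ratio. The README does not say how SimText scores the overlap, so the ratio used here is an assumption and this sketch may not reproduce the score above. The names `ngrams` and `ngramSimilaritySketch` are hypothetical.

```typescript
// Hypothetical helper: the set of overlapping character n-grams in a string.
function ngrams(s: string, n: number): Set<string> {
  const grams = new Set<string>();
  for (let i = 0; i + n <= s.length; i++) {
    grams.add(s.slice(i, i + n));
  }
  return grams;
}

// Assumed scoring: shared n-grams divided by distinct n-grams across both strings.
function ngramSimilaritySketch(str1: string, str2: string, n: number = 2): number {
  const a = ngrams(str1, n);
  const b = ngrams(str2, n);
  if (a.size === 0 && b.size === 0) return 1; // neither string is long enough to form an n-gram

  let shared = 0;
  a.forEach((g) => {
    if (b.has(g)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// "night" -> ni, ig, gh, ht and "nacht" -> na, ac, ch, ht:
// one shared bigram ("ht") out of seven distinct bigrams.
console.log(ngramSimilaritySketch("night", "nacht", 2));
```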

4. compareText(str1: string, str2: string): number

A general-purpose function that chooses the most appropriate similarity method based on the nature of the input strings and returns that method's score.

  • Parameters:
    • str1: First string.
    • str2: Second string.
  • Return: Similarity score between 0 and 1, using the method deemed best for the input strings.

import {compareText} from 'simtext';

const score = compareText("apple", "appel");
console.log(score);  // 0.6

Note: compareText uses heuristics to choose the similarity method. For example, if both strings are single words of fewer than 10 characters, it uses levenshteinSimilarity. If the combined character count of both strings is above 200, it uses jaccardSimilarity. Otherwise, it uses ngramSimilarity.
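
The selection rules in the note can be sketched as a small dispatcher. This is only an illustration of the heuristics as described, not SimText's implementation. The name `chooseMethod` is hypothetical, and reading "single word" as "contains no whitespace after trimming" is an assumption.

```typescript
type Method = "levenshtein" | "jaccard" | "ngram";

// Hypothetical sketch of the selection heuristics described in the note.
function chooseMethod(str1: string, str2: string): Method {
  // Assumption: a "single word" has no whitespace once trimmed.
  const isShortSingleWord = (s: string): boolean => {
    const t = s.trim();
    return !/\s/.test(t) && t.length < 10;
  };

  // Both inputs are single words under 10 characters: character-level edits matter most.
  if (isShortSingleWord(str1) && isShortSingleWord(str2)) return "levenshtein";

  // Combined length above 200 characters: compare word sets instead.
  if (str1.length + str2.length > 200) return "jaccard";

  // Everything in between falls back to n-grams.
  return "ngram";
}

console.log(chooseMethod("apple", "appel"));                 // levenshtein
console.log(chooseMethod("apple pie", "apple crumble pie")); // ngram
```

Under these assumptions, the "apple" / "appel" example is routed to the Levenshtein method, which matches the 0.6 shown above: two substitutions over five characters.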