@sygnl/text-normalizer

v1.0.0

Universal text normalization utilities for NLP, search, and data processing. Built with TypeScript for type safety and developer experience.

Features

  • 🔤 Accent Removal - Remove diacritics and accents (café → cafe)
  • 🔧 Punctuation Stripping - Remove or selectively preserve special characters
  • 📏 Whitespace Normalization - Collapse and trim whitespace intelligently
  • 🌍 Unicode Support - Handle multilingual text correctly
  • Zero Dependencies - Lightweight and fast
  • 📦 Tree-Shakeable - Import only what you need
  • 🎯 TypeScript First - Full type definitions and IntelliSense support
  • 🧪 Well Tested - Comprehensive test coverage

Installation

# Use whichever package manager your project uses
npm install @sygnl/text-normalizer
yarn add @sygnl/text-normalizer
pnpm add @sygnl/text-normalizer

Quick Start

import { normalize } from '@sygnl/text-normalizer';

// Basic usage - applies all normalizations
const text = "  CAFÉ!! Hello, World! 🌍  ";
const result = normalize(text);
console.log(result); // Output: "cafe hello world"

Usage

Full Normalization (Default)

import { normalize } from '@sygnl/text-normalizer';

normalize("Crème Brûlée @ $12.99!!!");
// Output: "creme brulee 1299"

Selective Normalization

import { normalize } from '@sygnl/text-normalizer';

// Keep uppercase and punctuation
normalize("HELLO, World!", {
  lowercase: false,
  stripPunctuation: false
});
// Output: "HELLO, World!"

// Preserve specific characters
normalize("[email protected]", {
  preserveChars: ['@', '.']
});
// Output: "[email protected]"

Individual Functions

import {
  removeAccents,
  stripPunctuation,
  collapseWhitespace,
  trim,
  lowercase
} from '@sygnl/text-normalizer';

removeAccents("café résumé");     // "cafe resume"
stripPunctuation("Hello, World!"); // "Hello World"
collapseWhitespace("a    b");      // "a b"
trim("  hello  ");                 // "hello"
lowercase("HELLO");                // "hello"
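
For readers wondering how accent removal of this kind is commonly done, here is a minimal sketch built on standard Unicode decomposition. It is an assumption about a typical approach, not the package's actual implementation, and `removeAccentsSketch` is a hypothetical name used only for illustration.

```typescript
// Hypothetical stand-in, NOT the package's removeAccents().
function removeAccentsSketch(text: string): string {
  // NFD splits "é" into "e" + U+0301 (combining acute accent);
  // the regex then drops combining marks in the range U+0300 to U+036F.
  return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

console.log(removeAccentsSketch("café résumé")); // "cafe resume"
```

Letters that have no canonical decomposition (for example "ß" or "ø") pass through this approach unchanged.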

Detailed Normalization (Debug Mode)

import { normalizeDetailed } from '@sygnl/text-normalizer';

const result = normalizeDetailed("  CAFÉ!!  ");
console.log(result);
/*
{
  original: "  CAFÉ!!  ",
  normalized: "cafe",
  steps: {
    afterLowercase: "  café!!  ",
    afterAccentRemoval: "  cafe!!  ",
    afterPunctuationStrip: "  cafe  ",
    afterWhitespaceCollapse: " cafe ",
    afterTrim: "cafe"
  }
}
*/
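
The step names in the output above suggest a fixed pipeline order. A rough sketch of how such a trace could be recorded follows; every helper here (`StepTrace`, `pipeline`, `traceNormalize`) is hypothetical and only approximates the behaviour described in this README.

```typescript
// Hypothetical sketch: none of these names come from the package.
interface StepTrace {
  original: string;
  normalized: string;
  steps: Record<string, string>;
}

type Step = [string, (s: string) => string];

// Order mirrors the step names shown in the example output.
const pipeline: Step[] = [
  ["afterLowercase", (s) => s.toLowerCase()],
  ["afterAccentRemoval", (s) => s.normalize("NFD").replace(/[\u0300-\u036f]/g, "")],
  ["afterPunctuationStrip", (s) => s.replace(/[^\p{L}\p{N}\s]/gu, "")],
  ["afterWhitespaceCollapse", (s) => s.replace(/\s+/g, " ")],
  ["afterTrim", (s) => s.trim()],
];

function traceNormalize(text: string): StepTrace {
  const steps: Record<string, string> = {};
  let current = text;
  for (const [label, fn] of pipeline) {
    current = fn(current);
    steps[label] = current; // snapshot after each stage
  }
  return { original: text, normalized: current, steps };
}

const trace = traceNormalize("  CAFÉ!!  ");
console.log(trace.normalized); // "cafe"
```

Keeping the stages in an ordered list makes it easy to see which stage changed the text when an output looks wrong.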

Custom Replacements

import { normalize } from '@sygnl/text-normalizer';

normalize("hello world", {
  customReplacements: {
    'hello': 'hi',
    'world': 'there'
  }
});
// Output: "hi there"

// Use regex patterns
normalize("test123", {
  customReplacements: {
    '\\d+': 'NUM'
  }
});
// Output: "testnum" (the replacement text is lowercased too, since lowercase defaults to true)
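
One way the regex example could produce "testnum" is if replacements run before lowercasing. The sketch below illustrates that ordering under the assumption that each key is compiled as a global regular expression; `applyReplacementsSketch` is a hypothetical helper, and the package's own `applyReplacements` may work differently.

```typescript
// Hypothetical helper; the package's applyReplacements() may differ.
function applyReplacementsSketch(
  text: string,
  replacements: Record<string, string>
): string {
  let out = text;
  for (const [pattern, replacement] of Object.entries(replacements)) {
    // Each key is compiled as a global regex, so "\\d+" matches digit runs.
    out = out.replace(new RegExp(pattern, "g"), replacement);
  }
  return out;
}

const replaced = applyReplacementsSketch("test123", { "\\d+": "NUM" });
console.log(replaced);               // "testNUM"
console.log(replaced.toLowerCase()); // "testnum"
```

With this approach, keys containing regex metacharacters such as "." or "+" would need escaping if they are meant literally.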

API Reference

normalize(text: string, options?: NormalizationOptions): string

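Main normalization function. By default it applies every step (lowercasing, accent removal, punctuation stripping, whitespace collapsing, and trimming); each step can be switched off through options.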

Parameters:

  • text - The input text to normalize
  • options - Optional configuration object

Options:

interface NormalizationOptions {
  lowercase?: boolean;              // Convert to lowercase (default: true)
  removeAccents?: boolean;          // Remove accents/diacritics (default: true)
  stripPunctuation?: boolean;       // Remove punctuation (default: true)
  collapseWhitespace?: boolean;     // Collapse whitespace (default: true)
  trim?: boolean;                   // Trim leading/trailing space (default: true)
  customReplacements?: Record<string, string>;  // Custom find/replace
  preserveChars?: string[];         // Characters to keep when stripping
}
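
Since every boolean flag defaults to true, callers only need to pass the flags they want to turn off. A small sketch of how such defaults could be resolved is shown here; `FlagOptions`, `DEFAULT_FLAGS`, and `resolveFlags` are hypothetical names, not part of the package.

```typescript
// Hypothetical mirror of the boolean flags in NormalizationOptions.
interface FlagOptions {
  lowercase?: boolean;
  removeAccents?: boolean;
  stripPunctuation?: boolean;
  collapseWhitespace?: boolean;
  trim?: boolean;
}

const DEFAULT_FLAGS: Required<FlagOptions> = {
  lowercase: true,
  removeAccents: true,
  stripPunctuation: true,
  collapseWhitespace: true,
  trim: true,
};

function resolveFlags(options: FlagOptions = {}): Required<FlagOptions> {
  // "??" falls back to the default only when a flag is missing or undefined,
  // so an explicit false is respected.
  return {
    lowercase: options.lowercase ?? DEFAULT_FLAGS.lowercase,
    removeAccents: options.removeAccents ?? DEFAULT_FLAGS.removeAccents,
    stripPunctuation: options.stripPunctuation ?? DEFAULT_FLAGS.stripPunctuation,
    collapseWhitespace: options.collapseWhitespace ?? DEFAULT_FLAGS.collapseWhitespace,
    trim: options.trim ?? DEFAULT_FLAGS.trim,
  };
}

const resolved = resolveFlags({ lowercase: false, stripPunctuation: false });
console.log(resolved.lowercase, resolved.trim); // false true
```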

normalizeDetailed(text: string, options?: NormalizationOptions): NormalizationResult

Returns the normalized string together with the intermediate result after each step.

Returns:

interface NormalizationResult {
  original: string;
  normalized: string;
  steps: {
    afterLowercase?: string;
    afterAccentRemoval?: string;
    afterPunctuationStrip?: string;
    afterWhitespaceCollapse?: string;
    afterTrim?: string;
  };
}

Individual Functions

removeAccents(text: string): string

Remove accents and diacritical marks from text.

stripPunctuation(text: string, options?: StripOptions): string

Remove punctuation and special characters.

Options:

interface StripOptions {
  preserve?: string[];           // Characters to preserve
  keepAlphanumeric?: boolean;    // Keep letters/numbers (default: true)
}
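
To make the `preserve` idea concrete, here is a sketch of punctuation stripping that keeps letters, digits, whitespace, and an allow-list of extra characters. It assumes Unicode property escapes are available (ES2018+) and is not the package's implementation; `stripPunctuationSketch` is a hypothetical name.

```typescript
// Hypothetical helper modelled on the StripOptions.preserve option above.
function stripPunctuationSketch(text: string, preserve: string[] = []): string {
  const keep = new Set(preserve);
  // Array.from iterates by code point, so emoji are treated as single units.
  return Array.from(text)
    .filter((ch) => keep.has(ch) || /[\p{L}\p{N}\s]/u.test(ch))
    .join("");
}

console.log(stripPunctuationSketch("Hello, World!"));        // "Hello World"
console.log(stripPunctuationSketch("a.b@c-d!", ["@", "."])); // "a.b@cd"
```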

collapseWhitespace(text: string, options?: WhitespaceOptions): string

Collapse runs of whitespace into single spaces.

Options:

interface WhitespaceOptions {
  replaceTabs?: boolean;         // Replace tabs with spaces (default: true)
  replaceNewlines?: boolean;     // Replace newlines with spaces (default: true)
  collapseSpaces?: boolean;      // Collapse multiple spaces (default: true)
}
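
The three flags can be read as independent passes over the text. The following sketch shows one plausible interpretation, under the assumption that tabs and newlines are first turned into spaces and that runs of spaces are collapsed last; `WhitespaceOptionsSketch` and `collapseWhitespaceSketch` are hypothetical names.

```typescript
// Hypothetical mirror of WhitespaceOptions; behaviour is an assumption.
interface WhitespaceOptionsSketch {
  replaceTabs?: boolean;
  replaceNewlines?: boolean;
  collapseSpaces?: boolean;
}

function collapseWhitespaceSketch(
  text: string,
  options: WhitespaceOptionsSketch = {}
): string {
  // Each flag defaults to true when omitted.
  const { replaceTabs = true, replaceNewlines = true, collapseSpaces = true } = options;
  let out = text;
  if (replaceTabs) out = out.replace(/\t/g, " ");
  if (replaceNewlines) out = out.replace(/\r?\n|\r/g, " ");
  if (collapseSpaces) out = out.replace(/ {2,}/g, " ");
  return out;
}

console.log(collapseWhitespaceSketch("a \t b\n\nc")); // "a b c"
console.log(JSON.stringify(collapseWhitespaceSketch("a\n\nb", { replaceNewlines: false }))); // "a\n\nb"
```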

trim(text: string): string

Remove leading and trailing whitespace.

lowercase(text: string): string

Convert text to lowercase.

applyReplacements(text: string, replacements: Record<string, string>): string

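Apply custom find/replace patterns. Keys are the patterns to find (plain strings or regex source, as in the Custom Replacements examples); values are the replacement text.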

Real-World Examples

E-commerce Product Matching

import { normalize } from '@sygnl/text-normalizer';

const titles = [
  "Men's Café Racer Leather Jacket - Black",
  "mens cafe racer leather jacket black",
  "MEN'S CAFÉ RACER LEATHER JACKET (BLACK)"
];

// All normalize to the same string for comparison
const normalized = titles.map(t => normalize(t));
console.log(normalized[0] === normalized[1]); // true
console.log(normalized[0] === normalized[2]); // true
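
Beyond pairwise comparison, the normalized string can serve as a grouping key. The sketch below buckets titles by that key; `normalizeSketch` is a local stand-in that approximates the default behaviour described in this README (it is not the package's `normalize`), and `groupByNormalizedTitle` is a hypothetical helper.

```typescript
// Local stand-in approximating the documented defaults; NOT the package's normalize().
function normalizeSketch(text: string): string {
  return text
    .toLowerCase()
    .normalize("NFD")
    .replace(/[\u0300-\u036f]/g, "")   // drop combining accents
    .replace(/[^\p{L}\p{N}\s]/gu, "")  // drop punctuation and symbols
    .replace(/\s+/g, " ")              // collapse whitespace
    .trim();
}

// Bucket the original titles under their normalized form.
function groupByNormalizedTitle(titles: string[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  for (const title of titles) {
    const key = normalizeSketch(title);
    const bucket = groups.get(key) ?? [];
    bucket.push(title);
    groups.set(key, bucket);
  }
  return groups;
}

const groups = groupByNormalizedTitle([
  "Men's Café Racer Leather Jacket - Black",
  "mens cafe racer leather jacket black",
  "MEN'S CAFÉ RACER LEATHER JACKET (BLACK)",
  "Women's Denim Jacket",
]);
console.log(groups.size); // 2
```

A `Map` keyed by the normalized title keeps the original spellings available for display while matching on the canonical form.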

Search Query Normalization

import { normalize } from '@sygnl/text-normalizer';

const userQueries = [
  'crème brûlée recipe',
  'Creme Brulee Recipe',
  'CRÈME BRÛLÉE RECIPE!!!'
];

const searchTerm = normalize(userQueries[0]);
// Use searchTerm for database query

Email Address Normalization

import { normalize } from '@sygnl/text-normalizer';

const emails = [
  '  [email protected]  ',
  '[email protected]',
  '[email protected]!!!'
];

const normalized = emails.map(e => 
  normalize(e, { preserveChars: ['@', '.'] })
);
// All become: "[email protected]"

Multilingual Text Processing

import { normalize } from '@sygnl/text-normalizer';

const translations = {
  french: 'Crème Brûlée',
  spanish: 'Señorita Niño',
  german: 'Übermensch Äpfel'
};

Object.entries(translations).forEach(([lang, text]) => {
  console.log(`${lang}: ${normalize(text)}`);
});
// french: creme brulee
// spanish: senorita nino
// german: ubermensch apfel
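
Some letters used in European languages have no canonical decomposition, so decomposition-based accent removal leaves them untouched. Whether this package handles them is not stated here; the sketch below shows one possible fallback using a small lookup table. `FALLBACKS` and `foldToAscii` are hypothetical, and the table is deliberately incomplete.

```typescript
// Hypothetical fallback table for letters that NFD does not decompose.
const FALLBACKS: Record<string, string> = {
  "ß": "ss",
  "ø": "o",
  "æ": "ae",
  "ł": "l",
  "đ": "d",
};

function foldToAscii(text: string): string {
  // First remove combining accents, then map the leftovers via the table.
  const stripped = text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
  return Array.from(stripped)
    .map((ch) => FALLBACKS[ch] ?? ch)
    .join("");
}

console.log(foldToAscii("straße smørrebrød łódź")); // "strasse smorrebrod lodz"
```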

TypeScript Support

Full TypeScript definitions are included. Import types as needed:

import type {
  NormalizationOptions,
  NormalizationResult,
  WhitespaceOptions,
  StripOptions,
  CharacterClass
} from '@sygnl/text-normalizer';

Performance

  • Zero dependencies - No external packages required
  • Lightweight - < 5KB minified
  • Fast - Optimized for performance with minimal allocations
  • Tree-shakeable - Import only the functions you need

Browser Support

Works in all modern browsers and Node.js environments:

  • ✅ Node.js 14+
  • ✅ Chrome, Firefox, Safari, Edge (latest versions)
  • ✅ ES2020+ environments

Use Cases

  • 🔍 Search & Indexing - Normalize text before indexing for better search results
  • 🛒 E-commerce - Match product titles across different stores
  • 🌐 i18n - Handle multilingual text consistently
  • 📊 Data Cleaning - Prepare text data for analysis
  • 🤖 NLP Pipelines - First step in text processing workflows
  • 🔐 User Input - Sanitize and standardize user-entered data

Contributing

Contributions are welcome! This package is part of the UPID ecosystem.

License

Apache License 2.0 - see LICENSE file for details

Author

Edge Foundry Inc. - 2206


Made with ❤️ for the text processing community