utf8proc.c

v2.11.2

A clean C library for processing UTF-8 Unicode data; Steven G. Johnson (2014).

utf8proc

utf8proc is a small, clean C library that provides Unicode normalization, case-folding, and other operations for data in the UTF-8 encoding. It was initially developed by Jan Behrens and the rest of the Public Software Group, who deserve nearly all of the credit for this package. With the blessing of the Public Software Group, the Julia developers have taken over development of utf8proc, since the original developers have moved to other projects.

utf8proc is used for basic Unicode support in the Julia language, and the Julia developers became involved because they wanted to add Unicode 7 support and other features; it is now regularly updated to keep up with recent Unicode releases.

There are also utf8proc wrappers for Ruby and Rust (github). (The original utf8proc package also included PostgreSQL bindings.)

The utf8proc package is licensed under the free/open-source MIT "expat" license (plus certain Unicode data governed by the similarly permissive Unicode data license); please see the included LICENSE.md file for more detailed information.

Quick Start

Typical users should download a utf8proc release rather than cloning directly from GitHub.

For compilation of the C library, run make. You can also install the library and header file with make install (by default into /usr/local/lib and /usr/local/include, but this can be changed with make install prefix=/some/dir). make check runs some tests, and make clean deletes all of the generated files.

Alternatively, you can compile with cmake, e.g. by

mkdir build
cmake -S . -B build
cmake --build build

Using other compilers

The included Makefile supports GNU/Linux flavors and macOS with gcc-like compilers; Windows users will typically use cmake.

For other Unix-like systems and other compilers, you may need to pass modified settings to make in order to use the correct compilation flags for building shared libraries on your system.

For HP-UX with HP's aCC compiler and GNU Make (installed as gmake), you can compile with

gmake CC=/opt/aCC/bin/aCC CFLAGS="+O2" PICFLAG="+z" C99FLAG="-Ae" WCFLAGS="+w" LDFLAG_SHARED="-b" SOFLAG="-Wl,+h"

To run gmake install you will need GNU coreutils for the install command, and you may want to pass prefix=/opt libdir=/opt/lib/hpux32 or similar to change the installation location.

Using with CMake

A CMake Config-file package is provided. To use utf8proc in a CMake project:

add_executable (app app.c)
find_package (utf8proc 2.9.0 REQUIRED)
target_link_libraries (app PRIVATE utf8proc::utf8proc)

General Information

The C library is found in this directory after successful compilation and is named libutf8proc.a (for the static library) and libutf8proc.so (for the dynamic library).

The Unicode version supported is 17.0.0.

For Unicode normalizations, the following options are used:

  • Normalization Form C: STABLE, COMPOSE
  • Normalization Form D: STABLE, DECOMPOSE
  • Normalization Form KC: STABLE, COMPOSE, COMPAT
  • Normalization Form KD: STABLE, DECOMPOSE, COMPAT

C Library

The documentation for the C library is found in the utf8proc.h header file. utf8proc_map is the function you will most likely be using for mapping UTF-8 strings, unless you want to allocate memory yourself.

To Do

See the GitHub issues list.

Contact

Bug reports, feature requests, and other queries can be filed at the utf8proc issues page on GitHub.

See also

An independent Lua translation of this library, lua-mojibake, is also available.

Examples

Convert codepoint to string

// Convert codepoint `a` to utf8 string `str`
utf8proc_int32_t a = 223;
utf8proc_uint8_t str[16] = { 0 };
utf8proc_encode_char(a, str);
printf("%s\n", str);
// ß

Convert string to codepoint

// Decode the first codepoint of string `str` into `a`
utf8proc_uint8_t str[] = "ß";
utf8proc_int32_t a;
utf8proc_iterate(str, -1, &a);
printf("%d\n", a);
// 223

Casefold

// Convert "ß" (U+00DF) to its casefold variant "ss"
utf8proc_uint8_t str[] = "ß";
utf8proc_uint8_t *fold_str;
utf8proc_map(str, 0, &fold_str, UTF8PROC_NULLTERM | UTF8PROC_CASEFOLD);
printf("%s\n", fold_str);
// ss
free(fold_str);

Normalization Form C/D (NFC/NFD)

// Decompose "\u00e4\u00f6\u00fc" = "äöü" into "a\u0308o\u0308u\u0308" (= "äöü" via combining char U+0308)
utf8proc_uint8_t input[] = {0xc3, 0xa4, 0xc3, 0xb6, 0xc3, 0xbc, 0x00}; // "\u00e4\u00f6\u00fc" = "äöü" in UTF-8, NUL-terminated as utf8proc_NFD requires
utf8proc_uint8_t *nfd = utf8proc_NFD(input); // = {0x61, 0xcc, 0x88, 0x6f, 0xcc, 0x88, 0x75, 0xcc, 0x88, 0x00}

// Compose "a\u0308o\u0308u\u0308" into "\u00e4\u00f6\u00fc" (= "äöü" via precomposed characters)
utf8proc_uint8_t *nfc = utf8proc_NFC(nfd);

free(nfd);
free(nfc);
