kanji-data

v1.1.0

A distilled, offline-first kanji database with zero dependencies. Instant access to 13,000+ kanji and vocabulary via build-time sharding and lazy evaluation.

kanji-data 👹 — Offline Kanji Database for Node.js

A distilled, offline-first kanji database for Node.js with zero dependencies. Provides instant access to 13,000+ kanji characters and vocabulary, optimized with lazy-loading shards for memory-constrained serverless environments.

⚡️ Production Use: This library is used to assist in compiling the comprehensive kanji data for Jepang.org.


The Problem

Typically, accessing a comprehensive Japanese dictionary offline means parsing a massive 100MB+ JSON file.

  • Loading a file that large blocks the Node.js event loop, resulting in terrible app startup times.
  • It easily consumes 300MB+ of RAM once parsed, which can exceed the memory limits of serverless environments (like AWS Lambda, Vercel, or Netlify) and crash the function.
  • Relying on local databases (like SQLite) often introduces bulky C++ dependencies (node-gyp) that cause cross-platform installation errors.

The Solution

kanji-data solves the memory problem using build-time data sharding and lazy evaluation.

Instead of shipping one massive file, the database is pre-compiled into tiny optimized chunks. Core metadata is loaded instantly, while massive vocabulary lists are split by Unicode hex-prefix and only loaded into memory (~1MB at a time) exactly when requested.
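
To make the idea concrete, here is a minimal sketch of build-time sharding plus a lazy, cached loader. It is not the package's actual implementation: the shard key (first two hex digits of the code point), the file layout, and the names `shardKey`, `buildShards`, and `createLoader` are all assumptions made for illustration.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical shard key: the first two hex digits of the code point.
// '猫' is U+732B, so it lands in shard "73".
function shardKey(character) {
  const hex = character.codePointAt(0).toString(16).toUpperCase().padStart(4, '0');
  return hex.slice(0, 2);
}

// Build step: group words by shard key and write one JSON file per shard.
function buildShards(wordsByKanji, outDir) {
  const shards = {};
  for (const [character, words] of Object.entries(wordsByKanji)) {
    const key = shardKey(character);
    if (!shards[key]) shards[key] = {};
    shards[key][character] = words;
  }
  fs.mkdirSync(outDir, { recursive: true });
  for (const [key, contents] of Object.entries(shards)) {
    fs.writeFileSync(path.join(outDir, `${key}.json`), JSON.stringify(contents));
  }
}

// Runtime: load a shard only when first requested, then keep it in memory.
function createLoader(shardDir) {
  const cache = new Map();
  let shardLoads = 0;
  function getWords(character) {
    const key = shardKey(character);
    if (!cache.has(key)) {
      const file = path.join(shardDir, `${key}.json`);
      const contents = fs.existsSync(file)
        ? JSON.parse(fs.readFileSync(file, 'utf8'))
        : {}; // no shard on disk: remember that, so we do not retry
      cache.set(key, contents);
      shardLoads += 1;
    }
    return cache.get(key)[character] || [];
  }
  return { getWords, getShardLoads: () => shardLoads };
}

// Demo with a tiny made-up dataset written to a temporary directory.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'shard-demo-'));
buildShards({ '猫': ['猫', '子猫'], '犬': ['犬', '子犬'], '火': ['火事'] }, demoDir);
const loader = createLoader(demoDir);
```

In this sketch, calling `loader.getWords('猫')` twice touches the disk only once; the second call is answered from the `Map`. Memory grows only with the shards that were actually requested.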

Features

  • 📦 Zero Dependencies: Pure JavaScript and JSON. No databases, no binaries.
  • Serverless Ready: Cold starts are nearly instantaneous with a tiny memory footprint.
  • 📴 100% Offline: No API keys, no rate limits, no network latency.
  • 🧠 Smart Caching: Chunks are cached in memory after the first read for lightning-fast subsequent queries.
  • 🔷 TypeScript Ready: Full .d.ts type definitions included.

Installation

npm install kanji-data

Usage

const kanji = require('kanji-data');

// 1. Get core kanji metadata (meanings, readings, stroke count, etc.)
const neko = kanji.get('猫');
console.log(neko.meanings);       // ['cat']
console.log(neko.kun_readings);   // ['ねこ']
console.log(neko.jlpt);           // 3
console.log(neko.stroke_count);   // 11

// 2. Fetch vocabulary containing a specific kanji
// (lazily loads the required ~1MB vocabulary shard on first call)
const nekoWords = kanji.getWords('猫');
console.log(nekoWords[0]);
/*
{
  "variants": [
    { "written": "猫", "pronounced": "ねこ", "priorities": ["spec1"] }
  ],
  "meanings": [
    { "glosses": ["cat"] }
  ]
}
*/

// 3. Get lists of kanji by JLPT level (N5 to N1)
const n5Kanji = kanji.getJlpt(5);
console.log(n5Kanji); // ['一', '二', '三', '日', '月', ...]

// 4. Get lists of kanji by school grade
const grade1 = kanji.getGrade(1);
console.log(grade1); // ['一', '右', '雨', '円', '王', ...]

// 5. Get all kanji in the database
const all = kanji.getAll();
console.log(all.length); // 13108

// 6. Extract kanji from any Japanese text
const found = kanji.extractKanji('私は猫が好きです');
console.log(found); // ['私', '猫', '好']

// 7. Search by meaning or reading
const results = kanji.search('fire');
console.log(results[0].kanji); // '火'

// 8. Get a random kanji (optionally filtered)
const random = kanji.getRandom({ jlpt: 5 });
console.log(random.kanji); // (random N5 kanji)

API Reference

get(character: string): KanjiMetadata | null

Returns core metadata for a given kanji character. Returns null if not found.

{
  kanji: "猫",
  grade: 8,                      // School grade (1–6, 8–9) or null
  stroke_count: 11,
  meanings: ["cat"],
  kun_readings: ["ねこ"],
  on_readings: ["ビョウ"],
  name_readings: [],
  jlpt: 3,                       // JLPT level (1–5) or null
  unicode: "732B",
  heisig_en: "cat",              // Heisig keyword (may be null)
  freq_mainichi_shinbun: 1702,   // Newspaper frequency rank (may be null)
  notes: []
}
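
Because get() can return null and some fields may be null, callers usually want a small guard around it. The helper below is a sketch, not part of the package: `describeKanji` is a hypothetical name, and `kanjiLib` is assumed to be the object returned by require('kanji-data'), passed in as a parameter so the helper can also be exercised with a stub.

```javascript
// `kanjiLib` is assumed to expose get(character) as documented above.
function describeKanji(kanjiLib, character) {
  const entry = kanjiLib.get(character);
  if (entry === null) {
    return `${character}: not in database`;
  }
  // jlpt may be null for kanji outside the JLPT lists.
  const jlpt = entry.jlpt === null ? 'no JLPT level' : `JLPT N${entry.jlpt}`;
  const readings = [...entry.kun_readings, ...entry.on_readings].join(', ');
  return `${entry.kanji} (${entry.meanings.join(', ')}): ${readings}; ` +
    `${entry.stroke_count} strokes; ${jlpt}`;
}
```

With the real package this would be called as `describeKanji(require('kanji-data'), '猫')`.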

getWords(character: string): Word[]

Returns an array of vocabulary words that use the specified kanji. Returns [] if none found.

Uses lazy loading: the first call reads a ~1MB shard from disk and caches it. Subsequent calls that hit the same shard are served from memory.

{
  variants: [
    {
      written: "猫",
      pronounced: "ねこ",
      priorities: ["spec1", "ichi1"]   // frequency lists (may be empty)
    }
  ],
  meanings: [
    { glosses: ["cat"] }
  ]
}
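
The Word shape above can be post-processed with plain array methods. The sketch below keeps only words that carry at least one priority tag and formats them for display. Treating a non-empty `priorities` array as a sign of a common word is an assumption on my part, and `commonWords` and `summarizeWord` are hypothetical helper names, not package functions.

```javascript
// Keep words where at least one variant appears on a frequency list.
function commonWords(words) {
  return words.filter((word) =>
    word.variants.some((variant) => variant.priorities.length > 0)
  );
}

// One-line summary built from the first variant and all glosses.
function summarizeWord(word) {
  const { written, pronounced } = word.variants[0];
  const glosses = word.meanings.flatMap((meaning) => meaning.glosses).join('; ');
  return `${written} (${pronounced}): ${glosses}`;
}
```

With the real package, `commonWords(kanji.getWords('猫')).map(summarizeWord)` would give a short printable list.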

getJlpt(level: number): string[]

Returns kanji in the specified JLPT level (1–5). Returns [] for invalid levels.

kanji.getJlpt(5);  // ['一', '二', '三', ...]  ← N5 (easiest)
kanji.getJlpt(1);  // ['蹴', '串', '厨', ...]  ← N1 (hardest)
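
Study tools often want every kanji up to a given level rather than a single level. A minimal sketch, assuming `kanjiLib` is the object returned by require('kanji-data'); `kanjiUpToLevel` is a hypothetical helper, not part of the package.

```javascript
// Collect kanji from N5 (easiest) down to `hardestLevel`, in that order.
// Example: hardestLevel = 3 gathers N5, then N4, then N3.
function kanjiUpToLevel(kanjiLib, hardestLevel) {
  const result = [];
  for (let level = 5; level >= hardestLevel; level -= 1) {
    result.push(...kanjiLib.getJlpt(level));
  }
  return result;
}
```

Since getJlpt returns [] for invalid levels, an out-of-range `hardestLevel` such as 0 simply contributes nothing extra.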

getGrade(grade: number): string[]

Returns kanji taught in the specified Japanese school grade. Returns [] for grades with no data.

| Grade | Level |
|---|---|
| 1–6 | Elementary school (教育漢字) |
| 8 | Secondary school / Jōyō kanji not in grades 1–6 |
| 9 | Jinmeiyō kanji (used in names) |

kanji.getGrade(1); // ['一', '右', '雨', ...]
kanji.getGrade(8); // ['亜', '哀', '握', ...]

getAll(): string[]

Returns an array of all ~13,000 kanji characters in the database.

const allKanji = kanji.getAll();
console.log(allKanji.length); // 13108

extractKanji(text: string): string[]

Extracts unique kanji characters from a string of Japanese text. Only returns characters present in the database.

kanji.extractKanji('私は猫が好きです');
// ['私', '猫', '好']

kanji.extractKanji('hello'); // []
kanji.extractKanji('ひらがなだけ'); // []
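
The behaviour described here (unique characters, first-seen order, limited to known kanji) can be approximated in a few lines. This is a sketch of one possible approach, not the package's code: it assumes a Unicode script regex is an acceptable way to find candidates, and `extractKnownKanji` and `knownKanji` are hypothetical names.

```javascript
// Find Han-script characters, de-duplicate them, and keep only those
// present in `knownKanji` (a Set standing in for the database).
function extractKnownKanji(text, knownKanji) {
  const candidates = text.match(/\p{Script=Han}/gu) || [];
  // A Set preserves insertion order, so first occurrences come first.
  return [...new Set(candidates)].filter((ch) => knownKanji.has(ch));
}
```

Hiragana, katakana, and Latin text never match the Han script class, which is why inputs such as 'hello' produce an empty array.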

search(query: string): KanjiMetadata[]

Searches for kanji by English meaning or Japanese reading. Performs case-insensitive partial matching on meanings, kun readings, and on readings.

kanji.search('cat');     // [{ kanji: '猫', meanings: ['cat'], ... }, ...]
kanji.search('ねこ');    // [{ kanji: '猫', ... }]
kanji.search('fire');    // [{ kanji: '火', ... }, ...]
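
For readers curious what case-insensitive partial matching implies, here is a sketch over an in-memory array of KanjiMetadata-shaped objects. It illustrates the matching rule as described, not the package's implementation; `searchEntries` is a hypothetical name, and returning [] for a blank query is my own choice.

```javascript
// Match `query` as a substring of any meaning, kun reading, or on reading.
function searchEntries(entries, query) {
  const needle = query.trim().toLowerCase();
  if (needle === '') return [];
  return entries.filter((entry) =>
    [...entry.meanings, ...entry.kun_readings, ...entry.on_readings]
      .some((field) => field.toLowerCase().includes(needle))
  );
}
```

Note that substring matching is broad: a query of 'cat' would also match an entry whose meaning is 'catalogue', so callers that need exact matches should filter the results further.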

getByStrokeCount(count: number): KanjiMetadata[]

Returns an array of kanji with the specified stroke count. Returns [] for invalid input (zero, negative, non-integer).

kanji.getByStrokeCount(1);  // [{ kanji: '一', stroke_count: 1, ... }, ...]
kanji.getByStrokeCount(11); // [{ kanji: '猫', ... }, ...]
kanji.getByStrokeCount(0);  // []

getRandom(options?: { jlpt?: number, grade?: number }): KanjiMetadata | null

Returns a random kanji, optionally filtered by JLPT level and/or school grade. Returns null when no kanji match the filters.

kanji.getRandom();               // { kanji: '猫', ... } (any random kanji)
kanji.getRandom({ jlpt: 5 });    // guaranteed N5 kanji
kanji.getRandom({ grade: 1 });   // guaranteed grade 1 kanji
kanji.getRandom({ jlpt: 5, grade: 1 }); // both filters applied
kanji.getRandom({ grade: 99 });  // null (no match)
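
The filter semantics above (both filters must match, null when nothing matches) can be sketched as follows. This is an illustration rather than the package's code: `pickRandom` is a hypothetical name, and the injectable `rng` parameter is an addition of mine to make the function testable.

```javascript
// Filter by the options that were provided, then pick uniformly at random.
function pickRandom(entries, options = {}, rng = Math.random) {
  const pool = entries.filter((entry) =>
    (options.jlpt === undefined || entry.jlpt === options.jlpt) &&
    (options.grade === undefined || entry.grade === options.grade)
  );
  if (pool.length === 0) return null;
  // rng() is in [0, 1), so the index is always within bounds.
  return pool[Math.floor(rng() * pool.length)];
}
```

Callers of the real getRandom should check for null in the same way before reading `.kanji` from the result.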

searchWords(query: string): Word[]

Searches for vocabulary words by English meaning or reading across all shards. Performs case-insensitive partial matching on glosses and readings.

⚠️ Performance Note: The first call loads all word shards (~100 files) into memory. Subsequent calls are served from the in-memory cache.

kanji.searchWords('cat');  // [{ variants: [...], meanings: [{ glosses: ['cat'] }] }, ...]
kanji.searchWords('ねこ'); // finds words with reading ねこ
kanji.searchWords('xyz');  // []

Examples

The examples/ directory contains a fully interactive console quiz that demos the package.

# Run the quiz directly (data is included!)
node examples/quiz.js

# Options
node examples/quiz.js --level=5      # N5 only (easiest, 79 kanji)
node examples/quiz.js --level=3      # N5–N3 (default, ~600 kanji)
node examples/quiz.js --rounds=20    # longer session

Each round presents a 4-option multiple-choice question — either "guess the meaning" or "which kanji matches this reading". After every answer it shows example vocabulary words loaded live from the data shards.
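
A "guess the meaning" round like the one described can be built from any array of KanjiMetadata-shaped objects. The sketch below is not the code in examples/quiz.js; `shuffle` and `makeMeaningQuestion` are hypothetical helpers, and it assumes the first meaning of each entry is a reasonable answer label.

```javascript
// Fisher-Yates shuffle on a copy, with an injectable random source.
function shuffle(items, rng = Math.random) {
  const copy = [...items];
  for (let i = copy.length - 1; i > 0; i -= 1) {
    const j = Math.floor(rng() * (i + 1));
    [copy[i], copy[j]] = [copy[j], copy[i]];
  }
  return copy;
}

// Pick one answer and three distractors, then shuffle the option order.
// Entries with identical first meanings would make the answer ambiguous,
// so callers should de-duplicate by meaning beforehand.
function makeMeaningQuestion(entries, rng = Math.random) {
  if (entries.length < 4) {
    throw new Error('need at least 4 entries to build a question');
  }
  const [answer, ...distractors] = shuffle(entries, rng).slice(0, 4);
  const options = shuffle(
    [answer, ...distractors].map((entry) => entry.meanings[0]),
    rng
  );
  return {
    prompt: answer.kanji,
    options,
    correctIndex: options.indexOf(answer.meanings[0]),
  };
}
```

With the real package, the entries could come from something like `kanji.getJlpt(5).map((ch) => kanji.get(ch))`.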

Contributing

Found a bug or want to improve the data pipeline? PRs are welcome!

  • Bug reports → GitHub Issues
  • The raw data lives in references/kanjiapi_full.json
  • Run npm test before submitting a PR

About

kanji-data is an npm package authored and maintained by Septian Ganendra S. K. at Jepang.org — Indonesia's comprehensive Japanese learning platform. This package optimizes and repackages the kanjiapi.dev dataset into lazy-loading shards for production Node.js use.

📚 If you use this package in your project, we'd appreciate a link back to Jepang.org! It helps us continue maintaining and expanding this free resource for Japanese learners worldwide.

Related Packages

  • kanji-png — Generate kanji PNGs and animated stroke-order GIFs.
  • kotowaza — Japanese proverbs (ことわざ) dataset with bilingual meanings and JLPT levels.

Attribution & License

This package is licensed under the MIT License — see LICENSE for details.

The underlying dictionary data originates from kanjiapi.dev (MIT), which uses the EDICT and KANJIDIC dictionary files — the property of the Electronic Dictionary Research and Development Group, used in conformance with the Group's licence. JLPT level data sourced from Jonathan Waller's JLPT Resources.


MIT © Septian Ganendra S. K.