staticseek
v2.7.1
A Lightweight, Fast Full-text Search Engine Supporting All Unicode Languages
staticseek: A Lightweight, Fast Full-text Search Engine Supporting All Unicode Languages
For detailed instructions on how to use staticseek, please visit the official website.
This package is developed using Node.js v22.
To build staticseek, execute the following command:
git clone https://github.com/osawa-naotaka/staticseek.git
cd staticseek
npm install
npm run build
The JavaScript files will be generated in the dist directory.
Additionally, running the npm run dev command lets you execute the application and the benchmark. Please refer to the comments in index.html for details.
Overview
staticseek is a client-side full-text search engine designed specifically for static websites. It enables searching through arrays of JavaScript objects containing strings or string arrays. By converting your articles into JavaScript objects, you can implement full-text search functionality on static sites without any server-side implementation.
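As an illustration of "converting your articles into JavaScript objects", here is a minimal sketch. The rawPosts data, its field names (slug, title, body, tags), and the toSearchable helper are hypothetical and not part of staticseek; the only point is that each object ends up holding strings or string arrays.

```javascript
// Hypothetical raw posts, e.g. as produced by your site generator (not a staticseek format).
const rawPosts = [
  { slug: "hello", title: "Hello World", body: "First post.\n\nWelcome!", tags: ["intro", "news"] },
  { slug: "search", title: "Adding Search", body: "Full-text  search on a static site.", tags: [] },
];

// Hypothetical helper: keep only string and string-array fields,
// and collapse runs of whitespace in the body.
function toSearchable(posts) {
  return posts.map((post) => ({
    slug: post.slug,
    title: post.title,
    body: post.body.replace(/\s+/g, " ").trim(),
    tags: post.tags,
  }));
}

const array_of_articles = toSearchable(rawPosts);
console.log(array_of_articles[0].body); // "First post. Welcome!"
```

An array shaped like array_of_articles is what the Quick Start code expects as input.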
Key Features
- Simple and intuitive API
- Support for fuzzy search with customizable edit distance
- Advanced search operations (AND, OR, NOT)
- Field-specific search capabilities
- TF-IDF based scoring with customizable weights
- Google-like query syntax
- Unicode support for all languages including CJK characters and emojis
- Multiple index implementations for different performance needs
- Seamless integration with popular Static Site Generators (SSG)
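For readers unfamiliar with TF-IDF, the following is a generic sketch of the idea, not staticseek's actual scoring code. The tokenize and tfidfScores functions and the weight parameter are hypothetical names chosen for illustration, and the whitespace tokenizer is a deliberate simplification.

```javascript
// Naive whitespace tokenizer; a real engine handles Unicode far more carefully.
const tokenize = (text) => text.toLowerCase().split(/\s+/).filter(Boolean);

// Score every document for a single term as tf * idf * weight.
function tfidfScores(docs, term, weight = 1) {
  const tokenized = docs.map(tokenize);
  const docsWithTerm = tokenized.filter((tokens) => tokens.includes(term)).length;
  if (docsWithTerm === 0) return docs.map(() => 0);
  // "+ 1" keeps the score positive even when the term occurs in every document.
  const idf = Math.log(docs.length / docsWithTerm) + 1;
  return tokenized.map((tokens) => {
    if (tokens.length === 0) return 0;
    const tf = tokens.filter((t) => t === term).length / tokens.length;
    return tf * idf * weight;
  });
}

const docs = ["static site search", "search search engine", "static hosting"];
console.log(tfidfScores(docs, "search"));
```

The second document scores highest because the term makes up a larger share of its tokens; a larger weight scales all scores for that field proportionally.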
Quick Start
First, install staticseek in your environment:
npm install staticseek
Alternatively, you can import staticseek directly from jsDelivr's CDN service.
Next, import staticseek into your project and perform indexing and searching.
Here, array_of_articles represents an array of JavaScript objects containing the text to be searched.
import { LinearIndex, createIndex, search, StaticSeekError } from "staticseek";
// Create an index
const index = createIndex(LinearIndex, array_of_articles);
if (index instanceof StaticSeekError) throw index;
// Perform a search
const result = await search(index, "search word");
if (result instanceof StaticSeekError) throw result;
for (const r of result) {
    console.log(array_of_articles[r.id]);
}
The search results are returned as an array, sorted by score (relevance). The id field in each result contains the array index of the matching document.
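Because id is an array index, turning results back into articles is a plain lookup. The sketch below uses hand-written stand-in results rather than a real search() call; the score field name and the resolveArticles helper are assumptions made for illustration.

```javascript
const array_of_articles = [
  { title: "Hello World" },
  { title: "Adding Search" },
  { title: "Deploying" },
];

// Stand-in for the array returned by search(); only `id` is documented above,
// `score` is an assumed field name.
const result = [
  { id: 2, score: 0.9 },
  { id: 0, score: 0.4 },
];

// Hypothetical helper: resolve each result to its article, keeping relevance order.
function resolveArticles(results, articles, limit = 10) {
  return results.slice(0, limit).map((r) => articles[r.id]);
}

const titles = resolveArticles(result, array_of_articles).map((a) => a.title);
console.log(titles); // [ 'Deploying', 'Hello World' ]
```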
To accelerate searches using WebGPU, use the following code. The usage after index creation remains the same as above.
import { GPULinearIndex, createIndex, search, StaticSeekError } from "staticseek";
const index = createIndex(GPULinearIndex, array_of_articles);
...
If you experience performance issues, try using a speed-optimized index.
import { HybridTrieBigramInvertedIndex, createIndex, search, StaticSeekError } from "staticseek";
const index = createIndex(HybridTrieBigramInvertedIndex, array_of_articles);
...
Search Features
Query Syntax
- Fuzzy Search: Default behavior with configurable edit distance
  distance:2 searchterm (allows 2 character edits)
- Exact Match: "exact phrase"
- AND Search: term1 term2
- OR Search: term1 OR term2
- NOT Search: -term1 term2
- Field-Specific: from:title searchterm
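To make "allows 2 character edits" concrete, the sketch below computes the classic Levenshtein edit distance between two whole strings. It is a textbook illustration only: staticseek's internal fuzzy matching searches inside document text and may use a different algorithm. The editDistance and withinDistance names are hypothetical.

```javascript
// Textbook dynamic-programming Levenshtein distance (insert, delete, substitute).
function editDistance(a, b) {
  const s = Array.from(a); // split by code point so an emoji counts as one character
  const t = Array.from(b);
  let prev = Array.from({ length: t.length + 1 }, (_, j) => j);
  for (let i = 1; i <= s.length; i++) {
    const curr = [i];
    for (let j = 1; j <= t.length; j++) {
      const cost = s[i - 1] === t[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[t.length];
}

// True when `word` is within `max` edits of `term`.
const withinDistance = (term, word, max) => editDistance(term, word) <= max;

console.log(editDistance("search", "serch"));      // 1
console.log(withinDistance("kitten", "sitting", 2)); // false (distance is 3)
```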
Index Types
LinearIndex (Default)
- Best for small to medium-sized content
- Simple and reliable
- Good balance of performance and accuracy
GPULinearIndex
- WebGPU-accelerated fuzzy search
- 2-10x faster for larger datasets
- Gracefully falls back to LinearIndex when WebGPU is unavailable
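Since GPULinearIndex falls back on its own, no feature check is required for correctness. If you want to know which path is likely to be taken (for logging or a UI hint, say), a check along these lines may help. navigator.gpu is the standard WebGPU entry point; the hasWebGPU helper is a hypothetical name, and the presence of navigator.gpu does not guarantee that an adapter can actually be acquired.

```javascript
// Hypothetical helper: true when the navigator-like object exposes the WebGPU entry point.
function hasWebGPU(nav = globalThis.navigator) {
  return typeof nav === "object" && nav !== null && "gpu" in nav && nav.gpu != null;
}

console.log(hasWebGPU({}));          // false
console.log(hasWebGPU({ gpu: {} })); // true
```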
HybridTrieBigramInvertedIndex
- ~100x faster search performance
- Ideal for larger datasets
- Trade-offs:
  - Higher false positive rate for CJK-like languages
  - Less precise fuzzy search for CJK-like languages
  - Limited result metadata
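The README does not describe the internals of this index, so the following is only a general illustration of why bigram-based matching can produce false positives in languages written without spaces. It assumes a simplified, position-free bigram lookup; the bigrams and bigramMatch functions are hypothetical.

```javascript
// Split text into the set of overlapping two-character tokens (by code point).
function bigrams(text) {
  const chars = Array.from(text);
  const out = new Set();
  for (let i = 0; i + 1 < chars.length; i++) out.add(chars[i] + chars[i + 1]);
  return out;
}

// Position-free matching: the document matches if it contains every query bigram,
// regardless of where each bigram occurs.
function bigramMatch(doc, query) {
  const docGrams = bigrams(doc);
  return [...bigrams(query)].every((g) => docGrams.has(g));
}

const doc = "東京に行き、京都にも行く";
console.log(doc.includes("東京都"));     // false: the phrase does not occur
console.log(bigramMatch(doc, "東京都")); // true: both bigrams occur separately
```

The query bigrams 東京 and 京都 both appear in the document, but never adjacent to each other, so a lookup that ignores positions reports a match that an exact substring search rejects.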
Performance
By selecting the appropriate index type, typical search times can be kept within a few milliseconds. Search performance for a 4MB dataset (worst case scenario, slowest index type, approximately 100 articles):
- Exact Match: < 5ms
- Fuzzy Search: < 150ms
- Index Generation: ~1sec
  - ~30sec for HybridTrieBigramInvertedIndex
For detailed benchmarks across different hardware configurations and index types, see the Benchmarks section of the official website.
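To check how these figures compare with your own content and hardware, a small timing wrapper may be enough. This is a sketch, not the project's benchmark harness: the timed helper is hypothetical, and the workload shown is a stand-in to be replaced with your own call, such as () => search(index, "search word").

```javascript
// Hypothetical helper: run a (possibly async) function and report elapsed milliseconds.
async function timed(fn) {
  const start = performance.now();
  const value = await fn();
  return { value, ms: performance.now() - start };
}

// Stand-in workload; swap in your own index creation or search call.
timed(() => [3, 1, 2].sort()).then(({ value, ms }) => {
  console.log(value, `${ms.toFixed(2)} ms`);
});
```

A single run is noisy, so repeating the measurement several times and comparing the median gives a steadier picture.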
Integration with Static Site Generators
Example implementations are available for:
