@thaon/strapi-plugin-semantic-search

v1.0.6

A Strapi plugin that adds semantic search capabilities using OpenRouter embeddings

Strapi Plugin Semantic Search

A Strapi plugin that adds semantic search capabilities using OpenRouter embeddings. This plugin enables intelligent content search that understands meaning and context, not just keyword matching.

Features

  • Semantic Search: Search content using embeddings for better relevance
  • Document Chunking: Automatically chunks large documents for optimal search results
  • Similarity Threshold: Configurable similarity threshold for search results
  • OpenRouter Integration: Uses OpenRouter's embedding models
  • Strapi Compatible: Works with Strapi v4.x and v5.x

Installation

Prerequisites

  • Node.js >= 20
  • Strapi v4.x or v5.x
  • OpenRouter API key

Install the Plugin

# Using npm
npm install @thaon/strapi-plugin-semantic-search

# Using yarn
yarn add @thaon/strapi-plugin-semantic-search

Configure Environment Variables

Add the following environment variables to your .env file:

OPENROUTER_API_KEY=your_openrouter_api_key_here
OPENROUTER_MODEL=openai/text-embedding-3-small
SITE_URL=http://localhost:1337
SITE_NAME=YourSiteName

Enable the Plugin

Add the plugin to your config/plugins.js file:

module.exports = () => ({
  "semantic-search": {
    enabled: true,
    resolve: "@thaon/strapi-plugin-semantic-search",
  },
});

Usage

The plugin provides two main services: indexing and searching. Both must be called manually through your application code or API endpoints.

1. Index Documents

Before searching, you must manually index your documents using the indexer service.

Single Field Indexing

Index a specific field from a document:

// In your controller or service
const indexer = strapi.plugin("semantic-search").service("indexer");

await indexer.indexDocument(
  contentType, // e.g., "api::doc.doc"
  documentId, // Document ID to index
  field, // Field name to index (e.g., "content")
  titleField, // Optional: field to use as title (default: "title")
  ownerId // Required: user ID for ownership filtering
);

Parameters:

  • contentType (string): The content type UID (e.g., api::doc.doc)
  • documentId (string): The document ID to index
  • field (string): The field containing text to index (e.g., content)
  • titleField (string, optional): Field to use as document title reference (default: title)
  • ownerId (number): User ID for ownership-based filtering

Response:

{
  "success": true,
  "documentId": "123",
  "contentType": "api::doc.doc",
  "chunksCreated": 5
}

Multi-Field Indexing

Index multiple fields from a document:

const indexer = strapi.plugin("semantic-search").service("indexer");

await indexer.indexDocumentFields(
  contentType, // e.g., "api::doc.doc"
  documentId, // Document ID to index
  fields, // Array of field names (e.g., ["title", "content"])
  titleField, // Optional: field to use as title
  ownerId // Required: user ID for ownership filtering
);

Parameters:

  • contentType (string): The content type UID
  • documentId (string): The document ID to index
  • fields (string[]): Array of field names to combine and index
  • titleField (string, optional): Field to use as document title (default: title)
  • ownerId (number): User ID for ownership-based filtering
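
Because indexing is manual, you may want to index several documents in one pass. The sketch below is one possible approach, not part of the plugin: `indexMany` is a hypothetical helper, and the `{ documentId, ownerId }` shape of each entry is an assumption made for the example. Only `indexDocumentFields` and its argument order come from this README. Documents are processed one at a time so that a single failure does not abort the rest.

```javascript
// Hypothetical helper (not provided by the plugin): indexes a list of
// documents sequentially and reports which ones succeeded or failed.
async function indexMany(strapi, contentType, documents, fields, titleField = "title") {
  const indexer = strapi.plugin("semantic-search").service("indexer");
  const summary = { indexed: [], failed: [] };

  for (const { documentId, ownerId } of documents) {
    try {
      // Argument order follows the indexDocumentFields signature shown above.
      await indexer.indexDocumentFields(contentType, documentId, fields, titleField, ownerId);
      summary.indexed.push(documentId);
    } catch (error) {
      // Record the failure and continue with the remaining documents.
      summary.failed.push({ documentId, message: error.message });
    }
  }

  return summary;
}
```

A caller could then log `summary.failed` and retry only those entries.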

2. Perform Semantic Search

After indexing, search indexed documents using the search service.

const searchService = strapi.plugin("semantic-search").service("search");

const results = await searchService.querySearch(query, options);

Parameters:

  • query (string): The search query
  • options (object):
    • ownerId (number, required): User ID to filter results by ownership
    • limit (number, optional): Maximum results to return (default: 5)
    • threshold (number, optional): Similarity threshold 0-1 (default: 0.5)
    • contentType (string, optional): Filter by specific content type

Response:

[
  {
    "documentId": "123",
    "title": "Document Title",
    "textSnippet": "Relevant excerpt from the document...",
    "fullContent": "Complete document content...",
    "contentType": "api::doc.doc",
    "score": 0.85
  }
]

3. API Endpoints Example

If you expose these services via API endpoints:

# Index a document
POST /api/semantic-search/index
Content-Type: application/json
{
  "contentType": "api::doc.doc",
  "documentId": "123",
  "field": "content",
  "ownerId": 1
}

# Search indexed documents
GET /api/semantic-search/search?query=your+search+query&limit=10&threshold=0.5
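
One way to back the search endpoint above is a small custom controller. The sketch below is only an assumption about how such a handler could look: `createSearchController` is a hypothetical factory, and reading `ownerId` from `ctx.state.user` assumes the request is authenticated so that this object is populated. Only `querySearch` and its options come from this README.

```javascript
// Hypothetical controller factory (not shipped with the plugin).
function createSearchController(strapi) {
  return {
    async search(ctx) {
      const { query, limit, threshold, contentType } = ctx.query;
      const user = ctx.state && ctx.state.user;

      // ownerId is required by querySearch, so reject anonymous requests.
      if (!user) {
        ctx.status = 401;
        ctx.body = { error: "Authentication required" };
        return;
      }
      if (!query) {
        ctx.status = 400;
        ctx.body = { error: "Missing query parameter" };
        return;
      }

      // Query-string values arrive as strings; convert the numeric ones.
      const options = { ownerId: user.id };
      if (limit !== undefined) options.limit = Number(limit);
      if (threshold !== undefined) options.threshold = Number(threshold);
      if (contentType) options.contentType = contentType;

      const searchService = strapi.plugin("semantic-search").service("search");
      ctx.body = await searchService.querySearch(query, options);
    },
  };
}
```

Deriving `ownerId` from the authenticated user rather than from the query string keeps callers from searching another user's documents.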

Configuration

The plugin can be configured through environment variables:

| Variable | Description | Default |
| --- | --- | --- |
| OPENROUTER_API_KEY | Your OpenRouter API key | Required |
| OPENROUTER_MODEL | Embedding model to use | openai/text-embedding-3-small |
| SITE_URL | Your Strapi site URL | http://localhost:1337 |
| SITE_NAME | Your site name | StrapiSemanticSearch |
| CHUNK_SIZE | Document chunk size | 1000 |
| CHUNK_OVERLAP | Chunk overlap size | 150 |
| SIMILARITY_THRESHOLD | Default similarity threshold | 0.5 |
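
As a rough illustration of how these variables and their defaults fit together, the sketch below resolves them from an environment object. The `loadConfig` helper and the property names it returns are hypothetical and not part of the plugin's API; only the variable names and default values come from the table above.

```javascript
// Hypothetical helper (not part of the plugin): resolves the settings from
// the configuration table, falling back to the documented defaults.
function loadConfig(env = process.env) {
  if (!env.OPENROUTER_API_KEY) {
    throw new Error("OPENROUTER_API_KEY is required");
  }

  // Parse a numeric variable, using the fallback when it is unset or invalid.
  const toNumber = (value, fallback) => {
    const parsed = Number(value);
    return value !== undefined && value !== "" && Number.isFinite(parsed)
      ? parsed
      : fallback;
  };

  return {
    apiKey: env.OPENROUTER_API_KEY,
    model: env.OPENROUTER_MODEL || "openai/text-embedding-3-small",
    siteUrl: env.SITE_URL || "http://localhost:1337",
    siteName: env.SITE_NAME || "StrapiSemanticSearch",
    chunkSize: toNumber(env.CHUNK_SIZE, 1000),
    chunkOverlap: toNumber(env.CHUNK_OVERLAP, 150),
    similarityThreshold: toNumber(env.SIMILARITY_THRESHOLD, 0.5),
  };
}
```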

How It Works

Indexing Process

The plugin provides two main indexing functions:

  1. Document Indexing (indexDocument):

    • Retrieves a specific document by content type and ID
    • Extracts text content from specified field (e.g., 'content')
    • Splits text into chunks using configurable chunk size and overlap
    • Generates vector embeddings for each chunk using OpenRouter
    • Stores chunks with metadata in plugin::semantic-search.chunk table
    • Links chunks to parent document with ownership filtering
  2. Multi-Field Indexing (indexDocumentFields):

    • Combines multiple fields from a document into a single text string
    • Processes the combined text through the same chunking and embedding pipeline
    • Useful for indexing title, content, description, etc. together
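
The chunking step can be pictured as a sliding window over the text. The sketch below is a minimal character-based illustration, assuming `CHUNK_SIZE` and `CHUNK_OVERLAP` are measured in characters. This README does not state the unit or the plugin's actual splitting strategy, and `chunkText` is a hypothetical name.

```javascript
// Hypothetical sketch: split text into fixed-size windows where each window
// repeats the last `chunkOverlap` characters of the previous one.
function chunkText(text, chunkSize = 1000, chunkOverlap = 150) {
  if (chunkSize <= 0 || chunkOverlap < 0 || chunkOverlap >= chunkSize) {
    throw new RangeError("Require chunkSize > 0 and 0 <= chunkOverlap < chunkSize");
  }

  const chunks = [];
  const step = chunkSize - chunkOverlap; // how far the window advances each time

  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }

  return chunks;
}
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of embedding some text twice.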

Search Process

The querySearch function performs semantic search using these steps:

  1. Query Vectorization: Converts the search query into a vector embedding
  2. Chunk Retrieval: Fetches stored chunks filtered by owner ID and optionally by content type
  3. Similarity Calculation: Computes cosine similarity between query vector and all chunk embeddings
  4. Threshold Filtering: Removes results below the similarity threshold (default: 0.5)
  5. Deduplication: Groups results by document ID, keeping the highest-scoring chunk per document
  6. Full Content Retrieval: For each unique document, fetches the full document content from api::doc.doc table
  7. Ranking: Returns results sorted by similarity score with configurable limit
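
The similarity, filtering, deduplication, and ranking steps can be sketched with plain arrays. This is an illustrative approximation, not the plugin's source: `cosineSimilarity` and `rankChunks` are hypothetical names, and the chunk shape (`documentId`, `embedding`) is assumed for the example. The defaults mirror the documented `querySearch` options.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Hypothetical sketch: score every chunk, drop those below the threshold,
// keep the best chunk per document, then sort by score and apply the limit.
function rankChunks(queryEmbedding, chunks, { threshold = 0.5, limit = 5 } = {}) {
  const bestByDocument = new Map();

  for (const chunk of chunks) {
    const score = cosineSimilarity(queryEmbedding, chunk.embedding);
    if (score < threshold) continue;

    const best = bestByDocument.get(chunk.documentId);
    if (!best || score > best.score) {
      bestByDocument.set(chunk.documentId, { documentId: chunk.documentId, score });
    }
  }

  return [...bestByDocument.values()]
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}
```

Comparing the query against every stored chunk is linear in the number of chunks, which is why filtering by owner and content type before scoring keeps the candidate set small.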

Data Flow

  • Input: User search query + required owner ID + optional filters (content type, limit, threshold)
  • Processing: Vector embedding → cosine similarity → threshold filtering → deduplication
  • Output: Array of documents with full content, titles, snippets, and similarity scores

Example Response

{
  "data": [
    {
      "id": 1,
      "attributes": {
        "title": "Article Title",
        "content": "Article content...",
        "similarity": 0.85
      }
    }
  ],
  "meta": {
    "total": 1,
    "threshold": 0.7
  }
}

Development

Local Development

# Clone the repository
git clone https://github.com/thaon/strapi-plugin-semantic-search.git

# Install dependencies
cd strapi-plugin-semantic-search
npm install

# Link for local development
npm link

Testing

# Run tests
npm test

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Support

If you encounter any issues or have questions:

  1. Check the Issues page
  2. Create a new issue with detailed information
  3. Include your Strapi version and plugin version

Changelog

v1.0.0

  • Initial release
  • Basic semantic search functionality
  • OpenRouter integration
  • Configurable chunking and similarity threshold