ozymandias_osiris

v1.0.15

Vector search and document processing library

Osiris - Document Ingestion Pipeline

Osiris is a document ingestion pipeline that converts JSON documents into vector embeddings using OpenAI's API and stores them in a Qdrant vector database. It is built to work with the Ibis chat application.

Prerequisites

  • Node.js (v18 or higher)
  • npm or yarn
  • A Qdrant instance (local or cloud)
  • OpenAI API key

Installation

  1. Clone the repository:

    git clone <repository-url>
    cd osiris
    
  2. Install dependencies:

    npm install
  3. Create a .env file in the root directory:

    # OpenAI API key for embeddings generation
    OPENAI_API_KEY=your_openai_api_key

    # Qdrant settings
    QDRANT_URL=your_qdrant_url # e.g., http://localhost:6333 or your cloud URL
    QDRANT_API_KEY=your_qdrant_api_key # Optional for local, required for cloud
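
As a rough illustration of how these variables might be checked before the pipeline starts, the sketch below validates an environment map. The function name loadConfig and the OsirisConfig shape are hypothetical and not taken from the Osiris source; only the three variable names come from the .env example above.

```typescript
// Hypothetical config shape; only the variable names come from the .env example.
interface OsirisConfig {
  openaiApiKey: string;
  qdrantUrl: string;
  qdrantApiKey: string | undefined; // optional for a local Qdrant instance
}

// Validate an environment map and fail fast when a required variable is missing.
function loadConfig(env: Record<string, string | undefined>): OsirisConfig {
  const required = ["OPENAI_API_KEY", "QDRANT_URL"];
  const missing = required.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    openaiApiKey: env["OPENAI_API_KEY"] as string,
    qdrantUrl: env["QDRANT_URL"] as string,
    // An empty string is treated the same as "not set".
    qdrantApiKey: env["QDRANT_API_KEY"] || undefined,
  };
}
```

In a Node.js process this could be called as loadConfig(process.env) once the .env file has been loaded.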

Preparing Your Data

Create JSON files containing your documents. Each JSON file should follow this structure:

{
  "title": "Document Title",
  "content": "Document content goes here...",
  "metadata": {
    "source": "optional source information",
    "author": "optional author information",
    "date": "optional date information"
  }
}

Store your JSON files in a directory (e.g., ./data).
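
The structure above can be expressed as a type with a runtime check. This is a minimal sketch, not Osiris's actual validator: the names IngestDocument and isIngestDocument are hypothetical, and treating the whole metadata object as optional and rejecting empty titles or content are assumptions.

```typescript
// Field names come from the JSON structure above; the types are assumptions.
interface DocumentMetadata {
  source?: string;
  author?: string;
  date?: string;
}

interface IngestDocument {
  title: string;
  content: string;
  metadata?: DocumentMetadata;
}

// Runtime check for a value parsed from a JSON file.
function isIngestDocument(value: unknown): value is IngestDocument {
  if (typeof value !== "object" || value === null) return false;
  const doc = value as Record<string, unknown>;
  const title = doc["title"];
  const content = doc["content"];
  const metadata = doc["metadata"];
  if (typeof title !== "string" || title.trim() === "") return false;
  if (typeof content !== "string" || content.trim() === "") return false;
  if (metadata === undefined) return true; // assumption: metadata may be omitted
  if (typeof metadata !== "object" || metadata === null || Array.isArray(metadata)) {
    return false;
  }
  const meta = metadata as Record<string, unknown>;
  // Every known metadata field must be a string when present.
  return ["source", "author", "date"].every(
    (key) => meta[key] === undefined || typeof meta[key] === "string",
  );
}
```

A caller would typically run JSON.parse on each file's contents and skip or report any document for which the check returns false.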

Usage

  1. Build the project:

    npm run build
  2. Run the ingestion pipeline:

    node dist/index.js ingest <directory> [options]

Options:

  • --collection - Collection name (default: 'website_content')
  • --batch-size - Batch size for processing (default: 100)
  • --max-retries - Max retries for failed operations (default: 3)
  • --max-concurrent - Max concurrent operations (default: 5)
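
To make the batching and concurrency options more concrete, here is a sketch of what a batch size and a concurrency limit usually mean in a pipeline like this. It is an illustration only: toBatches and mapWithLimit are hypothetical helpers, and how Osiris actually applies --batch-size and --max-concurrent is not described in this README.

```typescript
// Split items into consecutive batches of at most `size` elements.
function toBatches<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Run `worker` over all inputs with at most `limit` calls in flight at once.
// Results are returned in input order.
async function mapWithLimit<T, R>(
  inputs: T[],
  limit: number,
  worker: (input: T) => Promise<R>,
): Promise<R[]> {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new Error("limit must be a positive integer");
  }
  const results: R[] = new Array(inputs.length);
  let next = 0;
  // Each lane pulls the next unprocessed index until none remain.
  async function lane(): Promise<void> {
    while (next < inputs.length) {
      const index = next++;
      results[index] = await worker(inputs[index] as T);
    }
  }
  const laneCount = Math.min(limit, inputs.length);
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

With the defaults listed above, a run would form batches of up to 100 items and keep at most 5 operations in flight, assuming the options map onto helpers like these.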

Example

node dist/index.js ingest ./data --collection documents --batch-size 50

Alternatively, add this function to your .zshrc:

# Osiris data ingestion function
osiris() {
  # Show help if no arguments provided
  if [ -z "$1" ] || [ "$1" = "-h" ] || [ "$1" = "--help" ]; then
    echo "Usage: osiris <command> [options]"
    echo ""
    echo "Commands:"
    echo "  ingest <directory> -c <collection> -g <group-id>  # Ingest content"
    echo "  health                                           # Check system health"
    echo "  clean <collection>                               # Clean collection"
    echo "  delete-by-group -c <collection> -g <group-id>    # Delete by group"
    echo ""
    echo "Examples:"
    echo "  osiris ingest ./content -c my-collection -g client1"
    echo "  osiris health"
    echo "  osiris clean my-collection"
    echo "  osiris delete-by-group -c my-collection -g client1"
    return 1
  fi

  # Path to your local Osiris checkout (adjust for your machine)
  local OSIRIS_PATH="/users/ivan/sites/ozymandias/osiris"

  # If the first argument is not a known command, treat it as a path and insert 'ingest'
  if [[ "$1" != "ingest" && "$1" != "health" && "$1" != "clean" && "$1" != "delete-by-group" ]]; then
    set -- "ingest" "$@"
  fi

  # Run the command using tsx instead of node. The subshell keeps the caller's
  # working directory unchanged; relative paths are resolved from OSIRIS_PATH.
  (cd "$OSIRIS_PATH" && npx tsx src/index.ts "$@")
}

Then reload your shell (for example with source ~/.zshrc) and run:

# Ingest content
osiris ./content -c collection-name -g client1

# Check health
osiris health

# Clean collection
osiris clean my-collection

# Delete by group
osiris delete-by-group -c my-collection -g client1

Features

  • Content Validation: Validates JSON files and their content structure
  • Text Chunking: Intelligently splits documents into appropriate chunks
  • Embedding Generation: Generates embeddings using OpenAI's API
  • Vector Storage: Stores embeddings in Qdrant vector database
  • Progress Tracking: Shows real-time progress and statistics
  • Error Handling: Robust error handling with retries
  • Concurrent Processing: Efficient parallel processing of documents
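
The README does not say how Osiris chooses chunk boundaries. As a point of reference, the sketch below shows one common baseline: fixed-size chunks measured in characters, with a small overlap so that text cut at a boundary appears in both neighbouring chunks. The function chunkText and its parameters are hypothetical and are not Osiris's algorithm.

```typescript
// Fixed-size character chunking with overlap (a baseline technique, assumed
// here for illustration; Osiris's real chunking strategy is not documented).
function chunkText(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize < 1 || overlap < 0 || overlap >= chunkSize) {
    throw new Error("require chunkSize >= 1 and 0 <= overlap < chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far the window advances each time
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once the current window has reached the end of the text.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

For example, chunkText("abcdefghij", 4, 1) yields "abcd", "defg" and "ghij": each chunk repeats the last character of the previous one.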

Monitoring

The ingestion process provides real-time feedback:

  • Progress of file processing
  • Number of chunks generated
  • Embedding generation progress
  • Success/failure statistics

Error Handling

Errors are logged with detailed information. Failed operations are automatically retried based on the --max-retries setting.
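
A retry policy of this kind can be sketched as follows. This is an assumption-laden illustration rather than Osiris's implementation: withRetries is a hypothetical helper, the exponential backoff is an added design choice the README does not mention, and "max retries" is read here as the number of extra attempts after the first failure.

```typescript
// Retry an async operation up to `maxRetries` additional times.
// The delay doubles after each failed attempt (assumed backoff strategy).
async function withRetries<T>(
  operation: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt === maxRetries) break; // no retries left
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
  }
  // Surface the most recent failure to the caller.
  throw lastError;
}
```

Wrapping only the network-bound steps (embedding requests, Qdrant writes) in a helper like this keeps validation errors, which retrying cannot fix, from being retried needlessly.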

Integration with Ibis

Osiris is designed to work with the Ibis chat application. Make sure to:

  • Use the same Qdrant instance in both applications
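  • Pass --collection with the name Ibis is configured to use (Ibis's default: 'documents'). Osiris's own default is 'website_content', so the two only match if you set the name explicitly.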

Development

Run tests:

npm run test

Watch mode:

npm run test:watch

Generate coverage report:

npm run coverage