ship_specific_manuals

v1.2.0

Published

8 months ago

Model Context Protocol (MCP) server for Typesense with image support and compression


uhbarp-ayis

mcp typesense search model-context-protocol semantic-search vector-search image-compression claude ai

ship_specific_manuals

A Model Context Protocol (MCP) server for ship-specific manuals stored in Typesense with built-in image support and automatic compression. This server enables AI assistants like Claude to perform semantic search, hybrid search, and retrieve images from ship manual collections.

Features

🔍 Semantic/Vector Search - Natural language queries using embeddings
🔀 Hybrid Search - Combines text and vector search for best results
🖼️ Image Support - Automatic fetching and base64 encoding of images
📦 Image Compression - Auto-compresses images to max 150KB using Sharp with smart PNG-to-JPEG conversion
🔗 Document Links - Navigation tools return document links for easy access
📏 1MB Response Limit - Automatically fits maximum content within MCP protocol limits
🎯 Filtering - Filter by document, chapter, section, breadcrumb path, and more
📊 Document Structure - Get hierarchical organization of documents
⚡ Text-Only Search - Fast search workflow without images for better performance
⚙️ Flexible Configuration - JSON config file or environment variables

Installation

Via NPM (Recommended)

npm install -g ship_specific_manuals

From Source

git clone https://github.com/yourusername/mcp-typesense-server.git
cd mcp-typesense-server
npm install
npm link

Configuration

Method 1: MCP Client with npx (Recommended)

Configure environment variables directly in your MCP client config using npx to run the latest version automatically.

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "ship_specific_manuals": {
      "command": "npx",
      "args": [
        "-y",
        "ship_specific_manuals@latest"
      ],
      "env": {
        "TYPESENSE_HOST": "your-typesense-host.com",
        "TYPESENSE_PORT": "443",
        "TYPESENSE_PROTOCOL": "https",
        "TYPESENSE_API_KEY": "your-api-key-here",
        "TYPESENSE_COLLECTION": "Ship_Manuals"
      }
    }
  }
}

Benefits:

✅ No global installation needed
✅ Always uses the latest version with @latest
✅ -y flag auto-confirms installation

Method 2: Configuration File (Alternative)

Create a config.json file in the same directory as the server:

{
  "typesense": {
    "host": "your-typesense-host.com",
    "port": "443",
    "protocol": "https",
    "apiKey": "your-api-key-here",
    "collection": "Ship_Manuals"
  },
  "documents": [
    {
      "name": "Machinery Manual",
      "header": "MACHINERY OUTFITTING PART",
      "count": 150
    }
  ]
}

Note: Environment variables set in the MCP client config take priority over values in config.json.
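
The precedence rule in the note can be pictured with a small sketch. This is not the server's actual code: resolveTypesenseConfig is a hypothetical helper, and treating an empty environment variable as "not set" is an assumption made for illustration.

```javascript
// Hypothetical helper: merge config.json values with environment variables.
// Environment variables win, matching the note above.
function resolveTypesenseConfig(fileConfig, env) {
  const file = (fileConfig && fileConfig.typesense) || {};
  // Assumption: an empty string in the environment counts as "not set".
  const pick = (envName, fileKey) =>
    env[envName] !== undefined && env[envName] !== "" ? env[envName] : file[fileKey];

  const resolved = {
    host: pick("TYPESENSE_HOST", "host"),
    port: pick("TYPESENSE_PORT", "port"),
    protocol: pick("TYPESENSE_PROTOCOL", "protocol"),
    apiKey: pick("TYPESENSE_API_KEY", "apiKey"),
    collection: pick("TYPESENSE_COLLECTION", "collection"),
  };

  // Report anything still missing so a misconfiguration fails early.
  const missing = Object.keys(resolved).filter((key) => resolved[key] === undefined);
  if (missing.length > 0) {
    throw new Error("Missing Typesense settings: " + missing.join(", "));
  }
  return resolved;
}

// Example: the host comes from the environment, everything else from config.json.
const exampleConfig = resolveTypesenseConfig(
  {
    typesense: {
      host: "file-host.example",
      port: "443",
      protocol: "https",
      apiKey: "file-key",
      collection: "Ship_Manuals",
    },
  },
  { TYPESENSE_HOST: "env-host.example" }
);
console.log(exampleConfig.host); // env-host.example
```

In a real process the second argument would be process.env.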

Quick Start with Claude Desktop

1. Add the configuration from Method 1 to your claude_desktop_config.json
2. Replace the placeholder values with your actual Typesense credentials
3. Save the file
4. Restart Claude Desktop
5. The ship manuals tools become available automatically

You can verify by asking Claude: "What MCP tools are available?"

Example Configuration:

{
  "mcpServers": {
    "ship_specific_manuals": {
      "command": "npx",
      "args": ["-y", "ship_specific_manuals@latest"],
      "env": {
        "TYPESENSE_HOST": "j51ouydaces0i2m7p-1.a1.typesense.net",
        "TYPESENSE_PORT": "443",
        "TYPESENSE_PROTOCOL": "https",
        "TYPESENSE_API_KEY": "your-actual-key-here",
        "TYPESENSE_COLLECTION": "Ship_Manuals"
      }
    }
  }
}

Available Tools

Search Tools

1. semantic_search

Perform semantic/vector search using natural language queries. Returns text content and images for matching chunks. Images are automatically compressed to max 150KB. Response automatically fits within 1MB limit by including as many complete results as possible.

Parameters:

query (required): Natural language search query
limit (optional): Number of results (default: 5)
breadcrumb_path (optional): Filter by breadcrumb hierarchy path
document_name (optional): Filter by specific document name
document_header (optional): Filter by specific document header
chapter (optional): Filter by chapter
section (optional): Filter by section

Example:

Search for "cargo tank valve maintenance procedures"
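
An MCP client such as Claude Desktop turns a prompt like this into a tools/call request on your behalf. The sketch below shows roughly what that request could look like. The envelope follows the JSON-RPC 2.0 shape that MCP uses for tool calls; buildToolCall is a hypothetical helper, and the chapter value is illustrative.

```javascript
// Build a JSON-RPC 2.0 "tools/call" request, the message MCP clients send to invoke a tool.
function buildToolCall(id, toolName, args) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: toolName, arguments: args },
  };
}

const semanticSearchRequest = buildToolCall(1, "semantic_search", {
  query: "cargo tank valve maintenance procedures",
  limit: 5,
  chapter: "7. MAINTENANCE", // optional filter; illustrative value
});

console.log(JSON.stringify(semanticSearchRequest, null, 2));
```

You normally never write this by hand; it is mainly useful when testing the server directly over stdio.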

2. hybrid_search

Combines semantic and keyword search with filtering options. Returns text content and images for matching chunks. Images are automatically compressed to max 150KB. Response automatically fits within 1MB limit by including as many complete results as possible.

Parameters:

query (required): Search query
limit (optional): Number of results (default: 5)
breadcrumb_path (optional): Filter by breadcrumb hierarchy path
document_name (optional): Filter by specific document name
document_header (optional): Filter by document header
chapter (optional): Filter by chapter
section (optional): Filter by section
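
The filter parameters most likely end up in a Typesense filter_by expression. The sketch below is one plausible way to build it, not the server's confirmed implementation: the mapping from parameter names to the field names documentName, documentHeader, chapter, and section is an assumption, and breadcrumb_path is left out because its field name is not documented here.

```javascript
// Assumed mapping from tool parameters to collection fields (not confirmed by the README).
const FILTER_FIELDS = {
  document_name: "documentName",
  document_header: "documentHeader",
  chapter: "chapter",
  section: "section",
};

// Hypothetical helper: combine the provided filters into one filter_by string.
function buildFilterBy(params) {
  const clauses = [];
  for (const [param, field] of Object.entries(FILTER_FIELDS)) {
    const value = params[param];
    if (typeof value !== "string" || value === "") continue;
    // Backticks make Typesense read commas, parentheses, etc. as literal text.
    // Simplification: backticks inside the value itself are dropped.
    const escaped = value.replace(/`/g, "");
    clauses.push(field + ":=`" + escaped + "`");
  }
  // "&&" means every clause must match.
  return clauses.join(" && ");
}

console.log(
  buildFilterBy({
    document_name: "Operation and Maintenance Manual",
    chapter: "7. MAINTENANCE",
  })
);
// documentName:=`Operation and Maintenance Manual` && chapter:=`7. MAINTENANCE`
```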

3. search_text_only

Fast text-based search without images. Returns more results without hitting 1MB limit. Use this first to find relevant chunks, then use get_chunk_images to fetch specific images.

Parameters:

query (required): Search query
search_type (optional): 'semantic' or 'hybrid' (default: hybrid)
limit (optional): Number of results (default: 10)
breadcrumb_path (optional): Filter by breadcrumb path
document_name (optional): Filter by document name
document_header (optional): Filter by document header
chapter (optional): Filter by chapter
section (optional): Filter by section

Returns: Text results with sourceId for each chunk

4. get_chunk_images

Fetch compressed images for specific chunks identified from search_text_only results. Each image compressed to max 150KB.

Parameters:

source_ids (required): Array of sourceId values from search results (max 5)

Example workflow:

1. Use search_text_only to find relevant content (returns 10-20 results with sourceId)
2. Review text results and identify most relevant chunks
3. Use get_chunk_images with selected sourceIds to fetch images
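
Step 2 of this workflow can be sketched as a small selection helper that respects the five-ID cap of get_chunk_images. The result shape (sourceId plus a text field) and the relevance test are assumptions made for illustration.

```javascript
const MAX_SOURCE_IDS = 5; // cap stated for get_chunk_images

// Hypothetical helper: pick the sourceIds to pass to get_chunk_images.
// Keeps the search order, skips duplicates, and stops at the cap.
function selectSourceIds(results, isRelevant) {
  const seen = new Set();
  const ids = [];
  for (const result of results) {
    if (!isRelevant(result) || seen.has(result.sourceId)) continue;
    seen.add(result.sourceId);
    ids.push(result.sourceId);
    if (ids.length === MAX_SOURCE_IDS) break;
  }
  return ids;
}

// Made-up search_text_only results; only sourceId is documented, text is assumed.
const textResults = [
  { sourceId: "chunk-001", text: "Cargo tank valve maintenance schedule" },
  { sourceId: "chunk-002", text: "Ballast pump overview" },
  { sourceId: "chunk-003", text: "Valve seat inspection steps" },
  { sourceId: "chunk-001", text: "Cargo tank valve maintenance schedule" },
];

const selectedIds = selectSourceIds(textResults, (r) => r.text.toLowerCase().includes("valve"));
console.log(selectedIds); // [ 'chunk-001', 'chunk-003' ]
```

The returned array is what you would pass as source_ids.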

5. filter_by_document

Filter and retrieve document chunks by documentName, chapter, and section. Returns actual content chunks with text and metadata.

Parameters:

document_name (optional): Filter by document name
chapter (optional): Filter by chapter
section (optional): Filter by section
limit (optional): Results per page (default: 5)
page (optional): Page number (default: 1)

Navigation Tools

Recommended workflow for large collections (28K+ documents):

1. Start with get_collection_tree to get an overview of the hierarchy

   → Shows: ["Ship Manuals" → "Globe Polaris" → "INSTRUCTION BOOKS" → ...]

2. Navigate with browse_collection to explore a specific path

   breadcrumb_path: ["Ship Manuals", "Globe Polaris", "INSTRUCTION BOOKS;S.NO.3131"]
   → Shows: next breadcrumb levels + documents at exact level (with names and links)

3. Pick a document with list_document_names, or navigate deeper

   → Get: document_name and documentLink

4. List chapters with list_document_chapters

   document_name: "Operation and Maintenance Manual"
   → Shows: All unique chapters in this document

5. List sections with list_chapter_sections

   document_name: "Operation and Maintenance Manual"
   chapter: "7. MAINTENANCE"
   → Shows: All unique sections in this chapter

6. Search for content with semantic_search or hybrid_search, using filters

   query: "valve maintenance procedure"
   document_name: "Operation and Maintenance Manual"
   chapter: "7. MAINTENANCE"
   section: "7.1 Preventive Maintenance"
   → Returns: Actual content chunks with images (automatically fits within 1MB limit)

Alternative: Text-only search workflow

1. Use search_text_only to get 10-20 text results with sourceId
2. Review results and select most relevant chunks
3. Use get_chunk_images to fetch images for specific sourceIds

This hierarchical workflow prevents overwhelming the LLM with thousands of chunks!

6. list_document_names

List unique document names with their document links, optionally filtered by breadcrumb path. For large collections (thousands of documents), always filter by breadcrumb first to avoid overwhelming results.

Parameters:

breadcrumb_path (optional): Array of breadcrumb elements to filter by
limit (optional): Maximum number of unique document names (default: returns all, up to 50000)

Example:

Filter by breadcrumb: ["Ship Manuals", "Globe Polaris", "INSTRUCTION BOOKS;S.NO.3131", "MACHINERY OUTFITTING PART;S.NO.3131"]
This shows only documentNames under that specific path.

Example Response:

{
  "breadcrumb_filter": ["Ship Manuals", "Globe Polaris"],
  "total_unique_documents": 5,
  "documents": [
    {
      "name": "Operation and Maintenance Manual",
      "documentLink": "https://example.com/manual.pdf",
      "count": 125
    },
    {
      "name": "Safety Manual",
      "documentLink": "https://example.com/safety.pdf",
      "count": 45
    }
  ]
}

7. list_document_chapters

For a given documentName, list all unique chapters. Returns all chapters by default (up to 50,000).

Parameters:

document_name (required): The document name to get chapters for
limit (optional): Limit number of results (default: returns all, up to 50000)

Example:

{
  "document_name": "Operation and Maintenance Manual",
  "total_chapters": 15,
  "chapters": [
    {"chapter": "Chapter 1: Safety Procedures", "count": 45},
    {"chapter": "Chapter 2: Engine Maintenance", "count": 67}
  ]
}

8. list_chapter_sections

For a given documentName and chapter, list all unique sections within that chapter. Returns all sections by default (up to 50,000).

Parameters:

document_name (required): The document name
chapter (required): The chapter name to get sections for
limit (optional): Limit number of results (default: returns all, up to 50000)

Example:

{
  "document_name": "Operation and Maintenance Manual",
  "chapter": "7. MAINTENANCE",
  "total_sections": 8,
  "sections": [
    {"section": "7.1 Preventive Maintenance", "count": 12},
    {"section": "7.2 Corrective Maintenance", "count": 8}
  ]
}

9. browse_collection

Explore what's INSIDE a breadcrumb path - shows next-level breadcrumb elements and available documents with links. Does NOT return document chunks to avoid overwhelming the LLM context.

Parameters:

breadcrumb_path (required): Array of breadcrumb elements to explore
document_name (optional): If provided, shows documentHeaders for that specific document

Behavior:

Without document_name:

breadcrumb_path: ["Ship Manuals", "Globe Polaris", "INSTRUCTION BOOKS;S.NO.3131"]

Response:
{
  "next_breadcrumb_levels": ["MACHINERY OUTFITTING PART;S.NO.3131", "HULL OUTFITTING PART;S.NO.3131"],
  "documents_at_exact_level": [
    {
      "name": "Operation Manual",
      "documentLink": "https://example.com/manual.pdf"
    },
    {
      "name": "Safety Manual",
      "documentLink": "https://example.com/safety.pdf"
    }
  ],
  "total_unique_documents": 45
}

With document_name:

breadcrumb_path: ["Ship Manuals", "Globe Polaris", ...]
document_name: "Operation Manual"

Response:
{
  "document_headers": ["Chapter 1: Introduction", "Chapter 2: Safety", ...],
  "note": "Use semantic_search or hybrid_search with filters to get actual content"
}
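
The "without document_name" behavior can be modeled in memory as follows. This is only a sketch of the idea: the real server presumably asks Typesense rather than scanning chunks, and the chunk fields breadcrumb, documentName, and documentLink are assumed names.

```javascript
// Hypothetical in-memory model of browse_collection without document_name.
function browseCollection(chunks, path) {
  const nextLevels = new Set();
  const exactLevelDocs = new Map();
  const allDocs = new Set();

  for (const chunk of chunks) {
    const crumbs = chunk.breadcrumb;
    // A chunk is under the path if its breadcrumb starts with every path element.
    const underPath = path.every((part, i) => crumbs[i] === part);
    if (!underPath) continue;

    allDocs.add(chunk.documentName);
    if (crumbs.length > path.length) {
      // Deeper chunk: record only the next breadcrumb element.
      nextLevels.add(crumbs[path.length]);
    } else if (!exactLevelDocs.has(chunk.documentName)) {
      // Chunk sits exactly at this level: list its document once.
      exactLevelDocs.set(chunk.documentName, {
        name: chunk.documentName,
        documentLink: chunk.documentLink,
      });
    }
  }

  return {
    next_breadcrumb_levels: [...nextLevels],
    documents_at_exact_level: [...exactLevelDocs.values()],
    total_unique_documents: allDocs.size,
  };
}

const base = ["Ship Manuals", "Globe Polaris", "INSTRUCTION BOOKS;S.NO.3131"];
const sampleChunks = [
  { breadcrumb: base, documentName: "Operation Manual", documentLink: "https://example.com/manual.pdf" },
  { breadcrumb: base, documentName: "Operation Manual", documentLink: "https://example.com/manual.pdf" },
  { breadcrumb: [...base, "MACHINERY OUTFITTING PART;S.NO.3131"], documentName: "Pump Manual", documentLink: "https://example.com/pump.pdf" },
  { breadcrumb: ["Ship Manuals", "Other Ship"], documentName: "Other Manual", documentLink: "https://example.com/other.pdf" },
];

const browseResult = browseCollection(sampleChunks, base);
console.log(JSON.stringify(browseResult, null, 2));
```

Here total_unique_documents is interpreted as the number of distinct documents anywhere under the path, which would explain why it can exceed the number of documents listed at the exact level.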

10. get_collection_tree

Get the hierarchical folder tree structure for exploration, down to max_depth levels. Shows the navigation paths available.

Parameters:

document_name (optional): Filter by specific document name
max_depth (optional): Maximum depth of tree (default: 3)

Example Response:

{
  "tree": [
    {
      "name": "MACHINERY",
      "count": 450,
      "children": [
        {"name": "Chapter 1", "count": 120, "children": [...]}
      ]
    }
  ]
}
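
A tree of this shape can be built by walking each chunk's path and counting visits per node. The sketch below assumes each chunk contributes one path (an array of names) and that count is the number of chunks under a node; neither detail is confirmed by the README.

```javascript
// Hypothetical sketch: fold a list of paths into a counted tree, cut off at maxDepth.
function buildCollectionTree(paths, maxDepth = 3) {
  const root = { children: new Map() };

  for (const path of paths) {
    let node = root;
    // Levels deeper than maxDepth are ignored.
    for (const name of path.slice(0, maxDepth)) {
      if (!node.children.has(name)) {
        node.children.set(name, { name, count: 0, children: new Map() });
      }
      node = node.children.get(name);
      node.count += 1; // one more chunk passes through this node
    }
  }

  // Convert the Map-based tree into plain JSON-friendly objects.
  const toJson = (node) =>
    [...node.children.values()].map((child) => ({
      name: child.name,
      count: child.count,
      children: toJson(child),
    }));

  return { tree: toJson(root) };
}

const samplePaths = [
  ["MACHINERY", "Chapter 1"],
  ["MACHINERY", "Chapter 1"],
  ["MACHINERY", "Chapter 2"],
  ["HULL"],
];

console.log(JSON.stringify(buildCollectionTree(samplePaths), null, 2));
```

Lowering maxDepth trades detail for a smaller response, which matters for collections with deep hierarchies.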

Utility Tools

11. get_document_structure

Get hierarchical structure (chapters, sections) of a documentName. Shows nested chapter → sections structure in one view.

Parameters:

document_name (required): Document name to analyze

Example Response:

{
  "document_name": "Operation and Maintenance Manual",
  "structure": [
    {
      "chapter": "7. MAINTENANCE",
      "sections": ["7.1 Preventive", "7.2 Corrective"]
    }
  ]
}
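
The nested chapter → sections view amounts to grouping a document's chunks by chapter and collecting the distinct sections in each. A minimal sketch, assuming each chunk carries documentName, chapter, and section fields (documentName is not listed in the schema later in this README):

```javascript
// Hypothetical sketch: group one document's chunks into chapter -> unique sections.
function buildDocumentStructure(documentName, chunks) {
  const sectionsByChapter = new Map();

  for (const chunk of chunks) {
    if (chunk.documentName !== documentName) continue;
    if (!sectionsByChapter.has(chunk.chapter)) {
      sectionsByChapter.set(chunk.chapter, new Set());
    }
    // Chunks without a section still register their chapter.
    if (chunk.section) sectionsByChapter.get(chunk.chapter).add(chunk.section);
  }

  return {
    document_name: documentName,
    structure: [...sectionsByChapter].map(([chapter, sections]) => ({
      chapter,
      sections: [...sections],
    })),
  };
}

const manualChunks = [
  { documentName: "Operation and Maintenance Manual", chapter: "7. MAINTENANCE", section: "7.1 Preventive" },
  { documentName: "Operation and Maintenance Manual", chapter: "7. MAINTENANCE", section: "7.2 Corrective" },
  { documentName: "Operation and Maintenance Manual", chapter: "7. MAINTENANCE", section: "7.1 Preventive" },
  { documentName: "Safety Manual", chapter: "1. GENERAL", section: "1.1 Scope" },
];

const documentStructure = buildDocumentStructure("Operation and Maintenance Manual", manualChunks);
console.log(JSON.stringify(documentStructure, null, 2));
```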

12. get_collection_schema

Get the Typesense collection schema.

Image Handling

The server automatically:

- Fetches images from the URLs in the imageLink field
- Compresses every image to a hard limit of 150KB, so that multiple results fit within the 1MB response limit
- Includes as many complete results (with all their images) as fit within the MCP protocol's 1MB limit
- Applies a smart compression strategy:
  - Detects PNG images and converts them to JPEG immediately (PNG is lossless and rarely shrinks enough)
  - For other formats, tries to preserve the original format first
  - Progressively reduces dimensions and quality based on the size ratio
  - Uses mozjpeg for maximum JPEG compression efficiency
  - Falls back to aggressive JPEG compression if needed
- Maintains aspect ratio during resizing (max 1200px starting dimension)
- Encodes images as base64 for the MCP protocol
- Labels each image with context: every image is preceded by metadata showing which result it belongs to (Result #N, document name, chapter, section, page, document link)
- Writes detailed compression logs to stderr for debugging
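
To see how the 150KB image cap and the 1MB response limit interact, here is a rough budget calculation. It is a sketch under stated assumptions, not the server's code: fitResults and the imageSizes field are hypothetical, JSON envelope overhead is ignored, and packing stops at the first result that does not fit.

```javascript
const RESPONSE_LIMIT_BYTES = 1024 * 1024; // 1MB response limit

// Base64 turns every 3 raw bytes into 4 characters (with padding).
function base64Length(rawBytes) {
  return 4 * Math.ceil(rawBytes / 3);
}

// Hypothetical helper: keep whole results, in order, while they fit the budget.
function fitResults(results, limitBytes = RESPONSE_LIMIT_BYTES) {
  const included = [];
  let used = 0;
  for (const result of results) {
    const textBytes = Buffer.byteLength(result.text, "utf8");
    const imageBytes = result.imageSizes.reduce((sum, size) => sum + base64Length(size), 0);
    const cost = textBytes + imageBytes;
    // Stop at the first result that does not fit, so every included result is complete.
    if (used + cost > limitBytes) break;
    included.push(result);
    used += cost;
  }
  return { included, used };
}

// Seven results, each with a short text and one image at the 150KB cap.
const candidateResults = Array.from({ length: 7 }, (_, i) => ({
  text: "abc",
  imageSizes: [150 * 1024],
  rank: i + 1,
}));

const fitted = fitResults(candidateResults);
console.log(fitted.included.length, fitted.used); // 5 1024015
```

Under these assumptions a 150KB image costs 204,800 base64 characters, so about five single-image results fit in one response. Results with several images reduce that number quickly.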

Two-Step Workflow for Maximum Results:

1. Use search_text_only to get 10-20 text results without images (fast, and far less likely to reach the 1MB limit)
2. Use get_chunk_images to fetch images for the specific chunks identified in step 1

Development

# Clone the repository
git clone https://github.com/yourusername/mcp-typesense-server.git
cd mcp-typesense-server

# Install dependencies
npm install

# Run in development mode
npm run dev

# Start the server
npm start

Requirements

Node.js >= 18.0.0
Typesense server with vector search enabled
Collection with embedding field configured

Typesense Collection Schema

Your Typesense collection should include:

{
  "name": "Ship_Manuals",
  "fields": [
    {"name": "embedding", "type": "float[]", "num_dim": 1536},
    {"name": "embText", "type": "string"},
    {"name": "documentHeader", "type": "string", "facet": true},
    {"name": "chapter", "type": "string", "facet": true},
    {"name": "section", "type": "string", "facet": true},
    {"name": "imageLink", "type": "string[]", "optional": true},
    {"name": "pageNumber", "type": "int32"},
    {"name": "shortSummary", "type": "string", "optional": true}
  ]
}
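
Before indexing, it can help to check each chunk against this schema. The validator below is a minimal sketch that mirrors only the fields listed above; it assumes embeddings are computed client-side and sent with the document, and validateChunk is a hypothetical helper, not part of this package.

```javascript
const EMBEDDING_DIMENSIONS = 1536; // num_dim from the schema above

// Hypothetical pre-indexing check; returns a list of problems (empty means valid).
function validateChunk(doc) {
  const errors = [];
  const isString = (value) => typeof value === "string";

  if (!Array.isArray(doc.embedding) || doc.embedding.length !== EMBEDDING_DIMENSIONS) {
    errors.push("embedding must be an array of " + EMBEDDING_DIMENSIONS + " numbers");
  } else if (!doc.embedding.every((value) => Number.isFinite(value))) {
    errors.push("embedding must contain only finite numbers");
  }

  // Required string fields.
  for (const field of ["embText", "documentHeader", "chapter", "section"]) {
    if (!isString(doc[field])) errors.push(field + " must be a string");
  }

  // int32 in the schema: require an integer here.
  if (!Number.isInteger(doc.pageNumber)) errors.push("pageNumber must be an integer");

  // Optional fields are checked only when present.
  if (doc.imageLink !== undefined && !(Array.isArray(doc.imageLink) && doc.imageLink.every(isString))) {
    errors.push("imageLink must be an array of strings");
  }
  if (doc.shortSummary !== undefined && !isString(doc.shortSummary)) {
    errors.push("shortSummary must be a string");
  }
  return errors;
}

const sampleChunk = {
  embedding: new Array(EMBEDDING_DIMENSIONS).fill(0),
  embText: "Check the valve seat for wear.",
  documentHeader: "MACHINERY OUTFITTING PART",
  chapter: "7. MAINTENANCE",
  section: "7.1 Preventive Maintenance",
  pageNumber: 42,
};

console.log(validateChunk(sampleChunk)); // []
```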

Publishing to NPM

# Login to NPM
npm login

# Publish (first time)
npm publish --access public

# Publish updates
npm version patch  # or minor, or major
npm publish

License

MIT

Contributing

Contributions are welcome! Please open an issue or submit a pull request.

Support

For issues and questions:

GitHub Issues: https://github.com/yourusername/mcp-typesense-server/issues

Acknowledgments

Built with Model Context Protocol SDK
Powered by Typesense
Image processing by Sharp
