@dolusoft/hirebase-mcp

v1.4.0

Published 4 months ago

HireBase - AI-powered CV search engine with LanceDB and MCP

Vulnerabilities: 0 high, 0 medium, 0 low

Maintainers: zahidzorbaz, sametdemir, beyz

Keywords: mcp, mcp-server, rag, lancedb, cv, resume, recruitment, hiring, ai, vector-search, semantic-search, openai, embeddings, ocr, tesseract

HireBase

AI-powered recruitment platform built as an MCP Server. Parses CVs, extracts structured data with GPT, stores them in a LanceDB vector database, and manages the entire hiring pipeline — from candidate matching to interview scheduling.

Architecture

DDD + Clean Architecture with clear separation of concerns:

src/
├── domain/           # Entities, Value Objects, Repository Interfaces
├── application/      # Use Cases, Ports, DTOs
├── infrastructure/   # LanceDB, OpenAI, PDF/DOCX/OCR Parsers
├── interface/        # MCP Server, Tools, Dashboard, CLI
└── shared/           # Errors, Types, Utilities

Design Principle: MCP Server = Data layer (parse, store, search, return). The AI client using the tools handles analysis, ranking strategy, and match explanations.

Tech Stack

| Technology | Purpose |
|---|---|
| TypeScript 5.9 | Language |
| LanceDB | Vector database (embedded, serverless) |
| OpenAI text-embedding-3-small | 1536-dim embeddings |
| OpenAI gpt-5-mini | Structured CV extraction (Responses API) |
| MCP SDK | Model Context Protocol server |
| Nuxt 4 + Nuxt UI | Real-time dashboard |
| unpdf | PDF text extraction |
| mammoth | DOCX text extraction |
| Tesseract.js | OCR for image-based PDFs |

MCP Tools (38)

CV Management

| Tool | Description |
|---|---|
| add_cv | Parse CV file (PDF/DOCX), extract structured data, embed, store |
| update_cv | Update existing CV, archive old version |
| delete_cv | Delete candidate and all related data |
| get_cv_detail | Full candidate data with download URL |
| get_cv_chunks | CV section chunks with optional section filter |
| get_cv_versions | Version history |
| list_cvs | Paginated candidate list with sorting |
| manage_tags | Add/remove/list tags |
| check_duplicate_candidate | Check if a candidate already exists by name or email before importing |

Search & Matching

| Tool | Description |
|---|---|
| semantic_search | Vector similarity search with optional section filter |
| filter_candidates | Structured filtering (skills, location, experience, languages, tags) |
| match_candidates | Composite scoring: semantic similarity + skill overlap + loyalty factor |
| compare_candidates | Compare 2–5 candidates side by side, optionally with job skill match scores |
| export_results | Export as JSON or CSV |

Job Posting Management

| Tool | Description |
|---|---|
| add_job_posting | Create job posting with required/preferred skills and metadata |
| list_job_postings | List all job postings |
| get_job_posting | Retrieve specific job posting details |
| update_job_posting | Update existing job posting |
| delete_job_posting | Delete a job posting |

Reference CVs

| Tool | Description |
|---|---|
| add_reference_cv | Mark a candidate as a reference/benchmark CV for a job posting |
| list_reference_cvs | List all reference CVs for a job posting |
| remove_reference_cv | Remove a reference CV from a job posting |

Recruitment Pipeline

| Tool | Description |
|---|---|
| add_to_pipeline | Add candidate to a job posting pipeline |
| update_pipeline_status | Update application status |
| add_pipeline_note | Add notes, call attempts, or interview records |
| set_pending_action | Track pending actions with due dates and owner (us/candidate) |
| get_pipeline | Full pipeline view for a job posting |
| get_pipeline_candidates | Lightweight list of candidates in a pipeline (ID, name, status only) |
| get_candidate_history | Candidate's recruitment history across all postings |
| get_pending_actions | All overdue and upcoming pending actions |

Pipeline statuses: new → contacted → unreachable · not_interested · interview_scheduled → no_show · interviewed → rejected · offer_sent → hired

Screening

| Tool | Description |
|---|---|
| get_screening_history | Shortlisted, screened out, and in-pipeline candidates for a job posting |
| generate_screening_report | Summary report with statistics, shortlisted, screened out, and pending candidates |

Call Batches

| Tool | Description |
|---|---|
| create_call_batch | Create a call batch: assign candidates to a person for calling |
| get_call_batch_status | Status of a call batch: candidates, phone numbers, and call outcomes |
| update_call_outcome | Update call outcome: pending, reached, or unreachable |

System

| Tool | Description |
|---|---|
| get_stats | Database statistics (candidates, skills, locations, etc.) |
| get_server_status | Server version, dashboard URL, connected clients, config |
| reset_database | Drop and recreate all tables |

How It Works

CV Ingestion

CV File (PDF/DOCX/Scanned PDF)
    │
    ▼
[1] Parse text (unpdf / mammoth / Tesseract.js OCR)
    │
    ▼
[2] Extract structured data (GPT-5 mini)
    │  name, experience, education, skills, languages...
    ▼
[3] Chunk into sections
    │  summary, each experience, education, skills, projects, certifications, full text
    ▼
[4] Generate embeddings (text-embedding-3-small, 1536d)
    │
    ▼
[5] Store in LanceDB (candidates + cv_chunks + cv_versions)
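
Step [3] can be illustrated with a minimal sketch. The `ExtractedCv` and `CvChunk` shapes, the `chunkCv` function, and the section labels are hypothetical (the README does not show the real HireBase types); the only behavior taken from the text is that each work experience becomes its own chunk and the full text is kept as a final chunk.

```typescript
// Hypothetical shapes for illustration; the real HireBase types are not shown in this README.
interface Experience {
  company: string;
  title: string;
  description: string;
}

interface ExtractedCv {
  summary?: string;
  experience: Experience[];
  education: string[];
  skills: string[];
  projects?: string[];
  certifications?: string[];
  fullText: string;
}

interface CvChunk {
  section: string; // assumed labels: "summary", "experience", "education", ...
  text: string;
}

function chunkCv(cv: ExtractedCv): CvChunk[] {
  const chunks: CvChunk[] = [];

  // Push a chunk for a list-valued section only when it has content.
  const pushList = (section: string, items: string[] | undefined, sep: string): void => {
    if (items && items.length > 0) {
      chunks.push({ section, text: items.join(sep) });
    }
  };

  if (cv.summary) {
    chunks.push({ section: "summary", text: cv.summary });
  }

  // One chunk per work experience, so a search can hit a single role.
  for (const exp of cv.experience) {
    chunks.push({
      section: "experience",
      text: `${exp.title} at ${exp.company}\n${exp.description}`,
    });
  }

  pushList("education", cv.education, "\n");
  pushList("skills", cv.skills, ", ");
  pushList("projects", cv.projects, "\n");
  pushList("certifications", cv.certifications, "\n");

  // The whole document is always kept as a final catch-all chunk.
  chunks.push({ section: "full_text", text: cv.fullText });
  return chunks;
}
```

In this sketch the `text` of every chunk would then be sent for embedding in step [4], and the `section` label is what a section filter would match against.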

Candidate Matching

match_candidates uses a composite scoring algorithm:

compositeScore = (0.5 × vectorScore) + (0.3 × skillOverlap) + (0.2 × loyaltyFactor)

| Component | Weight | Description |
|---|---|---|
| Vector Score | 50% | Semantic similarity between CV and job description |
| Skill Overlap | 30% | Ratio of matched required skills to total required skills |
| Loyalty Factor | 20% | Candidate stability: stable = 1.0, moderate = 0.7, flight_risk = 0.4 |
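
The formula and weights above can be written out as a short sketch. The function names and the case-insensitive skill comparison are assumptions made for illustration; only the weights, the skill ratio, and the three loyalty values come from the table.

```typescript
type Loyalty = "stable" | "moderate" | "flight_risk";

// Loyalty values as listed in the table above.
const LOYALTY_FACTOR: Record<Loyalty, number> = {
  stable: 1.0,
  moderate: 0.7,
  flight_risk: 0.4,
};

// Ratio of matched required skills to total required skills.
// Assumption: skills are compared case-insensitively after trimming.
function skillOverlap(requiredSkills: string[], candidateSkills: string[]): number {
  if (requiredSkills.length === 0) return 0; // assumption: no requirements means no overlap signal
  const have = new Set(candidateSkills.map((s) => s.trim().toLowerCase()));
  const matched = requiredSkills.filter((s) => have.has(s.trim().toLowerCase())).length;
  return matched / requiredSkills.length;
}

// compositeScore = 0.5 * vectorScore + 0.3 * skillOverlap + 0.2 * loyaltyFactor
function compositeScore(
  vectorScore: number, // semantic similarity, assumed to be in [0, 1]
  requiredSkills: string[],
  candidateSkills: string[],
  loyalty: Loyalty,
): number {
  return (
    0.5 * vectorScore +
    0.3 * skillOverlap(requiredSkills, candidateSkills) +
    0.2 * LOYALTY_FACTOR[loyalty]
  );
}
```

As a worked example, a candidate with a vector score of 0.8 who matches 2 of 4 required skills and is rated moderate scores 0.5 × 0.8 + 0.3 × 0.5 + 0.2 × 0.7 = 0.40 + 0.15 + 0.14 = 0.69.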

Recruitment Pipeline

Job Posting
    │
    ▼
[1] match_candidates → ranked candidate list
    │
    ▼
[2] add_to_pipeline → create applications (status: new)
    │
    ▼
[3] update_pipeline_status → track progress through hiring stages
    │  add_pipeline_note → log calls, interviews, notes
    │  set_pending_action → assign follow-up tasks
    ▼
[4] get_pending_actions → monitor overdue/upcoming items
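
Step [4] distinguishes overdue from upcoming items. A minimal sketch of that split is shown below; the `PendingAction` fields and the `splitPendingActions` helper are hypothetical, since the README only states that pending actions carry a due date and an owner (us/candidate).

```typescript
// Hypothetical record shape; only "due date" and "owner (us/candidate)" come from the README.
interface PendingAction {
  candidateId: string;
  description: string;
  owner: "us" | "candidate";
  dueDate: string; // ISO 8601 timestamp, e.g. "2025-01-10T09:00:00Z"
}

interface PendingActionView {
  overdue: PendingAction[];
  upcoming: PendingAction[];
}

function splitPendingActions(actions: PendingAction[], now: Date): PendingActionView {
  const dueTime = (a: PendingAction): number => new Date(a.dueDate).getTime();
  const byDueDate = (a: PendingAction, b: PendingAction): number => dueTime(a) - dueTime(b);

  // An action is overdue when its due date is strictly before "now".
  const overdue = actions.filter((a) => dueTime(a) < now.getTime()).sort(byDueDate);
  const upcoming = actions.filter((a) => dueTime(a) >= now.getTime()).sort(byDueDate);

  return { overdue, upcoming };
}
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the split deterministic and easy to test.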

Dashboard

HireBase includes a real-time dashboard built with Nuxt 4 + Nuxt UI + Tailwind CSS. The dashboard connects via WebSocket and provides live tracking of MCP tool executions.

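Control the dashboard via environment variables. It is enabled by default, and a port of 0 (the default) auto-assigns a free port; set a fixed port like this: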

DASHBOARD_ENABLED=true
DASHBOARD_PORT=61496

The dashboard URL is available in get_server_status output.

Database Schema

LanceDB tables:

| Table | Purpose |
|---|---|
| candidates | Candidate profiles (name, email, skills, experience, loyalty score, tags) |
| cv_chunks | Searchable CV sections with 1536-dim vector embeddings |
| cv_versions | Archived CV versions for update history |
| job_postings | Job listings with required/preferred skills and status |
| applications | Pipeline records linking candidates to jobs with status tracking |
| application_events | Audit trail: status changes, call attempts, interview records |

Installation

Prerequisites

Node.js >= 22
OpenAI API key: required for CV extraction and embeddings. Get one at platform.openai.com/api-keys

Important: HireBase will not start without a valid OPENAI_API_KEY. If you see the error "OPENAI_API_KEY environment variable is required", make sure the key is set in your MCP configuration (see below).

Claude Code (recommended)

No global install needed — npx downloads and runs the latest version automatically:

claude mcp add hirebase -s user \
  -e OPENAI_API_KEY=sk-your-key \
  -e DASHBOARD_ENABLED=true \
  -- npx -y @dolusoft/hirebase-mcp

That's it. Restart Claude Code and HireBase tools will be available.

Claude Desktop

Add to your config file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "hirebase": {
      "command": "npx",
      "args": ["-y", "@dolusoft/hirebase-mcp"],
      "env": {
        "OPENAI_API_KEY": "sk-your-key",
        "DASHBOARD_ENABLED": "true"
      }
    }
  }
}

Other MCP Clients

Any MCP-compatible client can use HireBase. The key requirement is passing OPENAI_API_KEY as an environment variable:

# Run directly
OPENAI_API_KEY=sk-your-key npx -y @dolusoft/hirebase-mcp

# Or install globally
npm install -g @dolusoft/hirebase-mcp
OPENAI_API_KEY=sk-your-key hirebase-mcp

From Source

git clone https://github.com/dolusoft/hirebase.git
cd hirebase
pnpm install
pnpm build
cp .env.example .env  # edit .env and add your OPENAI_API_KEY
pnpm start

Environment Variables

| Variable | Default | Description |
|---|---|---|
| OPENAI_API_KEY | (none) | Required. OpenAI API key for embeddings and CV extraction |
| LANCEDB_PATH | ./data/lancedb | LanceDB storage path (relative to CWD) |
| EMBEDDING_MODEL | text-embedding-3-small | OpenAI embedding model |
| EXTRACTION_MODEL | gpt-5-mini | OpenAI model for structured CV extraction |
| DASHBOARD_ENABLED | true | Enable real-time dashboard |
| DASHBOARD_PORT | 0 (random) | Dashboard server port (0 = auto-assign) |
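
The defaults in the table can be summarized as a small loader. This is a sketch only: `HireBaseConfig` and `loadConfig` are hypothetical names, and the parsing of DASHBOARD_ENABLED (anything other than "false" counts as enabled) is an assumption, not the server's documented behavior.

```typescript
// Hypothetical config shape mirroring the variables in the table above.
interface HireBaseConfig {
  openaiApiKey: string;
  lancedbPath: string;
  embeddingModel: string;
  extractionModel: string;
  dashboardEnabled: boolean;
  dashboardPort: number;
}

function loadConfig(env: Record<string, string | undefined>): HireBaseConfig {
  const openaiApiKey = env.OPENAI_API_KEY;
  if (!openaiApiKey) {
    // Same message the README says the server reports when the key is missing.
    throw new Error("OPENAI_API_KEY environment variable is required");
  }

  const dashboardPort = Number(env.DASHBOARD_PORT ?? "0");
  if (!Number.isInteger(dashboardPort) || dashboardPort < 0 || dashboardPort > 65535) {
    throw new Error(`Invalid DASHBOARD_PORT: ${env.DASHBOARD_PORT}`);
  }

  return {
    openaiApiKey,
    lancedbPath: env.LANCEDB_PATH ?? "./data/lancedb",
    embeddingModel: env.EMBEDDING_MODEL ?? "text-embedding-3-small",
    extractionModel: env.EXTRACTION_MODEL ?? "gpt-5-mini",
    // Assumption: only the literal string "false" (any case) disables the dashboard.
    dashboardEnabled: (env.DASHBOARD_ENABLED ?? "true").toLowerCase() !== "false",
    dashboardPort, // 0 means "let the OS pick a free port"
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`; taking the environment as a parameter keeps the function testable without touching real environment variables.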

Key Design Decisions

LanceDB has no UPDATE — update is implemented as delete + add at repository level
Complex fields stored as JSON strings — skills, experience, etc. serialized in Utf8 columns
Batch embedding — all chunks embedded in a single OpenAI API call
Seed rows — tables use __seed__ rows for schema inference, filtered out in queries
Versioning — old CV data archived before updates, full history accessible
Section-level chunking — each work experience is a separate chunk for granular matching
OCR fallback — image-based PDFs automatically processed with Tesseract.js
Composite scoring — matching combines semantic, skill-based, and loyalty signals
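
Three of the decisions above (update as delete + add, JSON-string columns, and seed rows) can be shown together in a small sketch. It uses an in-memory stand-in rather than the LanceDB client, and every name in it (`InMemoryTable`, `CandidateRepository`, the row shape, and identifying the seed row by its `id`) is an assumption for illustration.

```typescript
// Hypothetical row shape: complex fields are stored as JSON strings.
interface CandidateRow {
  id: string;
  name: string;
  skills: string; // JSON-encoded string[]
}

interface Candidate {
  id: string;
  name: string;
  skills: string[];
}

// Assumption: the seed row is recognized by a reserved id.
const SEED_ID = "__seed__";

// In-memory stand-in for a table that supports only add and delete.
class InMemoryTable {
  // The seed row exists only so the table has a row to infer its schema from.
  private rows: CandidateRow[] = [{ id: SEED_ID, name: "", skills: "[]" }];

  add(rows: CandidateRow[]): void {
    this.rows.push(...rows);
  }

  delete(id: string): void {
    this.rows = this.rows.filter((r) => r.id !== id);
  }

  all(): CandidateRow[] {
    return [...this.rows];
  }
}

class CandidateRepository {
  private readonly table: InMemoryTable;

  constructor(table: InMemoryTable) {
    this.table = table;
  }

  // "Update" is expressed as delete + add at the repository level.
  save(candidate: Candidate): void {
    this.table.delete(candidate.id);
    this.table.add([
      { id: candidate.id, name: candidate.name, skills: JSON.stringify(candidate.skills) },
    ]);
  }

  // Seed rows are filtered out, and JSON columns are parsed back into arrays.
  list(): Candidate[] {
    return this.table
      .all()
      .filter((r) => r.id !== SEED_ID)
      .map((r) => ({ id: r.id, name: r.name, skills: JSON.parse(r.skills) as string[] }));
  }
}
```

Saving the same candidate twice leaves a single row with the latest values, which is the observable behavior an in-place update would have.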

License

MIT
