@gherk/requirements-extractor

v1.2.0

MCP server that extracts, classifies and generates structured requirements from PDF documents using Ollama LLM

An MCP (Model Context Protocol) server that extracts, classifies, and generates structured requirement documents from PDF files using a local Ollama LLM.

Features

  • 3-Pass AI Pipeline — Identify → Classify → Generate requirements from any PDF
  • Async Processing — Returns a job ID immediately; poll for incremental results
  • 6 Categories — Backend, Frontend, Mobile, Infrastructure, DevOps, Non-Functional
  • Structured Output — Generates individual markdown files and/or JSON per requirement
  • Crash-Resilient — Files are written incrementally during generation (survives interruptions)
  • Local LLM — Powered by Ollama (default: qwen3:8b), no external API keys required

Prerequisites

  • Node.js ≥ 18

  • Ollama running locally with a model pulled:

    ollama pull qwen3:8b

Installation

npx -y @gherk/requirements-extractor

Or install globally:

npm install -g @gherk/requirements-extractor

MCP Configuration

Add to your MCP client configuration (e.g., mcp-servers.json):

{
  "name": "requirements-extractor",
  "command": "npx",
  "args": ["-y", "@gherk/requirements-extractor"],
  "enabled": true
}

Tools

extract_requirements

Starts an asynchronous extraction job. Returns a jobId immediately.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| pdfPath | string | ✅ | n/a | Absolute path to the PDF file |
| outputDir | string | ✅ | n/a | Absolute path to the output directory |
| projectPrefix | string | ❌ | "ACS" | Prefix for requirement IDs (e.g., ACS_BE_01) |
| outputFormat | "markdown" \| "json" \| "both" | ❌ | "markdown" | Output format |
| filterCategories | string[] | ❌ | all | Only generate for these categories |
| model | string | ❌ | "qwen3:8b" | Ollama model to use |
| ollamaUrl | string | ❌ | "http://localhost:11434" | Ollama API URL |

Response:

{
  "jobId": "a1b2c3d4",
  "status": "running",
  "message": "Extraction started. Use get_extraction_progress to poll for results."
}
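
The parameter table above can also be read as a typed argument object. The sketch below is illustrative only: the `ExtractRequirementsArgs` interface and the `withDefaults` helper are hypothetical names that mirror the documented parameters and defaults, not code exported by this package.

```typescript
// Hypothetical types mirroring the documented extract_requirements parameters.
type OutputFormat = "markdown" | "json" | "both";

interface ExtractRequirementsArgs {
  pdfPath: string; // absolute path to the PDF file
  outputDir: string; // absolute path to the output directory
  projectPrefix?: string;
  outputFormat?: OutputFormat;
  filterCategories?: string[]; // omitted means all categories
  model?: string;
  ollamaUrl?: string;
}

// Fill in the defaults listed in the table. filterCategories is left
// undefined when omitted, which the table documents as "all".
function withDefaults(args: ExtractRequirementsArgs) {
  return {
    ...args,
    projectPrefix: args.projectPrefix ?? "ACS",
    outputFormat: args.outputFormat ?? "markdown",
    model: args.model ?? "qwen3:8b",
    ollamaUrl: args.ollamaUrl ?? "http://localhost:11434",
  };
}

// Example: only the two required paths and one override are supplied.
const exampleArgs = withDefaults({
  pdfPath: "/docs/spec.pdf",
  outputDir: "/docs/requirements",
  outputFormat: "both",
});
```

The server applies its own defaults, so a client does not need a helper like this; it is shown only to make the effective argument set explicit.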

get_extraction_progress

Polls a running job for incremental results.

| Parameter | Type | Required | Default | Description |
|-----------|------|----------|---------|-------------|
| jobId | string | ✅ | n/a | Job ID from extract_requirements |
| since | number | ❌ | 0 | Return requirements generated after this index |

Response:

{
  "jobId": "a1b2c3d4",
  "status": "running",
  "phase": "Pass 3: Generating markdown...",
  "totalIdentified": 42,
  "totalToGenerate": 38,
  "done": 12,
  "finished": false,
  "new": [
    {
      "id": "ACS_BE_01",
      "title": "User Authentication",
      "category": "backend",
      "category_label": "Backend",
      "type": "Funcional",
      "page_range": "5-7",
      "description": "JWT-based authentication system...",
      "acceptance_criteria": ["Given a valid token...", "..."]
    }
  ]
}
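
A polling loop over this tool might look like the sketch below. It assumes that `since` can be set to the number of requirements already received, which matches the parameter description but is not spelled out in detail. `CallTool` is a hypothetical stand-in for whatever function your MCP client uses to invoke a tool, and `collectRequirements` is an illustrative name.

```typescript
// Shapes taken from the example response above; only the fields used here are typed.
interface RequirementSummary {
  id: string;
  title: string;
  category: string;
}

interface ProgressResponse {
  jobId: string;
  status: string;
  phase: string;
  done: number;
  finished: boolean;
  new: RequirementSummary[];
}

// Hypothetical adapter: invokes an MCP tool by name and resolves to its parsed JSON result.
type CallTool = (name: string, args: Record<string, unknown>) => Promise<unknown>;

async function collectRequirements(
  callTool: CallTool,
  jobId: string,
  intervalMs = 2000,
): Promise<RequirementSummary[]> {
  const collected: RequirementSummary[] = [];
  for (;;) {
    // Assumption: `since` is the count of requirements already received.
    const progress = (await callTool("get_extraction_progress", {
      jobId,
      since: collected.length,
    })) as ProgressResponse;
    collected.push(...progress.new);
    if (progress.finished) return collected;
    // Wait before the next poll so the server is not hammered.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A real client would probably also want an overall timeout and a check for a failed job status, neither of which is covered here.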

list_jobs

Lists all extraction jobs (running and completed). Takes no parameters.

Response:

[
  { "id": "a1b2c3d4", "status": "completed", "phase": "Done", "done": 38, "total": 38 },
  { "id": "e5f6g7h8", "status": "running", "phase": "Pass 2: Classifying...", "done": 0, "total": 0 }
]

list_requirements

Lists all generated requirement files in a directory, organized by category.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| requirementsDir | string | ✅ | Absolute path to the requirements directory |


get_requirement

Returns the full content of a specific requirement file.

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| requirementsDir | string | ✅ | Absolute path to the requirements directory |
| requirementId | string | ✅ | Requirement ID (e.g., BE_01) or filename (e.g., ACS_BE_01_auth.md) |

Pipeline Architecture

PDF → Parse (5-page chunks)
  ↓
Pass 1: Identify — LLM extracts raw requirements from each chunk
  ↓
Pass 2: Classify — LLM assigns category (backend/frontend/mobile/infra/devops/nf)
  ↓
Sequence — Assign IDs: {PREFIX}_{CATEGORY}_{SEQ} (e.g., ACS_BE_01)
  ↓
Pass 3: Generate — LLM generates structured markdown per requirement
  ↓
Write — Each file written immediately (crash-resilient)
  ↓
Summary — README.md + requirements.json in output directory
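
The parse step at the top of the pipeline can be pictured as splitting the document into page ranges before Pass 1 runs. The sketch below assumes fixed, non-overlapping 5-page windows with a shorter final chunk. `chunkPages` and `PageChunk` are hypothetical names, and the package may chunk differently (for example, on extracted text rather than page numbers).

```typescript
// A page range, 1-based and inclusive on both ends.
interface PageChunk {
  start: number;
  end: number;
}

// Split a document of `totalPages` pages into consecutive chunks of at most
// `chunkSize` pages. The last chunk is shorter when the count does not divide evenly.
function chunkPages(totalPages: number, chunkSize = 5): PageChunk[] {
  const chunks: PageChunk[] = [];
  for (let start = 1; start <= totalPages; start += chunkSize) {
    chunks.push({ start, end: Math.min(start + chunkSize - 1, totalPages) });
  }
  return chunks;
}

// A 12-page PDF yields the ranges 1-5, 6-10 and 11-12.
const chunks = chunkPages(12);
```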

Categories

| ID | Prefix | Label | Description |
|----|--------|-------|-------------|
| backend | BE | Backend | APIs, services, business logic, integrations |
| frontend | FE | Frontend | Web interfaces, dashboards, forms, portals |
| mobile | MB | Mobile | iOS/Android apps, Bluetooth, GPS, push notifications |
| infra | IF | Infrastructure | Databases, schemas, storage, message queues, caching |
| devops | DO | DevOps | CI/CD, environments, containers, monitoring |
| non-functional | NF | Non-Functional | Security, performance, scalability, SLAs, GDPR |
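
Combining these prefixes with the `{PREFIX}_{CATEGORY}_{SEQ}` scheme from the pipeline, ID assignment could be sketched as follows. The per-category counter and the two-digit zero padding are assumptions inferred from the example IDs (ACS_BE_01, ACS_BE_02, ACS_FE_01), and `assignIds` is a hypothetical helper rather than the package's actual function.

```typescript
// Category ID to two-letter prefix, copied from the table above.
const CATEGORY_PREFIX: Record<string, string> = {
  backend: "BE",
  frontend: "FE",
  mobile: "MB",
  infra: "IF",
  devops: "DO",
  "non-functional": "NF",
};

// Assign one ID per classified requirement, in input order.
// Assumption: the sequence number restarts at 01 for each category.
function assignIds(projectPrefix: string, categories: string[]): string[] {
  const counters = new Map<string, number>();
  return categories.map((category) => {
    const code = CATEGORY_PREFIX[category];
    if (code === undefined) throw new Error(`Unknown category: ${category}`);
    const seq = (counters.get(category) ?? 0) + 1;
    counters.set(category, seq);
    return `${projectPrefix}_${code}_${String(seq).padStart(2, "0")}`;
  });
}

// Two backend requirements and one frontend requirement.
const ids = assignIds("ACS", ["backend", "frontend", "backend"]);
```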

Output Structure

outputDir/
├── backend/
│   ├── ACS_BE_01_user_authentication.md
│   └── ACS_BE_02_payment_api.md
├── frontend/
│   └── ACS_FE_01_dashboard.md
├── mobile/
│   └── ACS_MB_01_push_notifications.md
├── infra/
├── devops/
├── non-functional/
│   └── ACS_NF_01_gdpr_compliance.md
├── README.md          # Summary with requirement counts
└── requirements.json  # All requirements as structured JSON
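
The file names in the tree above appear to follow the pattern `{id}_{slug}.md` inside a folder named after the category. The sketch below reconstructs such a path under that assumption. The slug rule (lowercased title with underscores) is a guess that happens to reproduce `ACS_BE_01_user_authentication.md`; the package may derive a shorter or different slug. `slugify` and `requirementPath` are hypothetical names.

```typescript
interface RequirementRef {
  id: string; // e.g. "ACS_BE_01"
  category: string; // category ID, also used as the folder name
  title: string; // e.g. "User Authentication"
}

// Assumed slug rule: lowercase, runs of non-alphanumerics become "_",
// leading and trailing underscores are trimmed.
function slugify(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "_")
    .replace(/^_+|_+$/g, "");
}

// Build the expected markdown path for a requirement under outputDir.
// outputDir is expected without a trailing slash.
function requirementPath(outputDir: string, req: RequirementRef): string {
  return `${outputDir}/${req.category}/${req.id}_${slugify(req.title)}.md`;
}

const examplePath = requirementPath("/docs/requirements", {
  id: "ACS_BE_01",
  category: "backend",
  title: "User Authentication",
});
```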

Gherk Integration

When used within the Gherk platform, the MCP server is called through a multi-layer asynchronous pipeline:

Frontend (ImporterComponent)
  │  POST /tickets/analyze-requirements (file upload)
  ▼
server-go (RequirementsHandler)
  │  Returns { jobId } immediately
  │  Spawns goroutine →
  ▼
agent-go (/ai/extract-requirements)
  │  Calls MCP tool: extract_requirements
  │  MCP returns { jobId } → agent-go polls get_extraction_progress
  │  When done, reads requirements.json from outputDir
  ▼
agent-go → maps MCP output to AnalyzedRequirementV2:
  │  • category → side (frontend/backend/both)
  │  • Adds priority, roles, acceptanceCriteria
  ▼
server-go → broadcasts via WebSocket:
  │  • requirements_progress (phase updates)
  │  • requirements_complete (final results)
  │  • requirements_failed (errors)
  ▼
Frontend (ImporterStore) — receives results, shows preview

Category-to-Side Mapping

| MCP Category | Gherk Side |
|-------------|------------|
| frontend | frontend |
| mobile | frontend |
| backend | backend |
| infra | both |
| devops | both |
| non-functional | both |
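
The mapping above is small enough to express as a lookup table. In Gherk the mapping is performed by agent-go, so the TypeScript below is only an illustration of the table, with hypothetical names (`McpCategory`, `Side`, `sideForCategory`).

```typescript
type McpCategory =
  | "backend"
  | "frontend"
  | "mobile"
  | "infra"
  | "devops"
  | "non-functional";

type Side = "frontend" | "backend" | "both";

// One entry per row of the mapping table. Using Record<McpCategory, Side>
// makes the compiler flag any category that is left unmapped.
const CATEGORY_TO_SIDE: Record<McpCategory, Side> = {
  frontend: "frontend",
  mobile: "frontend",
  backend: "backend",
  infra: "both",
  devops: "both",
  "non-functional": "both",
};

function sideForCategory(category: McpCategory): Side {
  return CATEGORY_TO_SIDE[category];
}

// Mobile requirements are treated as frontend work.
const mobileSide = sideForCategory("mobile");
```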

Key Files

| File | Description |
|------|-------------|
| agent-go/extract_requirements_handler.go | Calls MCP, reads requirements.json, maps to Gherk format |
| agent-go/internal/mcp/mcp-servers.json | MCP registration (timeoutSeconds: 0) |
| server-go/.../requirements.go | Async handler, WebSocket broadcasting |
| desktop-app/.../importer.store.ts | Frontend state (survives navigation) |

Development

# Install dependencies (run from the root of a cloned copy of the repository)
npm install

# Build
npm run build

# Watch mode
npm run dev

# Run locally
npm start

License

MIT