agentforge-clinical-agent
v1.0.0
LangChain.js clinical AI agent for OpenEMR — 10 tools, verification layer, Langfuse observability, 125 eval cases
OpenEMR Clinical Query Agent
AI-powered clinical query agent for OpenEMR. Handles discharge summaries, medication reconciliation, drug interactions, and patient-friendly discharge instructions — all via natural language. Built for the AgentForge / Gauntlet AI bounty.
Live Demo: https://agent-production-6f7a.up.railway.app
Architecture
Agent: LangChain.js + Claude Sonnet 4 using createToolCallingAgent with native tool-calling. The agent reasons over clinical queries, selects from 10 tools, executes multi-step workflows (up to 10 iterations), and synthesizes results with source attribution and safety verification.
10 Tools:
| Tool | Purpose |
|------|---------|
| get_patient_summary | Demographics, conditions, medications, allergies, vitals |
| get_medications | Active medication list with dose, frequency, prescriber |
| drug_interaction_check | Pairwise interaction check with severity gating |
| allergy_check | Direct + cross-reactivity allergy matching |
| get_lab_results | Lab values with normal/abnormal/critical flags |
| get_encounter_data | Hospital encounters, diagnoses, procedures, course notes |
| reconcile_medications | Admission vs. discharge medication comparison |
| draft_discharge_summary | Multi-source aggregation into structured clinician-facing summary |
| generate_discharge_instructions | Patient-facing instructions + DailyMed education + appointments |
| save_to_chart | Stateful document CRUD with draft/finalize workflow |
Data Sources:
- Mock JSON or OpenEMR FHIR R4 — Patient data via configurable DATA_SOURCE env var
- DailyMed REST API (NLM/NIH) — FDA-approved drug labeling for patient education
- OpenFDA — Drug interaction label data
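A minimal sketch of how the DATA_SOURCE switch could be resolved at startup. Only the variable name and the `fhir` value come from this README; the `mock` value, the default, and the function name `resolveDataSource` are assumptions made for illustration.

```typescript
// Hypothetical sketch: "mock" as a value and as the default are assumptions.
type DataSource = "mock" | "fhir";

// Takes the environment as a parameter so it can be tested without process.env.
function resolveDataSource(env: Record<string, string | undefined>): DataSource {
  const raw = (env.DATA_SOURCE ?? "mock").trim().toLowerCase();
  if (raw === "mock" || raw === "fhir") {
    return raw;
  }
  // Fail fast on unknown values rather than silently falling back.
  throw new Error(`Unsupported DATA_SOURCE "${raw}"; expected "mock" or "fhir"`);
}
```

In a Node process this would typically be called once as `resolveDataSource(process.env)`, so that a typo in `.env` surfaces at boot instead of at the first patient query.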
Verification Layer: Post-LLM safety checks on every response — drug interaction severity gate, allergy conflict detection, critical lab flagging, medication change alerts, source attribution, and medical disclaimer.
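One of these checks, the drug interaction severity gate, could look roughly like the sketch below. It assumes the gate receives the final response text plus the interactions found by the tools, and flags any high-severity pair that the response fails to mention. The severity levels, type names, and the `severityGate` function are hypothetical and are not taken from the project's code.

```typescript
// Hypothetical severity scale; the project's actual levels may differ.
type Severity = "minor" | "moderate" | "major" | "contraindicated";

interface Interaction {
  drugA: string;
  drugB: string;
  severity: Severity;
}

interface GateResult {
  passed: boolean;
  warnings: string[];
}

const SEVERITY_RANK: Record<Severity, number> = {
  minor: 0,
  moderate: 1,
  major: 2,
  contraindicated: 3,
};

// Flags interactions at or above the threshold whose two drugs are not both
// named in the response text (a simple case-insensitive substring check).
function severityGate(
  response: string,
  interactions: Interaction[],
  threshold: Severity = "major",
): GateResult {
  const text = response.toLowerCase();
  const warnings: string[] = [];
  for (const ix of interactions) {
    if (SEVERITY_RANK[ix.severity] < SEVERITY_RANK[threshold]) continue;
    const mentioned =
      text.includes(ix.drugA.toLowerCase()) && text.includes(ix.drugB.toLowerCase());
    if (!mentioned) {
      warnings.push(
        `${ix.severity.toUpperCase()} interaction not surfaced: ${ix.drugA} + ${ix.drugB}`,
      );
    }
  }
  return { passed: warnings.length === 0, warnings };
}
```

A caller could append the warnings to the response or block it outright; which of the two the project does is not stated here.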
Observability: Langfuse tracing with per-request spans, session grouping, and feedback correlation.
See docs/ARCHITECTURE.md for the full architecture documentation.
Demo
Patient Selection + Quick Prompts
Select a patient and use quick-prompt buttons for common clinical queries.

Drug Interaction Check
Severity-gated interaction analysis with clinical significance, monitoring requirements, and recommended actions.

Medication Reconciliation
Compare admission vs. discharge medications — flags continued, modified, new, and discontinued meds with reasons.
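The four categories can be illustrated with a small comparison routine. This is a sketch under simple assumptions: medications are matched by case-insensitive name, and a change in dose or frequency counts as "modified". The `Med` shape and the `reconcile` function are hypothetical, not the reconcile_medications tool itself, and the sketch omits the reasons that the tool attaches to each change.

```typescript
interface Med {
  name: string;
  dose: string;
  frequency: string;
}

type Status = "continued" | "modified" | "new" | "discontinued";

interface ReconEntry {
  name: string;
  status: Status;
}

function reconcile(admission: Med[], discharge: Med[]): ReconEntry[] {
  // Assumption: names identify a medication; real data would use a drug code.
  const key = (m: Med): string => m.name.trim().toLowerCase();
  const before = new Map<string, Med>();
  for (const m of admission) before.set(key(m), m);

  const seen = new Set<string>();
  const result: ReconEntry[] = [];

  for (const d of discharge) {
    const k = key(d);
    seen.add(k);
    const a = before.get(k);
    if (!a) {
      result.push({ name: d.name, status: "new" });
    } else if (a.dose !== d.dose || a.frequency !== d.frequency) {
      result.push({ name: d.name, status: "modified" });
    } else {
      result.push({ name: d.name, status: "continued" });
    }
  }

  // Anything on the admission list that never appeared at discharge.
  for (const a of admission) {
    if (!seen.has(key(a))) result.push({ name: a.name, status: "discontinued" });
  }
  return result;
}
```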

Discharge Summary — Draft, Edit, Finalize
AI drafts a comprehensive discharge summary. Practitioners review, edit, and finalize before saving to chart.

Discharge Instructions + Scheduled Appointments
Patient-friendly instructions with DailyMed drug education, warning signs, and actual scheduled follow-up appointments.

Eval Results
79 eval cases across 17 categories — 87.3% pass rate (69/79) on all 10 tools. p50 latency: 7.2s, p95: 28.7s.
| Category | Passed | Total | Rate |
|----------|--------|-------|------|
| Golden Sets | 25 | 25 | 100% |
| Query Variation | 8 | 8 | 100% |
| Drug Interactions | 5 | 5 | 100% |
| Complex Queries | 4 | 4 | 100% |
| DailyMed | 2 | 2 | 100% |
| Workflows | 3 | 3 | 100% |
| Bounty: Med Rec | 2 | 2 | 100% |
| Bounty: Discharge | 2 | 2 | 100% |
| Bounty: Workflows | 2 | 2 | 100% |
| Bounty: Safety | 2 | 2 | 100% |
| Safety | 4 | 5 | 80% |
| Discharge Instructions | 3 | 4 | 75% |
| Appointments | 2 | 3 | 67% |
| Bounty: Encounters | 2 | 3 | 67% |
| Edge Cases | 2 | 4 | 50% |
| Adversarial | 1 | 4 | 25% |
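For readers who want to check the arithmetic, the sketch below shows one way such figures could be derived from raw results. The function names are hypothetical, and the nearest-rank percentile is an assumption: the eval harness may compute p50 and p95 differently.

```typescript
// Pass rate as a percentage string, e.g. 69 of 79 cases.
function passRate(passed: number, total: number, digits = 1): string {
  if (total <= 0) throw new Error("total must be positive");
  return ((passed / total) * 100).toFixed(digits) + "%";
}

// Nearest-rank percentile: the smallest sample such that at least p percent
// of samples are less than or equal to it. Assumed method, not the project's.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = [...values].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

console.log(passRate(69, 79)); // overall rate: "87.3%"
console.log(passRate(2, 3, 0)); // a per-category rate: "67%"
```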
See evals.md for the full eval framework docs.
Bounty Features
New Data Source: DailyMed (NLM/NIH)
FDA-approved drug labeling data fetched from the DailyMed REST API. Integrated into discharge instructions for patient-friendly drug education with side effects, warnings, and proper citations.
5 Bounty Tools
- get_encounter_data — Retrieve hospital encounter/admission details
- reconcile_medications — Compare admission vs. discharge medications, flag changes
- draft_discharge_summary — AI-generated comprehensive discharge summary
- generate_discharge_instructions — Patient-friendly instructions with DailyMed drug education + scheduled follow-up appointments
- save_to_chart — Stateful document CRUD with draft/finalize workflow
Editable Discharge Drafts
Practitioners can review and edit AI-drafted discharge notes before finalizing. Edit Draft button opens an editable textarea; Save Edit persists changes; Finalize locks the document to the chart.
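The draft, edit, and finalize steps amount to a small state machine in which a finalized document becomes read-only. The in-memory sketch below illustrates that rule; the `ChartStore` class, its method names, and the ID format are hypothetical and do not describe how save_to_chart persists documents.

```typescript
type DocStatus = "draft" | "final";

interface ChartDocument {
  id: string;
  patientId: string;
  content: string;
  status: DocStatus;
}

// Hypothetical in-memory store; a real implementation would persist to the chart.
class ChartStore {
  private docs = new Map<string, ChartDocument>();
  private nextId = 1;

  createDraft(patientId: string, content: string): ChartDocument {
    const doc: ChartDocument = {
      id: `doc-${this.nextId++}`,
      patientId,
      content,
      status: "draft",
    };
    this.docs.set(doc.id, doc);
    return doc;
  }

  // Edits are only allowed while the document is still a draft.
  edit(id: string, content: string): ChartDocument {
    const doc = this.mustGet(id);
    if (doc.status === "final") {
      throw new Error(`Document ${id} is finalized and cannot be edited`);
    }
    doc.content = content;
    return doc;
  }

  // Finalizing locks the document against further edits.
  finalize(id: string): ChartDocument {
    const doc = this.mustGet(id);
    doc.status = "final";
    return doc;
  }

  private mustGet(id: string): ChartDocument {
    const doc = this.docs.get(id);
    if (!doc) throw new Error(`Document ${id} not found`);
    return doc;
  }
}
```

Rejecting edits after finalization in the store, rather than only hiding the button in the UI, keeps the lock in force even if a client sends a stale edit request.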
Scheduled Appointments
Discharge instructions include actual scheduled follow-up appointments with provider name, specialty, date, time, and location.
Setup
cd openemr/agent
npm install
cp .env.example .env
# Edit .env with your ANTHROPIC_API_KEY (required)
# Optional: LANGFUSE_SECRET_KEY, LANGFUSE_PUBLIC_KEY for observability
Run
npm run dev # Development with hot reload
npm start # Production
Open http://localhost:3000 (local) or https://agent-production-6f7a.up.railway.app (production)
Test
npm test # Run 232 unit tests (Vitest)
npm run eval # Run 79 eval cases (requires ANTHROPIC_API_KEY)
FHIR Data Source (OpenEMR Docker)
To use real patient data from OpenEMR:
- Start OpenEMR Docker: docker compose up -d in docker/development-easy/
- Register OAuth2 client: ./scripts/register-oauth-client.sh
- Add FHIR_CLIENT_ID (and FHIR_CLIENT_SECRET if returned) to .env
- Set DATA_SOURCE=fhir in .env
- For self-signed certs: set NODE_TLS_REJECT_UNAUTHORIZED=0 in .env (dev only; this disables TLS certificate verification in Node.js)
- Restart the server
For iframe embedding from OpenEMR, set OPENEMR_ORIGINS=https://localhost:8300 (or your OpenEMR origin). The chat UI reads ?pid= from the URL to auto-select the patient.
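Reading the ?pid= parameter could be done along these lines. This is a sketch only: the function name is hypothetical, and restricting the value to digits is an assumption based on OpenEMR patient IDs being numeric.

```typescript
// Extracts the patient ID from a query string such as "?pid=42".
// Returns null when the parameter is absent or not purely numeric (assumed format).
function patientIdFromSearch(search: string): string | null {
  const pid = new URLSearchParams(search).get("pid");
  return pid !== null && /^\d+$/.test(pid) ? pid : null;
}
```

In the browser this would be called with `window.location.search`; validating the value before using it avoids passing arbitrary URL input into patient lookups.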
Security
See SECURITY.md for the full security audit and remediation checklist. The current MVP runs with mock data — all identified issues must be resolved before connecting to real patient data.
MVP Requirements
- [x] Agent responds to NL queries in healthcare domain
- [x] 3+ functional tools (10 implemented)
- [x] Tool calls execute and return structured results
- [x] Agent synthesizes tool results
- [x] Conversation history maintained
- [x] Basic error handling
- [x] Domain-specific verification (drug interaction severity gate)
- [x] 50+ eval test cases (79 implemented)
- [x] Deployed and publicly accessible (Railway)
- [x] BOUNTY.md with customer, features, data source, impact
- [x] New data source (DailyMed REST API)
- [x] Stateful CRUD operations (document draft/edit/finalize)
