@randalliser/yaml-glossary-server
v3.0.0
Unified MCP Server for YAML Glossary Management
Unified MCP Glossary Server (Orchestrator Architecture)
Version: 3.0
Architecture: Orchestrator pattern with specialised components
Status: Production-ready with comprehensive best practices
Overview
The unified MCP server acts as an orchestrator that delegates all functionality to specialised component modules. This separation of concerns makes the codebase more maintainable, testable, and extensible.
Best Practices Implemented (v3.0)
✅ Security: Robust path sanitisation prevents directory traversal attacks
✅ Configuration: Environment-based config with sensible defaults
✅ Error Handling: Process-level handlers, standardised responses
✅ Testability: Module guard, dependency injection, no side effects
✅ Scalability: Map-based handler registry, extracted tool definitions
✅ Logging: Audit trails to stderr, never pollutes MCP protocol
✅ Consistency: Explicit defaults, predictable behaviour
Architecture
mcp-glossary-server.js (Orchestrator)
↓ delegates to
├── entry-manager.js - CRUD operations for glossary entries
├── search.js - Search and query functionality
├── importer.js - Import from TSV/CSV/JSON
├── file-manager.js - File I/O, backups, schema management
├── linter.js - Validation and quality checks
├── sorter.js - Alphabetical sorting
└── utils.js - Shared utilities
Components
1. Entry Manager (entry-manager.js)
Responsibilities:
- Add new glossary entries with validation
- Update existing entries
- Delete entries with reference checking
- Alphabetical insertion
Key Methods:
- addEntry(glossaryPath, entry) - Add new entry
- updateEntry(glossaryPath, entryId, updates) - Update entry
- deleteEntry(glossaryPath, entryId) - Delete with safety checks
- insertAlphabetically(terms, newEntry) - Maintain sort order
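Alphabetical insertion can be pictured with a minimal sketch like the one below. It assumes entries are ordered by `id` (as the Sorter section describes) and compared with `localeCompare`; the function body is illustrative only and may differ from the real `insertAlphabetically`.

```javascript
// Hypothetical sketch: insert an entry into an id-sorted array without re-sorting.
function insertAlphabetically(terms, newEntry) {
  // Find the first existing entry whose id sorts after the new one.
  const index = terms.findIndex((t) => t.id.localeCompare(newEntry.id) > 0);
  const result = terms.slice(); // copy so the caller's array is left untouched
  if (index === -1) {
    result.push(newEntry); // new id sorts after everything: append
  } else {
    result.splice(index, 0, newEntry);
  }
  return result;
}

const terms = [{ id: "alpha" }, { id: "campus" }, { id: "zeta" }];
const updated = insertAlphabetically(terms, { id: "beta" });
console.error(updated.map((t) => t.id).join(", "));
```

Returning a copy rather than mutating the input is a design choice of this sketch, made so the caller can compare before and after.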
2. Search Engine (search.js)
Responsibilities:
- Flexible text search across terms, IDs, definitions, aliases
- Category filtering
- Result formatting
Key Methods:
- searchEntries(glossaryPath, query, options) - Main search
- findById(glossaryPath, id) - Exact ID lookup
- getByCategory(glossaryPath, category) - Category filter
- formatResults(results, verbose) - Display formatting
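As a rough illustration of the search behaviour described above, the sketch below filters entries that are already parsed into memory. The name `searchTerms`, the case-insensitive substring matching, and the `options.category` shape are assumptions for this example; the real `searchEntries` reads from `glossaryPath` and may match differently.

```javascript
// Hypothetical sketch: search parsed entries across id, term, definition and aliases.
function searchTerms(terms, query, options = {}) {
  const needle = query.toLowerCase();
  return terms.filter((entry) => {
    // Collect every searchable text field, tolerating missing ones.
    const haystack = [entry.id, entry.term, entry.definition, ...(entry.aliases || [])]
      .filter(Boolean)
      .map((value) => String(value).toLowerCase());
    const matchesQuery = haystack.some((value) => value.includes(needle));
    // The category filter is optional; when absent every entry passes.
    const matchesCategory =
      !options.category || (entry.categories || []).includes(options.category);
    return matchesQuery && matchesCategory;
  });
}

const sample = [
  { id: "campus", term: "Campus", definition: "University grounds", categories: ["education"] },
  { id: "quad", term: "Quadrangle", definition: "Open space", aliases: ["campus green"], categories: ["places"] },
  { id: "term", term: "Term", definition: "Teaching period", categories: ["education"] },
];
console.error(searchTerms(sample, "campus").map((e) => e.id));
console.error(searchTerms(sample, "campus", { category: "education" }).map((e) => e.id));
```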
3. Importer (importer.js)
Responsibilities:
- Import glossaries from external formats
- Validate mandatory fields
- Normalise and sort imported data
Key Methods:
- importFromTSV(tsvPath, outputPath) - Import TSV files
- importFromCSV(csvPath, outputPath) - Import CSV files
- importFromJSON(jsonPath, outputPath) - Import JSON files
Mandatory Fields: id, term, definition
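The mandatory-field check and the normalise-and-sort step might look roughly like the sketch below. It works on TSV text that has already been read into a string; `parseTSV` is a hypothetical helper, not part of the documented API, and the error wording only loosely follows the messages listed under Troubleshooting.

```javascript
const MANDATORY_FIELDS = ["id", "term", "definition"];

// Hypothetical sketch: turn TSV text into validated, id-sorted entries.
function parseTSV(text) {
  const lines = text.split(/\r?\n/).filter((line) => line.trim() !== "");
  if (lines.length === 0) throw new Error("Import failed: empty input");

  // The first line names the columns; every mandatory column must be present.
  const headers = lines[0].split("\t").map((h) => h.trim());
  const missing = MANDATORY_FIELDS.filter((field) => !headers.includes(field));
  if (missing.length > 0) {
    throw new Error(`Mandatory field missing: ${missing.join(", ")}`);
  }

  const entries = lines.slice(1).map((line, rowIndex) => {
    const cells = line.split("\t");
    const entry = {};
    headers.forEach((header, col) => {
      const value = (cells[col] || "").trim();
      if (value !== "") entry[header] = value; // skip empty optional cells
    });
    for (const field of MANDATORY_FIELDS) {
      if (!entry[field]) {
        // rowIndex + 2: one for the header line, one for 1-based numbering
        throw new Error(`Mandatory field missing: ${field} (row ${rowIndex + 2})`);
      }
    }
    return entry;
  });

  return entries.sort((a, b) => a.id.localeCompare(b.id));
}

const tsv = "id\tterm\tdefinition\nquad\tQuadrangle\tOpen space\ncampus\tCampus\tUniversity grounds\n";
console.error(parseTSV(tsv).map((e) => e.id));
```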
4. File Manager (file-manager.js)
Responsibilities:
- Read/write files with backup support
- List and filter files
- Schema file management
- YAML compliance fixes
Key Methods:
- readFile(filename) - Read file content
- writeFile(filename, content, options) - Write with backup
- backupFile(filename) - Create timestamped backup
- listFiles(pattern) - List with glob patterns
- writeSchemaFile(filename, content) - Write with validation
- fixYAMLCompliance(filename, schemaFile, fixes) - Apply fixes
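A timestamped backup could be produced along the lines of the sketch below. The naming scheme (`<name>.<timestamp>.bak<ext>`) and the helper `backupPathFor` are assumptions made for illustration; the README does not state how the real `backupFile` names its copies.

```javascript
const fs = require("fs");
const path = require("path");

// Hypothetical helper: build a backup name such as glossary.<timestamp>.bak.yaml
function backupPathFor(filename, now = new Date()) {
  // ISO timestamps contain ":" and ".", which are awkward in file names.
  const stamp = now.toISOString().replace(/[:.]/g, "-");
  const { dir, name, ext } = path.parse(filename);
  return path.join(dir, `${name}.${stamp}.bak${ext}`);
}

// Copy the file before it is overwritten; returns the path of the backup.
function backupFile(filename) {
  const target = backupPathFor(filename);
  fs.copyFileSync(filename, target);
  return target;
}

console.error(backupPathFor("glossary.yaml", new Date("2024-01-02T03:04:05.678Z")));
```

Keeping the original extension at the end means editors still treat the backup as YAML.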
5. Linter (linter.js)
Responsibilities:
- YAML syntax validation
- Schema compliance checking
- Duplicate detection (IDs, terms, aliases)
- Cross-reference validation
- Alias conflict detection
Key Methods:
- lintGlossary(glossaryPath, checks) - Run linting checks
- formatResults(results) - Format output
Available Checks: all, duplicates, references, aliases, required-fields, sorting
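To make the `duplicates` and `aliases` checks concrete, here is a minimal sketch of duplicate detection over parsed entries. `findDuplicates`, the case-insensitive comparison, and the shape of the returned records are assumptions for this example, not the linter's documented output.

```javascript
// Hypothetical sketch: report ids, terms and aliases that appear more than once.
function findDuplicates(terms) {
  const seen = new Map(); // "kind:value" -> id of the first entry that used it
  const problems = [];

  const check = (kind, value, entryId) => {
    const key = `${kind}:${String(value).toLowerCase()}`;
    if (seen.has(key)) {
      problems.push({ kind, value, first: seen.get(key), duplicate: entryId });
    } else {
      seen.set(key, entryId);
    }
  };

  for (const entry of terms) {
    check("id", entry.id, entry.id);
    check("term", entry.term, entry.id);
    for (const alias of entry.aliases || []) check("alias", alias, entry.id);
  }
  return problems;
}

const glossary = [
  { id: "campus", term: "Campus", aliases: ["grounds"] },
  { id: "quad", term: "Quadrangle", aliases: ["Grounds"] },
  { id: "campus", term: "Campus (old)" },
];
console.error(JSON.stringify(findDuplicates(glossary), null, 2));
```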
6. Sorter (sorter.js)
Responsibilities:
- Alphabetical sorting by ID
- Sort validation
- Backup before sorting
Key Methods:
- sortGlossaryFile(glossaryPath, createBackup) - Sort with backup
- sort(filePath, options) - Full control sorting
- isSorted(terms) - Check if already sorted
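The sort validation step can be sketched as a single pass over adjacent pairs. This assumes ordering by `id` using `localeCompare`; the comparison used by the real `isSorted` is not documented here.

```javascript
// Hypothetical sketch: true when every entry's id sorts at or after its predecessor's.
function isSorted(terms) {
  for (let i = 1; i < terms.length; i++) {
    if (terms[i - 1].id.localeCompare(terms[i].id) > 0) return false;
  }
  return true;
}

// Sorting a copy keeps the original array available for comparison or backup.
function sortTerms(terms) {
  return terms.slice().sort((a, b) => a.id.localeCompare(b.id));
}

const unsorted = [{ id: "quad" }, { id: "campus" }];
console.error(isSorted(unsorted), isSorted(sortTerms(unsorted)));
```

Checking first lets a caller skip both the backup and the rewrite when the file is already in order.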
7. Utils (utils.js)
Shared Utilities:
- YAML parsing/formatting
- Field order normalisation
- Backup creation
- Color output
- File I/O helpers
MCP Tools
Entry Management
- add_entry - Add new glossary entry
- update_entry - Update existing entry
- delete_entry - Delete entry (with reference checking)
- search_entries - Search entries by query
File Operations
- read_file - Read file content
- write_file - Write file with backup
- backup_file - Create backup
- list_files - List files with pattern
- write_schema_file - Write schema with validation
- fix_yaml_compliance - Fix YAML compliance issues
Import
- import_from_tsv - Import from TSV file
- import_from_csv - Import from CSV file
- import_from_json - Import from JSON file
Validation
- lint_glossary - Run validation checks
Sorting
- sort_glossary - Sort entries alphabetically
Usage Examples
Add Entry
{
"tool": "add_entry",
"arguments": {
"glossary": "unsw-glossary/glossary-unsw-glossary-data.yaml",
"entry": {
"id": "new-term",
"term": "New Term",
"definition": "Description of the new term",
"categories": ["category1"],
"status": "current"
}
}
}
Search Entries
{
"tool": "search_entries",
"arguments": {
"glossary": "unsw-glossary/glossary-unsw-glossary-data.yaml",
"query": "campus",
"category": "education"
}
}
Import from TSV
{
"tool": "import_from_tsv",
"arguments": {
"tsvPath": "/absolute/path/to/glossary.tsv",
"outputPath": "imported-glossary.yaml"
}
}
Lint Glossary
{
"tool": "lint_glossary",
"arguments": {
"glossary": "unsw-glossary/glossary-unsw-glossary-data.yaml",
"checks": ["duplicates", "references", "aliases"]
}
}
Sort Glossary
{
"tool": "sort_glossary",
"arguments": {
"glossary": "unsw-glossary/glossary-unsw-glossary-data.yaml",
"backup": true
}
}
Benefits of Orchestrator Pattern
- Separation of Concerns - Each component has a single, clear responsibility
- Testability - Components can be tested in isolation
- Reusability - Components can be used in other contexts (CLI, other servers)
- Maintainability - Changes to one component don't affect others
- Extensibility - New functionality can be added without modifying the orchestrator
- Clarity - Logic is organised by domain, not scattered across a monolithic file
Migration from Old Servers
The orchestrator replaces these standalone MCP servers:
- ~~mcp-yaml-glossary-manager.js~~ → Entry management now in entry-manager.js
- ~~mcp-yaml-glossary-sorter.js~~ → Sorting now in sorter.js
- ~~mcp-yaml-glossary-file-manager.js~~ → File ops now in file-manager.js
- ~~lint-glossary.js~~ → Linting now in linter.js
- ~~lint-duplicate-aliases.js~~ → Alias checks now in linter.js
- ~~mcp-yaml-glossary-from-tsv.js~~ → Import now in importer.js
All functionality is preserved and enhanced.
Development
Running Tests
npm test # Run all tests
npm test entry-manager # Test specific component
Adding New Functionality
- Create new component in components/ directory
- Import component in orchestrator
- Add tool definition to ListToolsRequestSchema
- Add handler case in CallToolRequestSchema
- Add tests for component
- Update this README
Component Interface Pattern
All components follow this pattern:
class ComponentName {
constructor(glossaryDir) {
this.glossaryDir = glossaryDir;
}
async mainMethod(args) {
// Implementation
return {
success: true,
message: "Operation completed",
// ... additional data
};
}
}
module.exports = { ComponentName };
Security
Path Sanitisation
Robust directory traversal prevention using path.relative:
sanitisePath(filename) {
const resolved = path.resolve(this.glossaryDir, filename);
const relative = path.relative(this.glossaryDir, resolved);
// Reject paths that escape the base directory
if (relative.startsWith("..") || path.isAbsolute(relative)) {
throw new Error(`Path traversal attempt detected: ${filename}`);
}
return resolved;
}
Prevents attacks like:
- ../../../etc/passwd
- subdir/../../outside/file.yaml
- /etc/passwd
- Edge case: ../glossaries2/file.yaml when base is /var/glossaries
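The listed inputs can be tried against a stand-alone copy of the same check. In this sketch `sanitisePath` takes the base directory as a parameter instead of reading `this.glossaryDir`, purely so it runs outside the class; the logic is otherwise the one shown above.

```javascript
const path = require("path");

// Stand-alone variant of the check above; glossaryDir is passed in explicitly.
function sanitisePath(glossaryDir, filename) {
  const resolved = path.resolve(glossaryDir, filename);
  const relative = path.relative(glossaryDir, resolved);
  // A relative path that climbs upward (or is absolute) has left the base directory.
  if (relative.startsWith("..") || path.isAbsolute(relative)) {
    throw new Error(`Path traversal attempt detected: ${filename}`);
  }
  return resolved;
}

const base = path.resolve("/var/glossaries");
const candidates = [
  "unsw-glossary/file.yaml",
  "../../../etc/passwd",
  "subdir/../../outside/file.yaml",
  "/etc/passwd",
  "../glossaries2/file.yaml",
];
for (const candidate of candidates) {
  try {
    sanitisePath(base, candidate);
    console.error(`allowed:  ${candidate}`);
  } catch (err) {
    console.error(`rejected: ${candidate}`);
  }
}
```

The last candidate is why a plain string-prefix test is not enough: `/var/glossaries2` starts with `/var/glossaries` as text, but `path.relative` still reports it as `../glossaries2/...`.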
Source Path Resolution
Handles absolute and relative import paths:
resolveSourcePath(sourcePath) {
if (path.isAbsolute(sourcePath)) {
return sourcePath; // Allow absolute paths for imports
}
return path.resolve(this.glossaryDir, sourcePath);
}
Error Handling
All components throw descriptive errors that the orchestrator catches and formats:
throw new Error("Descriptive error message");
The orchestrator formats errors consistently:
{
"content": [{ "type": "text", "text": "Error: Descriptive error message" }],
"isError": true
}
Global Error Handlers
process.on("unhandledRejection", (err) => {
console.error("[MCP] Unhandled rejection:", err);
process.exit(1);
});
process.on("uncaughtException", (err) => {
console.error("[MCP] Uncaught exception:", err);
process.exit(1);
});
Configuration
Environment Variables
GLOSSARY_WORKSPACE_ROOT=/custom/workspace # Default: path.resolve(__dirname, "..")
GLOSSARY_DIR=/custom/glossaries # Default: {workspace_root}/glossaries
Constructor Injection
const server = new UnifiedGlossaryMCP({
workspaceRoot: "/test/workspace",
glossaryDir: "/test/glossaries"
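});
Paths in tool arguments are resolved relative to the glossaries directory. Absolute paths are accepted for import sources; for other tools they must still resolve inside the glossaries directory.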
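One way the two configuration sources could be combined is sketched below. The helper `loadConfig` and the precedence (constructor options first, then environment variables, then defaults) are assumptions for illustration; only the variable names and default locations come from this README.

```javascript
const path = require("path");

// Hypothetical sketch: resolve configuration from options, environment, then defaults.
function loadConfig(options = {}, env = process.env) {
  const workspaceRoot =
    options.workspaceRoot ||
    env.GLOSSARY_WORKSPACE_ROOT ||
    path.resolve(__dirname, ".."); // documented default
  const glossaryDir =
    options.glossaryDir ||
    env.GLOSSARY_DIR ||
    path.join(workspaceRoot, "glossaries"); // documented default: {workspace_root}/glossaries
  return { workspaceRoot, glossaryDir };
}

// Passing env explicitly keeps the function free of hidden global state in tests.
console.error(loadConfig({ workspaceRoot: "/test/workspace" }, {}));
console.error(loadConfig({}, { GLOSSARY_WORKSPACE_ROOT: "/ws", GLOSSARY_DIR: "/custom/glossaries" }));
```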
Logging
All logs go to stderr, never stdout, so they cannot corrupt the MCP JSON-RPC stream on stdout.
Troubleshooting
Tools Not Registering
Symptom: MCP tools don't appear in VS Code Copilot
Primary Cause: STDIO protocol corruption from logging to stdout
🚨 CRITICAL RULE: MCP servers communicate via STDIO JSON-RPC. Never write to stdout except for protocol messages.
// ❌ BAD - corrupts MCP protocol
console.log("Debug message");
console.log(result);
// ✅ GOOD - logs to stderr
console.error("Debug message");
console.error("Result:", JSON.stringify(result));
Note: The linter.js and sorter.js components have console.log() for CLI usage only. These are never called during MCP protocol communication.
Test with MCP Inspector
Interactive testing:
npx @modelcontextprotocol/inspector node mcp-glossary-server.js
# Opens browser at http://localhost:6274
CLI testing (automation):
# List available tools
npx @modelcontextprotocol/inspector --cli node mcp-glossary-server.js --method tools/list
# Test add_entry tool
npx @modelcontextprotocol/inspector --cli node mcp-glossary-server.js \
--method tools/call \
--tool-name add_entry \
--tool-arg glossary="test/glossary.yaml" \
--tool-arg 'entry={"id":"test","term":"Test","definition":"Test term"}'
# Test search_entries tool
npx @modelcontextprotocol/inspector --cli node mcp-glossary-server.js \
--method tools/call \
--tool-name search_entries \
--tool-arg glossary="test/glossary.yaml" \
--tool-arg query="campus"
Configuration Issues
Verify .vscode/mcp.json configuration:
{
"mcpServers": {
"yaml-glossary": {
"command": "node",
"args": [
"C:\\Users\\...\\mcp\\yaml-glossary-server\\mcp-glossary-server.js"
],
"env": {
"GLOSSARY_WORKSPACE_ROOT": "C:\\Users\\...\\RAG Glossary",
"GLOSSARY_DIR": "C:\\Users\\...\\RAG Glossary\\glossaries"
}
}
}
}
Configuration requirements:
- ✅ Use absolute paths (not relative)
- ✅ Ensure mcp-glossary-server.js exists at specified path
- ✅ Ensure GLOSSARY_DIR points to existing directory
- ✅ No trailing slashes in paths (Windows can be inconsistent)
Path Sanitisation Errors
Symptom: "Path traversal attempt detected"
Cause: Security feature preventing access outside glossary directory
Solutions:
Use relative paths from glossary directory:
{"glossary": "unsw-glossary/glossary-unsw-glossary-data.yaml"}
// Not: "C:\\full\\path\\..." or "../../../etc/passwd"
Check environment variable GLOSSARY_DIR points to correct base:
// All file operations are relative to GLOSSARY_DIR
// So "unsw-glossary/file.yaml" -> "{GLOSSARY_DIR}/unsw-glossary/file.yaml"
Component-Specific Issues
Entry Manager
"Entry already exists": Use update_entry instead of add_entry
"Entry not found": Check id field matches exactly (case-sensitive)
"Cannot delete - referenced by other entries": Remove references first or use force flag
Linter
"YAML syntax error": Run lint_glossary with all checks to see details
"Duplicate aliases found": Each alias must be unique across all entries
"Broken reference": Referenced entry ID doesn't exist
Importer
"Mandatory field missing": TSV/CSV must have columns: id, term, definition
"Invalid JSON": Check JSON file syntax before importing
"Import failed": Check source file encoding (must be UTF-8)
Sorter
"File already sorted": No action needed
"Backup failed": Check write permissions in glossary directory
YAML File Issues
Symptom: "YAML parse error" or malformed output
Solutions:
Use YAML compliance tool:
{ "tool": "fix_yaml_compliance", "arguments": { "filename": "glossary.yaml", "schemaFile": "glossary-schema.yaml", "fixes": ["format", "order"] } }
Check for common issues:
- Mixed tabs and spaces (use spaces only)
- Inconsistent indentation (2 spaces per level)
- Unquoted special characters (:, #, |)
- Missing newline at end of file
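Some of the common issues above can be spotted with a quick text scan before running the compliance tool. The sketch below is a heuristic only: `findYamlFormattingIssues` is a hypothetical helper, it does not parse YAML, and it does not check for unquoted special characters.

```javascript
// Hypothetical sketch: flag tabs, odd indentation and a missing final newline.
function findYamlFormattingIssues(text) {
  const issues = [];
  text.split("\n").forEach((line, index) => {
    const lineNo = index + 1;
    // Tabs anywhere in the leading whitespace are not valid YAML indentation.
    if (/^\s*\t/.test(line)) {
      issues.push(`line ${lineNo}: tab in indentation`);
    }
    // With 2 spaces per level, the leading space count should be even.
    const indent = line.match(/^ */)[0].length;
    if (line.trim() !== "" && indent % 2 !== 0) {
      issues.push(`line ${lineNo}: indentation is not a multiple of 2 spaces`);
    }
  });
  if (text.length > 0 && !text.endsWith("\n")) {
    issues.push("missing newline at end of file");
  }
  return issues;
}

console.error(findYamlFormattingIssues("terms:\n\t- id: a\n   term: A"));
console.error(findYamlFormattingIssues("terms:\n  - id: a\n    term: A\n"));
```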
Debugging Logs
Enable verbose logging:
// Temporarily add to mcp-glossary-server.js
console.error("[MCP-YAML] Tool called:", toolName);
console.error("[MCP-YAML] Arguments:", JSON.stringify(args, null, 2));
Run server manually to see logs:
node mcp-glossary-server.js 2> server-debug.log
# In another terminal
npx @modelcontextprotocol/inspector --cli node mcp-glossary-server.js --method tools/list
# Check logs
cat server-debug.log
Common Error Messages
| Error | Cause | Solution |
|-------|-------|----------|
| "Invalid JSON-RPC" | stdout pollution | Remove console.log() from orchestrator |
| "Tool not found" | Typo in tool name | Check tool names: add_entry, not addEntry |
| "Path traversal" | Path resolves outside GLOSSARY_DIR | Use relative path from GLOSSARY_DIR |
| "ENOENT" | File not found | Check glossary file exists |
| "Permission denied" | No write access | Check file/directory permissions |
| "Invalid entry" | Missing required field | Ensure id, term, definition present |
Performance Optimisation
Large glossaries (1000+ entries):
Use category filters:
{"tool": "search_entries", "query": "*", "category": "education"}
Disable verbose output:
{"tool": "search_entries", "query": "term", "verbose": false}
Sort glossaries (improves search performance):
{"tool": "sort_glossary", "glossary": "large.yaml", "backup": true}
Validation Checklist
✅ Before reporting issues:
- ☑️ Server file exists at configured path
- ☑️ No stdout logs in orchestrator: grep "console.log" mcp-glossary-server.js returns nothing
- ☑️ Inspector shows tools: npx @modelcontextprotocol/inspector --cli ...
- ☑️ Environment variables set: GLOSSARY_DIR, GLOSSARY_WORKSPACE_ROOT
- ☑️ Paths are absolute in config
- ☑️ Glossary files are valid YAML: Test with lint_glossary
- ☑️ File permissions correct: Can read/write glossary directory
Testing Individual Components
Test components outside MCP (for development):
# Test entry manager
node -e "const {EntryManager} = require('./components/entry-manager'); const em = new EntryManager('./glossaries'); em.addEntry('test.yaml', {id:'test',term:'Test',definition:'Test'}).then(console.log);"
# Test linter
node components/linter.js glossaries/test/glossary.yaml all
# Test sorter
node components/sorter.js glossaries/test/glossary.yaml --dry-run
Note: Component CLI modes use console.log() for output. This is fine - they're not called during MCP communication.
Additional Resources
Audit Logging
Every tool invocation is logged:
logToolInvocation(toolName, args) {
// sanitisedArgs is derived from args before logging; that step is not shown in this excerpt
console.error(`[MCP] Tool invoked: ${toolName}`, JSON.stringify(sanitisedArgs));
}
Testing
npm test # Run all tests
npm run test:watch # Watch mode
npm run test:coverage # With coverage report
Test Coverage Areas
- Configuration loading (environment, defaults)
- Path sanitisation (security)
- Source path resolution
- Response formatting (success, error, custom formats)
- Tool registration and schema validation
- Module exports and testability
Module Guard
if (require.main === module) {
const server = new UnifiedGlossaryMCP();
server.run().catch((error) => {
console.error("[MCP] Fatal error:", error);
process.exit(1);
});
}
Allows importing without side effects for testing.
Version History
- v3.0.0 - Best practice implementation (security, testability, configuration)
- v3.0 - Refactored to orchestrator architecture with component separation
- v2.0 - Unified MCP server with monolithic implementation
- v1.0 - Multiple separate MCP servers
Migration from v2.0
Breaking Changes
Module requires no longer auto-start server
- Before: require() started server immediately
- After: Use require.main === module guard
Path handling changed
- Components now receive absolute paths
- Orchestrator handles all path resolution/sanitisation
Error responses no longer include ANSI color codes
- Plain text only for protocol compatibility
Non-Breaking Enhancements
- Environment variable configuration
- Constructor dependency injection
- Exported TOOL_DEFINITIONS for external use
- Enhanced security with better path validation
