@adarsh6938/mcp-knowledge-graph-semantic

v1.1.3

Published

a year ago

Private MCP Server for semantic knowledge graph with persistent memory

adarsh6938

mcp knowledge-graph semantic-search memory ai private

Personal Knowledge Graph with Semantic Search

A powerful MCP (Model Context Protocol) server that provides persistent memory using a local knowledge graph with semantic search capabilities. Built for personal use with Claude/Cursor to maintain context across conversations.

Features

🧠 Persistent Memory: Store and retrieve information across chat sessions
🔍 Semantic Search: Find relevant information based on meaning, not just keywords
🔗 Knowledge Graph: Entities and relationships for structured knowledge storage
📄 Pagination: Handle large datasets without response size limits
🚀 Local & Private: All data stays on your machine
💰 Cost-Free: Uses open-source Transformers.js models (no API costs)
🛡️ Smart Entity Management: Automatic entity health monitoring and bloat prevention
🤖 Auto-Split Entities: Automatically reorganize oversized entities without interruption
🎯 Configurable: Customize limits and categories for any domain or use case
⏰ Temporal Tracking: Automatic timestamps and session tracking for all activities
🔄 Session Continuity: Smart context retrieval across chat sessions
📊 Activity Detection: Automatic categorization of work types (coding, debugging, planning)
💬 Chat Transitions: 🆕 Save/restore complete session summaries between chat windows
🔗 Context Carryover: 🆕 Perfect continuity when switching chat sessions

Quick Start

Installation

npm install -g @adarsh6938/mcp-knowledge-graph-semantic

Configuration

Add to your .cursor/mcp.json or claude_desktop_config.json:

{
  "mcpServers": {
    "knowledge-graph-semantic": {
      "command": "npx",
      "args": [
        "-y",
        "@adarsh6938/mcp-knowledge-graph-semantic",
        "--memory-path",
        "/path/to/your/memory.jsonl"
      ]
    }
  }
}

Core Concepts

Enhanced Entities with Temporal Tracking

Primary nodes in your knowledge graph with automatic temporal tracking:

{
  "name": "John_Doe",
  "entityType": "person",
  "sessionId": "session_2025_01_19_10_15",
  "observations": [
    {
      "content": "Software engineer specializing in contract testing",
      "timestamp": "2025-01-19T10:15:32.123Z",
      "sessionId": "session_2025_01_19_10_15",
      "activityType": "discussion"
    },
    "Uses TypeScript and Java for development" // Legacy format still supported
  ]
}

Relations with Session Tracking

Connections between entities with temporal awareness:

{
  "from": "John_Doe",
  "to": "Alpha_workspace",
  "relationType": "works_in",
  "sessionId": "session_2025_01_19_10_15"
}

Automatic Session Management

Session Detection: Automatically creates new sessions after 30+ minute gaps
Session IDs: Auto-generated format session_YYYY_MM_DD_HH_MM
Activity Types: Automatically detects coding, planning, debugging, discussion, completion, learning, research
Backward Compatibility: Supports both enhanced observations and legacy string format
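
The session rules above can be sketched roughly as follows. This is a minimal illustration, not the package's actual implementation: the names `formatSessionId`, `resolveSession`, and `SessionState` are hypothetical, and UTC is assumed because the sample entity pairs the timestamp `2025-01-19T10:15:32.123Z` with `session_2025_01_19_10_15`.

```typescript
// Hypothetical helpers for illustration; not the package's real API.
const SESSION_GAP_MS = 30 * 60 * 1000; // 30-minute gap described above

// Zero-pad a number to two digits.
function pad2(n: number): string {
  return n < 10 ? "0" + n : String(n);
}

// Format a Date as session_YYYY_MM_DD_HH_MM (UTC is an assumption).
function formatSessionId(date: Date): string {
  return [
    "session",
    String(date.getUTCFullYear()),
    pad2(date.getUTCMonth() + 1),
    pad2(date.getUTCDate()),
    pad2(date.getUTCHours()),
    pad2(date.getUTCMinutes()),
  ].join("_");
}

interface SessionState {
  sessionId: string;
  lastActivity: Date;
}

// Keep the current session unless 30 or more minutes have passed
// since the last recorded activity; otherwise start a new one.
function resolveSession(previous: SessionState | null, now: Date): SessionState {
  if (previous !== null) {
    const gap = now.getTime() - previous.lastActivity.getTime();
    if (gap < SESSION_GAP_MS) {
      return { sessionId: previous.sessionId, lastActivity: now };
    }
  }
  return { sessionId: formatSessionId(now), lastActivity: now };
}
```

Calling `resolveSession` on every write keeps the gap measured from the most recent activity rather than from the start of the session.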

Available Tools

Core Operations

create_entities - Add new entities to the graph with names, types, and observations
create_relations - Connect entities with typed relationships
add_observations - Add facts/information to existing entities (with smart suggestions)
update_entities - Modify existing entity names, types, or observations
update_relations - Modify existing relationship types or connections

Deletion & Cleanup

delete_entities - Remove entities and their connections
delete_observations - Remove specific facts from entities
delete_relations - Remove connections between entities

Reading & Discovery

read_graph - Get limited view (first 5 entities) for quick overview
read_graph_paginated - Browse large datasets with pagination control
open_nodes - Get specific entities by name with all their relationships
search_nodes - Keyword-based search across entity names and observations
semantic_search - AI-powered semantic search with temporal awareness
hybrid_search - Combined keyword + semantic search for comprehensive results

🆕 Temporal Context & Session Continuity

get_recent_context - Retrieve last 24 hours of activity with recency priority
get_related_work - Find semantically related work from specified time windows
get_historical_overview - Get key entities and session summaries from older timeframes
get_session_continuity_context - Smart context for new chat sessions with confidence scoring
save_current_session_summary - 🆕 Save comprehensive chat session summary for context carryover
get_last_session_summary - 🆕 Retrieve previous session summary when starting new chat

Smart Entity Management

analyze_entity_health - Identify bloated entities that need splitting
split_entity - Break down oversized entities into organized components
configure_entity_management - Customize behavior, limits, and categories
get_entity_management_config - View current configuration settings

System Maintenance

rebuild_semantic_index - Refresh semantic search index for better performance

🔄 Session Continuity & Temporal Tracking

NEW: Automatic session continuity, so your work survives context window exhaustion.

The Problem

When AI context windows fill up, users start new chats and lose work continuity. This system solves that with intelligent temporal tracking and context retrieval.

Automatic Temporal Tracking ⏰

Every observation now includes:

Timestamp: Precise creation time
Session ID: Automatically assigned session identifier
Activity Type: Auto-detected work category
Content: The actual information

interface ObservationData {
  content: string;
  timestamp: string;
  sessionId: string;
  activityType: 'coding' | 'planning' | 'debugging' | 'discussion' | 'completion' | 'learning' | 'research';
}
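
Because legacy string observations are still supported, a reader has to handle both shapes. The sketch below shows one way to normalize them into `ObservationData`. The function `normalizeObservation`, the `fallback` parameter, and the choice of `'discussion'` as the default activity type for legacy strings are assumptions made for illustration.

```typescript
type ActivityType =
  | 'coding' | 'planning' | 'debugging' | 'discussion'
  | 'completion' | 'learning' | 'research';

interface ObservationData {
  content: string;
  timestamp: string;
  sessionId: string;
  activityType: ActivityType;
}

// An observation on disk is either a legacy string or the enhanced object.
type StoredObservation = string | ObservationData;

// Hypothetical normalizer: wraps legacy strings using caller-supplied
// fallback metadata and passes enhanced observations through unchanged.
function normalizeObservation(
  obs: StoredObservation,
  fallback: { timestamp: string; sessionId: string }
): ObservationData {
  if (typeof obs === "string") {
    return {
      content: obs,
      timestamp: fallback.timestamp,
      sessionId: fallback.sessionId,
      activityType: "discussion", // assumed default for legacy entries
    };
  }
  return obs;
}
```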

Session Boundary Detection 🎯

Automatic Sessions: New session created after 30+ minute gaps
No Manual Work: Everything happens automatically
Session Progression: Clear tracking of work evolution
Activity Categorization: Smart detection of work types

Tiered Context Retrieval System 📊

1. Recent Context (`get_recent_context`)

Purpose: Last 24 hours of activity
Prioritization: Recent activities weighted higher
Session Awareness: Groups activities by session
Use Case: Quick continuity for ongoing work

// Example usage
get_recent_context({
  hoursBack: 24,
  maxResults: 20
})

2. Related Work (`get_related_work`)

Purpose: Semantically similar work from time windows
Session Clustering: Groups related activities across sessions
Semantic Matching: Finds conceptually similar work
Use Case: Finding past work relevant to current task

// Example usage
get_related_work({
  query: "temporal tracking implementation",
  daysBack: 7,
  maxResults: 15
})

3. Historical Overview (`get_historical_overview`)

Purpose: Key entities and patterns from older work
Session Summaries: High-level view of past sessions
Entity Importance: Focuses on most significant entities
Use Case: Understanding long-term patterns and key topics

// Example usage
get_historical_overview({
  excludeDays: 7,
  maxResults: 10
})

4. Session Continuity Context (`get_session_continuity_context`)

Purpose: Intelligent automatic context for new sessions
Confidence Scoring: Rates the quality of the available context
Smart Recommendations: Suggests most relevant context
Use Case: Zero-effort session continuation

// Example usage  
get_session_continuity_context({
  query: "continue work on session continuity"
})

Enhanced Semantic Search 🔍

The regular semantic_search now includes temporal awareness:

Combined Scoring: 70% semantic similarity + 30% recency score
Session Tracking: Results show session progression
Activity Types: Each result includes detected activity type
Temporal Ordering: Recent relevant results prioritized
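
The 70/30 weighting can be illustrated with a small sketch. The README does not say how the recency score is computed, so the exponential decay with a 7-day half-life below is purely an assumption, as are the function names.

```typescript
const SEMANTIC_WEIGHT = 0.7; // from the README
const RECENCY_WEIGHT = 0.3;  // from the README

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed recency function: 1.0 for brand-new observations,
// halving every `halfLifeDays` days. The real formula may differ.
function recencyScore(timestamp: string, now: Date, halfLifeDays: number = 7): number {
  const ageMs = Math.max(0, now.getTime() - new Date(timestamp).getTime());
  return Math.pow(0.5, ageMs / MS_PER_DAY / halfLifeDays);
}

// Blend semantic similarity (expected in [0, 1]) with recency.
function combinedScore(similarity: number, timestamp: string, now: Date): number {
  return SEMANTIC_WEIGHT * similarity + RECENCY_WEIGHT * recencyScore(timestamp, now);
}
```

Under these assumptions, an older result outranks a fresh one only when its semantic similarity is high enough to offset the recency term.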

Activity Type Detection 🤖

Automatically categorizes observations:

coding: Implementation, debugging, code reviews
planning: Architecture decisions, task planning, requirements
debugging: Problem solving, error investigation, testing
discussion: Conversations, explanations, knowledge sharing
completion: Finished tasks, achievements, milestones
learning: New concepts, research, skill development
research: Information gathering, analysis, exploration
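
Pattern-based detection of these categories might look like the sketch below. The keyword lists, the matching order, and the fallback to `discussion` are all assumptions for illustration; the package's real patterns are not documented here.

```typescript
type ActivityType =
  | 'coding' | 'planning' | 'debugging' | 'discussion'
  | 'completion' | 'learning' | 'research';

// Hypothetical keyword patterns, checked in order; first match wins.
const ACTIVITY_PATTERNS: Array<[ActivityType, RegExp]> = [
  ["debugging", /\b(bug|error|fix(ed|ing)?|crash(ed|es)?|stack trace)\b/i],
  ["completion", /\b(completed|finished|done|shipped|milestone)\b/i],
  ["planning", /\b(plan(ned|ning)?|roadmap|architecture|requirements?)\b/i],
  ["coding", /\b(implement(ed|ing)?|refactor(ed|ing)?|function|class|code)\b/i],
  ["learning", /\b(learn(ed|ing)?|tutorial|studied)\b/i],
  ["research", /\b(research(ed|ing)?|investigat(e|ed|ing)|explor(e|ed|ing))\b/i],
];

// Return the first matching category, or "discussion" when nothing matches.
function detectActivityType(content: string): ActivityType {
  for (const [activity, pattern] of ACTIVITY_PATTERNS) {
    if (pattern.test(content)) {
      return activity;
    }
  }
  return "discussion";
}
```

Ordering matters in a first-match scheme: here "debugging" is checked before "coding" so that a sentence about fixing an error in a function is not labeled as plain implementation work.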

Session Management

Format: session_YYYY_MM_DD_HH_MM
Auto-Creation: New session after 30+ minute gaps
Preservation: All temporal data preserved during entity operations
Audit Trail: Complete history of all activities

Use Cases for Session Continuity

Context Window Exhaustion: Start new chat with full context
Work Resumption: Pick up where you left off after breaks
Project Evolution: Track how work develops over time
Knowledge Retention: Never lose important context or decisions
Collaboration: Share complete context with team members

🆕 Chat Window Transitions & Session Summaries

NEW: Session summaries maintain context when you switch between chat windows.

The Challenge

When context windows fill up or you need to start fresh chats, you lose conversation continuity. Session summaries solve this by capturing and restoring complete context.

Session Summary System 💬

When User Says "End of Chat"

save_current_session_summary({
  summary: "Comprehensive summary of what happened in this chat session..."
})

Automatically Captures:

All Session Entities: Every entity created/modified in this session
All Session Relations: Connections between entities from this session
Activity Analysis: Work types (coding, planning, debugging, etc.)
Time Boundaries: Auto-detected session start/end times
Rich Metadata: Entity count, relation count, observation count

When User Says "New Chat"

get_last_session_summary()

Instantly Restores:

Complete Previous Context: Full summary of last session
Recent Session History: Last 5 session summaries for broader context
Metadata Overview: Quick stats about previous work
Seamless Continuation: Pick up exactly where you left off

Session Summary Features

Automatic Storage: Saves to *_session_summaries.json alongside the memory file
Smart Cleanup: Keeps last 50 summaries, auto-removes older ones
Zero Configuration: Works with existing session management
Rich Context: Captures all entities, relations, and temporal data
Perfect Carryover: Complete context restoration for new chats
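
The retention rules (keep the last 50 summaries, return the last 5 for context) can be sketched as below. The `SessionSummary` shape and both function names are hypothetical; only the two limits come from this README.

```typescript
// Hypothetical summary record; the real stored shape may carry more metadata.
interface SessionSummary {
  sessionId: string;
  summary: string;
  savedAt: string;
}

const MAX_SUMMARIES = 50;  // "Keeps last 50 summaries"
const RECENT_SUMMARIES = 5; // "Last 5 session summaries"

// Append a summary and drop the oldest entries beyond the retention limit.
function appendSummary(existing: SessionSummary[], next: SessionSummary): SessionSummary[] {
  const all = existing.concat([next]);
  return all.slice(Math.max(0, all.length - MAX_SUMMARIES));
}

// Return the most recent summaries, oldest first.
function recentSummaries(all: SessionSummary[], count: number = RECENT_SUMMARIES): SessionSummary[] {
  return count <= 0 ? [] : all.slice(-count);
}
```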

Use Cases for Session Summaries

Context Window Exhaustion: Save before hitting limits, restore in new chat
Daily Work Transitions: End work sessions, resume next day with full context
Project Handoffs: Share complete session context with team members
Multi-Chat Workflows: Switch between different chat windows seamlessly
Long-Term Projects: Maintain continuity across weeks/months of work

Smart Entity Management

The MCP server now includes intelligent entity management to prevent bloated entities and keep the knowledge graph clean.

Automatic Entity Splitting 🤖

NEW: The system can now automatically split oversized entities to prevent bloat:

Auto-Split: When entities exceed the limit (20 observations), they're automatically reorganized
Smart Categorization: Observations are grouped by type (activities, tools, problems, etc.)
Preserved Relationships: Original connections are maintained through new semantic relations
Seamless Operation: No interruption to your workflow - splitting happens transparently
Configurable: Can be disabled via enableAutoSplit: false if manual control preferred
Temporal Preservation: All timestamps and session data preserved during splits

Example auto-split behavior:

🤖 Auto-splitting entity "John_Doe" (15 + 8 = 23 observations exceed limit)
✅ Created: John_Doe_activities_and_actions (8 observations)
✅ Created: John_Doe_tools_and_technologies (6 observations)  
✅ Created: John_Doe_problem_solving (4 observations)
🔗 Relations: John_Doe → has_activities_and_actions → John_Doe_activities_and_actions
⏰ Temporal data preserved across all split entities
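
The naming convention visible in this log (`<entity>_<category>` for new entities, `has_<category>` for the connecting relation) can be sketched as follows. The function `buildSplit` and its input shape are hypothetical, and the sketch assumes the observations have already been grouped by category.

```typescript
interface Relation {
  from: string;
  to: string;
  relationType: string;
}

interface SplitEntity {
  name: string;
  entityType: string;
  observations: string[];
}

// Hypothetical helper: given observations already grouped by category,
// derive the split entities and the relations linking them to the parent.
function buildSplit(
  parentName: string,
  parentType: string,
  groups: Record<string, string[]>
): { entities: SplitEntity[]; relations: Relation[] } {
  const entities: SplitEntity[] = [];
  const relations: Relation[] = [];
  for (const category of Object.keys(groups)) {
    const observations = groups[category] || [];
    if (observations.length === 0) {
      continue; // skip empty categories instead of creating hollow entities
    }
    const childName = parentName + "_" + category;
    entities.push({ name: childName, entityType: parentType, observations });
    relations.push({ from: parentName, to: childName, relationType: "has_" + category });
  }
  return { entities, relations };
}
```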

Default Protection

The system automatically:

Warns when entities exceed 12 observations
Auto-splits entities exceeding 20 observations (when enabled)
Suggests splitting entities with mixed content types
Categorizes observations into 7 universal types
Maintains semantic relationships during reorganization
Preserves temporal tracking data during all operations
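
The two thresholds above can be read as a simple three-state check. In the sketch below, the option names `warningThreshold`, `maxObservationsPerEntity`, and `enableAutoSplit` appear in this README, but mapping 12 and 20 onto them, the `HealthStatus` type, and the function itself are assumptions.

```typescript
interface EntityLimits {
  warningThreshold: number;
  maxObservationsPerEntity: number;
  enableAutoSplit: boolean;
}

// Assumed defaults, based on the 12 / 20 figures quoted above.
const DEFAULT_LIMITS: EntityLimits = {
  warningThreshold: 12,
  maxObservationsPerEntity: 20,
  enableAutoSplit: true,
};

type HealthStatus = "ok" | "warn" | "split";

// Classify an entity by observation count. "Exceed" is read as strictly
// greater than the limit. With auto-split disabled, oversized entities
// only produce a warning.
function checkEntityHealth(
  observationCount: number,
  limits: EntityLimits = DEFAULT_LIMITS
): HealthStatus {
  if (observationCount > limits.maxObservationsPerEntity) {
    return limits.enableAutoSplit ? "split" : "warn";
  }
  if (observationCount > limits.warningThreshold) {
    return "warn";
  }
  return "ok";
}
```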

Universal Categories

The system recognizes these domain-agnostic patterns:

activities_and_actions - Things people do (working on, developing, managing, learning)
tools_and_technologies - Software, frameworks, systems they use
problem_solving - Issues encountered and solutions found
knowledge_and_learning - Things learned, studied, or understood
projects_and_goals - Projects, objectives, milestones
relationships_and_interactions - People interactions, meetings, collaborations
processes_and_workflows - Procedures, methodologies, best practices

Configuration Options

Basic limits: Adjust maxObservationsPerEntity, warningThreshold, optimalObservationsCount
Smart suggestions: Enable/disable automatic categorization suggestions
Custom categories: Define domain-specific patterns for specialized use cases
Domain templates: Pre-configured setups for medical, software development, research, etc.

Key Features

Health Analysis: analyze_entity_health identifies bloated entities with category breakdowns
Smart Splitting: split_entity reorganizes oversized entities into logical components
Automatic Relationships: Creates proper semantic connections between split entities
Intelligent Warnings: Provides actionable suggestions when adding observations
Universal Design: Works across any domain without configuration
Temporal Awareness: All operations preserve session and timestamp data

Search & Discovery

Semantic Search: Find information by meaning with temporal awareness
Keyword Search: Traditional text-based search across entities and observations
Hybrid Search: Combines semantic and keyword search for comprehensive results
Pagination: Handle large knowledge graphs efficiently
Temporal Context: Specialized tools for time-aware context retrieval
Session Continuity: Smart context for seamless chat transitions

Technical Details

Storage: JSONL format for entities/relations with temporal extensions
Embeddings: Transformers.js with all-MiniLM-L6-v2 model
Search: Cosine similarity with temporal scoring and configurable thresholds
Memory: Automatic indexing when entities are created/modified
Sessions: Automatic session boundary detection with 30+ minute gaps
Temporal Scoring: Combined semantic similarity (70%) + recency score (30%)
Activity Detection: Pattern-based automatic categorization
Backward Compatibility: Supports both enhanced and legacy observation formats
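
For readers unfamiliar with the search step, cosine similarity over embedding vectors with a cutoff threshold can be sketched like this. The `IndexedItem` shape, the function names, and the default threshold of 0.3 are assumptions; the server's actual threshold and index layout are not documented here.

```typescript
// Cosine similarity between two equal-length vectors.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom; // zero vectors match nothing
}

// Hypothetical index entry: an entity id plus its embedding.
interface IndexedItem {
  id: string;
  embedding: number[];
}

// Score every item against the query, drop results below the
// threshold, and return the rest sorted best-first.
function rankBySimilarity(
  query: number[],
  items: IndexedItem[],
  threshold: number = 0.3
): Array<{ id: string; score: number }> {
  return items
    .map((item) => ({ id: item.id, score: cosineSimilarity(query, item.embedding) }))
    .filter((result) => result.score >= threshold)
    .sort((x, y) => y.score - x.score);
}
```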

Use Cases

Personal Assistant: Remember preferences, goals, and context across sessions
Project Memory: Track technical decisions and implementations over time
Learning: Store and connect knowledge across domains with temporal context
Development: Maintain context about codebases and architectures
Session Continuity: Seamlessly resume work after context window exhaustion
Work Evolution: Track how projects and understanding develop over time
Knowledge Retention: Never lose important context or decisions
Collaboration: Share complete temporal context with team members

Configuration Options

Memory Path

"args": ["--memory-path", "/Users/you/projects/memory.jsonl"]

Multiple Projects

Use different memory files for different contexts:

// Work project
"--memory-path", "/Users/you/work/work-memory.jsonl"

// Personal project  
"--memory-path", "/Users/you/personal/personal-memory.jsonl"

System Prompt Recommendation

Add this to your Claude/Cursor configuration:

Follow these steps for each interaction:

1. User Identification:
   - You should assume that you are interacting with default_user
   - If you have not identified default_user, proactively try to do so.

2. Memory Retrieval:
   - Always begin your chat by saying only "Remembering..." and retrieve relevant information from your knowledge graph
   - Always refer to your knowledge graph as your "memory"
   - **🆕 CHAT TRANSITIONS:**
     - **When user says "new chat"**: Use get_last_session_summary to restore previous session context
     - **When user says "end of chat"**: Use save_current_session_summary to preserve session for next chat
   - **CHOOSE ONE PRIMARY TOOL for normal memory retrieval (do not call multiple):**
     - **BEST CHOICE**: Use get_session_continuity_context for comprehensive automatic context with confidence scoring
     - **OR** get_recent_context if you only need last 24 hours
     - **OR** semantic_search if you have a specific query
     - **OR** hybrid_search for keyword + semantic combined search
   - **Additional tools only if needed:**
     - get_related_work, get_historical_overview, search_nodes, open_nodes, read_graph_paginated
   - Available tools: create_entities, create_relations, add_observations, update_entities, update_relations, delete_entities, delete_observations, delete_relations, read_graph, read_graph_paginated, search_nodes, semantic_search, hybrid_search, open_nodes, rebuild_semantic_index, analyze_entity_health, split_entity, configure_entity_management, get_entity_management_config, get_recent_context, get_related_work, get_historical_overview, get_session_continuity_context, save_current_session_summary, get_last_session_summary

3. Memory Health:
   - Periodically use analyze_entity_health to check for bloated entities and get category breakdowns
   - If entities exceed recommended sizes (12+ observations), suggest using split_entity to reorganize them
   - Use configure_entity_management to customize limits and categories for domain-specific use cases
   - Use get_entity_management_config to check current settings
   - Monitor for smart suggestions when using add_observations to prevent entity bloat

4. Memory:
   - While conversing with the user, be attentive to any new information that falls into these categories:
     a) Basic Identity (age, gender, location, job title, education level, etc.)
     b) Behaviors (interests, habits, etc.)
     c) Preferences (communication style, preferred language, etc.)
     d) Goals (goals, targets, aspirations, etc.)
     e) Relationships (personal and professional relationships up to 3 degrees of separation)
     f) Technical knowledge (implementations, decisions, learnings)

5. Memory Update:
   - If any new information was gathered during the interaction, update your memory as follows:
     a) Create entities for recurring organizations, people, and significant events
     b) Connect them to the current entities using relations
     c) Store facts about them as observations (automatically gets timestamps and session tracking)
     d) Update entities and relations as information evolves
     e) Delete outdated entities, relations, or observations when needed
     f) Pay attention to smart suggestions when adding observations to prevent entity bloat
     g) When warnings appear about entity size, consider splitting into domain-specific entities
     h) Use the universal categorization system to organize information appropriately
     i) All observations automatically include temporal tracking and activity type detection
