@qty/memfs

v2.4.18

Published

12 days ago

Knowledge graph management system with BM25 + fuzzy search, inspired by filesystem concepts

qtgcy

knowledge-graph mcp bm25 fuzzy-search memory notes

🧠 MemFS

A knowledge graph management system based on MCP server-memory, deeply refactored with filesystem-inspired design

💡 Acknowledgments: MemFS is inspired by the original @modelcontextprotocol/server-memory, though heavily reimagined.

🎯 One-Line Description

Bringing modern filesystem concepts to knowledge graph management, combined with BM25 + fuzzy search for intelligent retrieval, designed for LLM-assisted humanities and social sciences research.

📖 Chinese documentation: docs/README_zh-CN.md

🚀 Quick Start

Prerequisites

# Check Node.js version
node --version  # Must be v22.0.0 or higher

Installation & Run

Quickest way (npx):

npx @qty/memfs

Or clone and run:

# 1. Clone the project (or download it) and enter the directory
git clone https://github.com/Qtgcy08/MemFS.git
cd MemFS

# 2. Install dependencies
npm install

# 3. Run server
node index.js

# Or specify a custom storage directory
MEMORY_DIR=~/my-knowledge node index.js

# Enable Git auto-sync (auto-commits on every save)
GITAUTOCOMMIT=true node index.js

Configure as MCP Server

OpenCode format:

{
  "mcpServers": {
    "memory": {
      "type": "local",
      "command": ["npx", "-y", "@qty/memfs"],
      "enabled": true
    }
  }
}

VSCode / ClaudeCode / Cherry Studio / AstrBot format:

{
  "mcpServers": {
    "memory": {
      "command": "npx",
      "args": ["-y", "@qty/memfs"],
      "enabled": true
    }
  }
}

📰 What's New in 2.4.12

Git Auto-Commit

New GITAUTOCOMMIT=true environment variable — every operation auto-commits to Git:

auto-commit:[createEntity "Weber"] at [utc:2026-03-28T12:34:56.789Z] [tz:Asia/Shanghai]

searchNode Refactoring

Response simplified to { entities, observations, relations }
Removed searchMode field
observations now includes updatedAt
Related entities count limited by limit
Unified tokenization: 2~(n-1) gram, no language detection
2-gram penalty ×0.5: short tokens no longer over-match
Field weights centralized in DEFAULT_FIELD_WEIGHTS
definitionSource added to search index
Relation type matching boost
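
The tokenization notes above can be read in more than one way. The following is a minimal sketch of one reading, assuming that "2~(n-1) gram" means every character n-gram of size 2 through n-1 for an input of n characters, and that the ×0.5 penalty is a per-token weight. `ngramTokens` is a hypothetical helper, and returning very short inputs whole is also an assumption, not documented MemFS behavior.

```javascript
// Sketch only: one possible reading of "2~(n-1) gram" tokenization.
// ngramTokens is a hypothetical name, not a MemFS export.
function ngramTokens(text) {
  // Array.from splits by code point, so CJK text needs no language detection.
  const chars = Array.from(text.toLowerCase());
  const n = chars.length;
  if (n === 0) return [];
  // Assumption: inputs too short to yield any 2..(n-1) gram are kept whole.
  if (n < 3) return [{ gram: chars.join(""), weight: 1 }];

  const tokens = [];
  for (let size = 2; size <= n - 1; size++) {
    for (let start = 0; start + size <= n; start++) {
      tokens.push({
        gram: chars.slice(start, start + size).join(""),
        weight: size === 2 ? 0.5 : 1, // 2-grams count half
      });
    }
  }
  return tokens;
}

console.log(ngramTokens("Weber").map((t) => `${t.gram}:${t.weight}`).join(" "));
```

Under this reading, halving the 2-gram weight keeps short fragments such as "er" from dominating the score of a longer match.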

Operation Return Refactoring

Write operation messages simplified (less LLM token consumption)
Delete operations return full data for potential undo
unlinkObservation now uses ID-based input (renamed from deleteObservation)

Auxiliary Tools

New getConsole tool to retrieve buffered logs

📖 Core Concepts

| Concept | Description | Analogy |
|---------|-------------|---------|
| Entity | Nodes in the knowledge graph | File |
| Observation | Properties/descriptions of entities | inode |
| Relation | Connections between entities | Soft link |
| Reference | Pointers from entities to observations | Hard link |

💡 Core Design Philosophy

1. Transformer-Ready: On-Demand Retrieval

flowchart TD
    LLM["LLM"]
    ATT["Attention Mechanism"]
    MCP["MCP Protocol"]
    MEM["MemFS\nOn-demand structured data"]

    LLM --> ATT --> MCP --> MEM
    MEM -.-> |Returns results| LLM

Core principle: Don't stuff all knowledge into context—retrieve on demand.

2. Lightweight Design

| Dimension | Traditional Solution | MemFS |
|-----------|---------------------|-------|
| Deployment | Database + Vector Engine | Pure Node.js |
| Resources | GPU recommended, high memory | CPU only |
| Explainability | Black-box models | BM25 transparent & controllable |

3. Local JSONL Storage

{"type":"entity","name":"Weber","entityType":"person","definition":"German sociologist","observationIds":[1,2]}
{"type":"observation","id":1,"content":"Author of 'The Protestant Ethic'","createdAt":{"utc":"2026-02-08T13:53:07Z","timezone":"Asia/Shanghai"}}
{"type":"relation","from":"Weber","to":"Durkheim","relationType":"contemporary"}

Advantages: Editable with any text editor, Git-version-controllable, printable.
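
Because each line is an independent JSON record, the file can be read with a few lines of code. The sketch below assumes only the record shapes shown above; `parseGraph` and `observationsOf` are hypothetical helpers for illustration and are not part of the MemFS API.

```javascript
// Sketch only: group JSONL records by their "type" field.
function parseGraph(jsonlText) {
  const graph = { entities: [], observations: new Map(), relations: [] };
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue; // tolerate blank lines
    const record = JSON.parse(line);
    if (record.type === "entity") graph.entities.push(record);
    else if (record.type === "observation") graph.observations.set(record.id, record);
    else if (record.type === "relation") graph.relations.push(record);
  }
  return graph;
}

// Resolve an entity's observationIds to observation content.
function observationsOf(graph, entityName) {
  const entity = graph.entities.find((e) => e.name === entityName);
  if (!entity) return [];
  return entity.observationIds
    .map((id) => graph.observations.get(id))
    .filter(Boolean) // skip IDs that have no stored record
    .map((obs) => obs.content);
}

const sample = [
  '{"type":"entity","name":"Weber","entityType":"person","definition":"German sociologist","observationIds":[1,2]}',
  '{"type":"observation","id":1,"content":"Author of \'The Protestant Ethic\'","createdAt":{"utc":"2026-02-08T13:53:07Z","timezone":"Asia/Shanghai"}}',
  '{"type":"relation","from":"Weber","to":"Durkheim","relationType":"contemporary"}',
].join("\n");

const graph = parseGraph(sample);
console.log(observationsOf(graph, "Weber"));
```

The sample omits observation 2 on purpose, so the lookup returns only the one observation that is actually stored.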

4. Humanities & Social Sciences Customization

| Requirement Type | Traditional | MemFS |
|-----------------|-------------|-------|
| Knowledge units | Functions/Classes | Concepts/People/Documents |
| Relationship types | Function calls | Influence/Reference/Comparison |
| Update frequency | High-frequency | Low-frequency add, high-frequency reference |

📦 Complete API Tools (17 total)

Create

| Tool | Function | Example |
|------|----------|---------|
| createEntity | Batch create entities (with observations) | Add concepts, people, documents |
| createRelation | Create relations between entities | Mark references, comparisons, influences |
| addObservation | Add observations to existing entities | Supplement reading notes |

Read

| Tool | Function | Example |
|------|----------|---------|
| searchNode | BM25 + Fuzzy hybrid search | Intelligent knowledge search |
| readNode | Read complete entity information | Get detailed attributes and relations |
| readObservation | Batch read observations by ID | Verify specific observations |
| listNode | List all entity overviews | Browse knowledge structure |
| listGraph | Read entire knowledge graph | Batch export, migration |
| howWork | Get recommended workflow guidance | Learn how to use the system |

Update

| Tool | Function | Example |
|------|----------|---------|
| updateNode | Update entities and observations (Copy-on-Write) | Modify definitions, update notes |
| updateObservation | Batch update observation content | Batch correct information |

Delete

| Tool | Function | Example |
|------|----------|---------|
| deleteEntity | Delete entities and relations | Remove outdated entries |
| deleteRelation | Delete specific relations | Unlink entities |
| unlinkObservation | Unlink observations (preserve observation) | Remove references |
| getOrphanObservation | Find orphan observations | Discover invalid data |
| recycleObservation | Permanently delete observations | Clean up unused data |
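
Unlinking and orphan detection work together: unlinking removes a reference but keeps the observation record, which can leave a record that no entity points to. The sketch below illustrates that idea on plain objects. `unlink` and `findOrphanObservations` are hypothetical helpers, and the real tools' inputs and return values may differ.

```javascript
// Sketch only: remove one reference, keep the observation record itself.
function unlink(entity, observationId) {
  entity.observationIds = entity.observationIds.filter((id) => id !== observationId);
}

// Sketch only: an observation is an orphan when no entity references its ID.
function findOrphanObservations(entities, observations) {
  const referenced = new Set();
  for (const entity of entities) {
    for (const id of entity.observationIds) referenced.add(id);
  }
  return observations.filter((obs) => !referenced.has(obs.id));
}

const entities = [
  { name: "Weber", observationIds: [1, 2] },
  { name: "Durkheim", observationIds: [2] },
];
const observations = [
  { id: 1, content: "Author of 'The Protestant Ethic'" },
  { id: 2, content: "Founding figure of sociology" },
];

unlink(entities[0], 1); // observation 1 now has no referrer
const orphans = findOrphanObservations(entities, observations);
console.log(orphans.map((obs) => obs.id)); // [ 1 ]
```

Observation 2 is still referenced by both entities, so only observation 1 would be a candidate for permanent deletion.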

Auxiliary

| Tool | Function | Example |
|------|----------|---------|
| getConsole | Get console messages and Git commit logs | View auto-commit history |

🔍 Hybrid Search (searchNode)

Core Features

| Feature | Description |
|---------|-------------|
| BM25 | Considers term frequency and document frequency |
| Fuzzy Search | Tolerates typos, supports approximate matching |
| Query Tokenization | Tokenize → Search individually → Aggregate → Deduplicate |
| Weighted Fusion | BM25 0.7 + Fuzzy 0.3, combined ranking |
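
To make the BM25 row concrete, here is a minimal sketch of the textbook Okapi BM25 formula. It is not the MemFS implementation: the word-level `tokenize` stands in for MemFS's n-gram tokenizer, and `k1 = 1.2` and `b = 0.75` are common defaults assumed here rather than documented values.

```javascript
// Sketch only: split on anything that is not a letter or digit.
function tokenize(text) {
  return text.toLowerCase().split(/[^\p{L}\p{N}]+/u).filter(Boolean);
}

// Textbook BM25: rewards term frequency, discounts common terms and long documents.
function bm25Scores(query, docs, k1 = 1.2, b = 0.75) {
  const tokenized = docs.map(tokenize);
  const avgLen = tokenized.reduce((sum, d) => sum + d.length, 0) / tokenized.length;
  const queryTerms = [...new Set(tokenize(query))];

  return tokenized.map((doc) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = doc.filter((t) => t === term).length; // term frequency in this doc
      if (tf === 0) continue;
      const df = tokenized.filter((d) => d.includes(term)).length; // docs containing term
      const idf = Math.log(1 + (tokenized.length - df + 0.5) / (df + 0.5));
      score += (idf * tf * (k1 + 1)) / (tf + k1 * (1 - b + (b * doc.length) / avgLen));
    }
    return score;
  });
}

const docs = ["German sociologist", "French sociologist", "Author of The Protestant Ethic"];
const scores = bm25Scores("German sociologist", docs);
console.log(scores.map((s) => s.toFixed(3)));
```

The first document matches both query terms and ranks highest, the second matches only the more common term, and the third scores zero.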

Parameters

// Default hybrid search
await searchNode("functionalism");  // BM25 + Fuzzy

// Traditional keyword search
await searchNode("functionalism", { basicFetch: true });

// Custom parameters
await searchNode("sociology", {
    limit: 15,          // Return count
    bm25Weight: 0.7,    // BM25 weight
    fuzzyWeight: 0.3,   // Fuzzy search weight
    minScore: 0.01      // Minimum relevance threshold
});
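
How the two scores might be combined can be sketched as follows. This is an illustration under stated assumptions, not the MemFS implementation: fuzzy similarity is taken to be normalized Levenshtein distance, BM25 scores are scaled by the best score before mixing, and the default `limit` of 10 is a guess. `levenshtein`, `fuzzySimilarity`, and `fuseScores` are hypothetical names.

```javascript
// Classic single-row Levenshtein edit distance.
function levenshtein(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0];
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// 1 means identical, 0 means nothing in common.
function fuzzySimilarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - levenshtein(a.toLowerCase(), b.toLowerCase()) / longest;
}

// Weighted sum of normalized BM25 and fuzzy scores, then threshold, sort, and cut.
function fuseScores(candidates, { limit = 10, bm25Weight = 0.7, fuzzyWeight = 0.3, minScore = 0.01 } = {}) {
  const maxBm25 = Math.max(0, ...candidates.map((c) => c.bm25));
  return candidates
    .map((c) => ({
      name: c.name,
      score: bm25Weight * (maxBm25 > 0 ? c.bm25 / maxBm25 : 0) + fuzzyWeight * c.fuzzy,
    }))
    .filter((c) => c.score >= minScore)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit);
}

const query = "functionalsim"; // typo: exact keyword matching finds nothing
const results = fuseScores(
  ["functionalism", "Weber"].map((name) => ({ name, bm25: 0, fuzzy: fuzzySimilarity(query, name) }))
);
console.log(results.map((r) => r.name)); // [ 'functionalism' ]
```

With a misspelled query the BM25 part contributes nothing, so the fuzzy part alone surfaces "functionalism", while "Weber" shares no characters with the query and falls below `minScore`.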

Field Weights

| Field | Weight | Description |
|-------|--------|-------------|
| name | 5.0 | Highest - entity name |
| entityType | 2.5 | Entity type |
| definition | 2.5 | Definition description |
| definitionSource | 1.5 | Definition source |
| observation | 1.0 | Observation content |
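
One way to picture the effect of these weights is a weighted term count, where a hit in `name` counts five times as much as a hit in an observation. The weights below are copied from the table; how MemFS actually feeds them into BM25 is not documented here, so `weightedTermFrequency` is a hypothetical illustration only.

```javascript
// Values from the table above; the constant name here is illustrative.
const FIELD_WEIGHTS = {
  name: 5.0,
  entityType: 2.5,
  definition: 2.5,
  definitionSource: 1.5,
  observation: 1.0,
};

// Sketch only: count substring hits per field and scale each by its field weight.
function weightedTermFrequency(term, fields, weights = FIELD_WEIGHTS) {
  const needle = term.toLowerCase();
  if (needle === "") return 0;
  let total = 0;
  for (const [field, weight] of Object.entries(weights)) {
    const texts = [].concat(fields[field] ?? []); // accept a string or an array of strings
    for (const text of texts) {
      const hits = text.toLowerCase().split(needle).length - 1;
      total += weight * hits;
    }
  }
  return total;
}

const weber = {
  name: "Weber",
  entityType: "person",
  definition: "German sociologist",
  definitionSource: "Wikipedia",
  observation: ["Weber wrote 'The Protestant Ethic'"],
};

console.log(weightedTermFrequency("weber", weber)); // 6 (5.0 from name + 1.0 from observation)
console.log(weightedTermFrequency("sociologist", weber)); // 2.5 (definition only)
```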

🔧 Filesystem-Inspired Design

Architecture Analogy

| Filesystem Concept | MemFS Implementation | Solves |
|-------------------|---------------------|--------|
| Inode Table | Centralized observation storage | Data redundancy |
| Hard Links | Multiple entities reference same observation | Shared reuse |
| Soft Links | Entity relations | Flexible associations |
| Copy-on-Write | Copy-on-Write updates | Concurrency safety |
| Orphan Detection | Orphan observation cleanup | Resource recovery |

Observation Sharing

// Create two entities sharing the same observation
await createEntity([
  { name: "Zhang San", observations: ["Programmer"] },
  { name: "Li Si", observations: ["Programmer"] }
]);

// Under the hood: same observation ID is reused
{
  entities: [
    { name: "Zhang San", observationIds: [1] },
    { name: "Li Si", observationIds: [1] }
  ],
  observations: [
    { id: 1, content: "Programmer" }
  ]
}
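
The reuse shown above can be sketched with a small store that hands out one ID per distinct content string, much like an inode table. This is an illustration, not MemFS source: `ObservationStore` and `createEntities` are hypothetical names, and deduplicating by exact string match is an assumption.

```javascript
// Sketch only: central observation table with content-based deduplication.
class ObservationStore {
  constructor() {
    this.byId = new Map(); // id -> { id, content }
    this.idByContent = new Map(); // content -> id
    this.nextId = 1;
  }

  // Return the existing ID for this content, or allocate a new one.
  intern(content) {
    const existing = this.idByContent.get(content);
    if (existing !== undefined) return existing;
    const id = this.nextId++;
    this.byId.set(id, { id, content });
    this.idByContent.set(content, id);
    return id;
  }
}

// Entities store only IDs, so identical observations are shared like hard links.
function createEntities(store, inputs) {
  return inputs.map(({ name, observations = [] }) => ({
    name,
    observationIds: observations.map((content) => store.intern(content)),
  }));
}

const store = new ObservationStore();
const created = createEntities(store, [
  { name: "Zhang San", observations: ["Programmer"] },
  { name: "Li Si", observations: ["Programmer"] },
]);
console.log(created.map((e) => e.observationIds), store.byId.size);
```

Both entities end up with `observationIds: [1]`, and the store holds a single record.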

Copy-on-Write

// Update a shared observation
await updateNode({
  entityName: "Zhang San",
  observationUpdates: [
    { oldContent: "Programmer", newContent: "Senior Programmer" }
  ]
});

// Result: Zhang San gets new observation, Li Si keeps original
{
  observations: [
    { id: 1, content: "Programmer" },      // Li Si uses
    { id: 2, content: "Senior Programmer" } // Zhang San's new observation
  ]
}
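
The copy-on-write step can be sketched as: find the observation through the entity being updated, and if another entity also references it, write a new record and repoint only the updating entity. `updateObservationCow` is a hypothetical helper, and updating in place when the observation is not shared is an assumption about behavior the text above does not specify.

```javascript
// Sketch only: copy-on-write update of one observation for one entity.
function updateObservationCow(graph, entityName, oldContent, newContent) {
  const entity = graph.entities.find((e) => e.name === entityName);
  if (!entity) throw new Error(`Unknown entity: ${entityName}`);

  const old = graph.observations.find(
    (o) => o.content === oldContent && entity.observationIds.includes(o.id)
  );
  if (!old) throw new Error(`"${entityName}" has no observation "${oldContent}"`);

  const shared = graph.entities.some((e) => e !== entity && e.observationIds.includes(old.id));
  if (!shared) {
    old.content = newContent; // sole referrer: safe to edit in place (assumption)
    return old.id;
  }

  // Shared: copy instead of overwrite, then repoint only this entity.
  const id = Math.max(...graph.observations.map((o) => o.id)) + 1;
  graph.observations.push({ id, content: newContent });
  entity.observationIds = entity.observationIds.map((x) => (x === old.id ? id : x));
  return id;
}

const cowGraph = {
  entities: [
    { name: "Zhang San", observationIds: [1] },
    { name: "Li Si", observationIds: [1] },
  ],
  observations: [{ id: 1, content: "Programmer" }],
};

updateObservationCow(cowGraph, "Zhang San", "Programmer", "Senior Programmer");
console.log(JSON.stringify(cowGraph.entities));
```

After the call, Zhang San points at observation 2 while Li Si still points at the untouched observation 1.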

📁 Data Format

JSONL Storage

{"type":"entity","name":"Weber","entityType":"person","definition":"German sociologist","definitionSource":"Wikipedia","observationIds":[1,2]}
{"type":"entity","name":"Durkheim","entityType":"person","definition":"French sociologist","definitionSource":"Wikipedia","observationIds":[3]}
{"type":"observation","id":1,"content":"Author of 'The Protestant Ethic'","createdAt":{"utc":"2026-02-08T13:53:07Z","timezone":"Asia/Shanghai"}}
{"type":"observation","id":2,"content":"Contemporary with Durkheim and Marx","createdAt":{"utc":"2026-02-08T14:00:00Z","timezone":"Asia/Shanghai"},"updatedAt":{"utc":"2026-02-09T10:30:00Z","timezone":"Asia/Shanghai"}}
{"type":"observation","id":3,"content":"Author of 'The Division of Labor in Society'","createdAt":{"utc":"2026-02-08T15:00:00Z","timezone":"Asia/Shanghai"}}
{"type":"relation","from":"Weber","to":"Durkheim","relationType":"contemporary"}

Storage Locations

| Method | Path |
|--------|------|
| Default | ~/.memory/memory.jsonl |
| Custom directory | MEMORY_DIR=/path/to/data |

Environment Variables

| Variable | Description | Default | Status |
|----------|-------------|---------|--------|
| MEMORY_DIR | Data storage directory | ~/.memory | ✅ Recommended |
| MEMORY_FILE_PATH | Full file path (deprecated) | ~/.memory/memory.jsonl | ⚠️ Deprecated |
| GITAUTOCOMMIT | Enable Git auto-commit on every save | false | ✅ Recommended |
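
A minimal sketch of how these variables could be resolved into a storage path follows. Several points are assumptions rather than documented behavior: that `MEMORY_DIR` takes precedence over the deprecated `MEMORY_FILE_PATH`, that the file inside a custom directory is also named `memory.jsonl`, and that paths use POSIX separators. `resolveStorage` is a hypothetical helper that takes the environment and home directory as arguments so it can be tried without touching the real environment.

```javascript
// Sketch only: derive the storage file path and Git flag from environment variables.
function resolveStorage(env, homeDir) {
  // Expand a leading "~" the way a shell would.
  const expandHome = (p) => (p === "~" || p.startsWith("~/") ? homeDir + p.slice(1) : p);
  const gitAutoCommit = env.GITAUTOCOMMIT === "true";

  if (env.MEMORY_DIR) {
    const dir = expandHome(env.MEMORY_DIR).replace(/\/+$/, ""); // drop trailing slashes
    return { filePath: `${dir}/memory.jsonl`, gitAutoCommit };
  }
  if (env.MEMORY_FILE_PATH) {
    // Deprecated variable: honored only when MEMORY_DIR is absent (assumption).
    return { filePath: expandHome(env.MEMORY_FILE_PATH), gitAutoCommit };
  }
  return { filePath: `${homeDir}/.memory/memory.jsonl`, gitAutoCommit };
}

console.log(resolveStorage({}, "/home/user"));
console.log(resolveStorage({ MEMORY_DIR: "~/my-knowledge", GITAUTOCOMMIT: "true" }, "/home/user"));
```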

🔄 Git Auto-Sync

When enabled, every save to the memory file is automatically committed to Git for version control.

# Enable Git auto-commit
GITAUTOCOMMIT=true node index.js

# Or in MCP config
{
  "environment": {
    "MEMORY_DIR": "/path/to/data",
    "GITAUTOCOMMIT": "true"
  }
}

Commit Format

auto-commit:[operationContext] at [utc:YYYY-MM-DDTHH:mm:ss.SSSZ] [tz:Asia/Shanghai]

Example:

auto-commit:[createEntity "Weber"] at [utc:2026-03-22T09:15:30.123Z] [tz:Asia/Shanghai]
auto-commit:[updateNode "Durkheim"] at [utc:2026-03-22T09:16:45.456Z] [tz:Asia/Shanghai]
auto-commit:[deleteRelation "Weber"→"Durkheim"] at [utc:2026-03-22T09:17:00.789Z] [tz:Asia/Shanghai]
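
A message in the `auto-commit:` format above can be assembled from an operation description, a timestamp, and a time zone name. The sketch below is illustrative: `formatCommitMessage` is a hypothetical helper, and taking the zone from `Intl.DateTimeFormat` is an assumption about where the `tz` value comes from.

```javascript
// Sketch only: build a commit message in the documented auto-commit format.
function formatCommitMessage(
  operationContext,
  date = new Date(),
  timeZone = Intl.DateTimeFormat().resolvedOptions().timeZone // local IANA zone name
) {
  // toISOString always yields UTC in YYYY-MM-DDTHH:mm:ss.sssZ form.
  return `auto-commit:[${operationContext}] at [utc:${date.toISOString()}] [tz:${timeZone}]`;
}

const message = formatCommitMessage(
  'createEntity "Weber"',
  new Date("2026-03-22T09:15:30.123Z"),
  "Asia/Shanghai"
);
console.log(message);
// auto-commit:[createEntity "Weber"] at [utc:2026-03-22T09:15:30.123Z] [tz:Asia/Shanghai]
```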

auto-sync: (operation_type "details") at UTC YYYY-MM-DDTHH:mm:ss.SSSZ


Example:

auto-sync: (createEntity "Weber") at UTC 2026-03-22T09:15:30.123Z
auto-sync: (updateNode "Durkheim") at UTC 2026-03-22T09:16:45.456Z
auto-sync: (deleteRelation "Weber"→"Durkheim") at UTC 2026-03-22T09:17:00.789Z


### View Commit History

Use the `getConsole` tool:

```javascript
await getConsole()
// Returns text content with buffered logs and Git commits prefixed by "[Git]"
```

📦 Legacy Version

The v1.3.0 code is available on the legacy branch:

git clone https://github.com/Qtgcy08/MemFS.git
cd MemFS
git checkout legacy

If you're using MEMORY_FILE_PATH, please migrate to MEMORY_DIR before upgrading.

🧪 Testing

# Full test suite (22 tests)
node test_mcp_full.mjs

# Git Sync tests
node test_gitsync.mjs

⚙️ Comparison with Original MCP Memory

| Dimension | Original | MemFS |
|-----------|----------|-------|
| Observation Storage | Embedded in entities | Centralized + ID reference |
| Data Sharing | Not supported | Hard-link style sharing |
| Update Mechanism | Direct overwrite | Copy-on-Write |
| Search Capability | Simple keyword | BM25 + Fuzzy |
| Orphan Detection | Orphan observations cannot exist in theory | Supported |
| Cache Mechanism | None | 30s TTL |
| Windows Compatibility | Unknown | Graceful degradation |

📚 Design Philosophy

What? You're still reading? Well, alright.

Honestly, this project started because:

LLM context is limited: you can't stuff all knowledge into prompts
Filesystems are a great invention: letting many names share the same content is a solved problem there
Humanities research has special needs: concepts, literature, citation relationships
Controllability > SOTA: no need for black-box vector models

So:

Borrow filesystem wisdom: inode table, hard links, copy-on-write
Search with BM25 + Fuzzy: lightweight, explainable, transparent, controllable
Expose everything as tools: 17 MCP tools that the LLM calls on demand

Result? — A quiet, efficient, unobtrusive knowledge management tool.

📄 License

Apache License 2.0

Manage knowledge the filesystem way—bringing order to chaos.