@manet/ltr

v0.0.4

Published

a month ago

A Long Text Retrieval Agent for searching and analyzing markdown documents

ulivz

manet-agent agent-infra agent typescript library retrieval markdown

Long Text Retrieval Agent

A specialized agent for searching and analyzing long markdown documents with outline-based navigation and intelligent content retrieval.

Features

📚 Outline Navigation: Automatically extracts TOC structure from Markdown files
🔍 Intelligent Search: Progressive search based on document structure
🛠️ Tool Integration: Uses run_command and run_script for flexible file searching
📝 Type Safety: Strict TypeScript implementation
⚡ Error Handling: Graceful handling of file errors and missing content

Installation

pnpm add @manet/ltr

Quick Start

manet

Configuration

Direct Configuration

Specify files and intro directly in the options:

import { LongTextRetrievalAgent } from '@manet/ltr';

const agent = new LongTextRetrievalAgent({
  ltr: {
    intro: 'I am an intelligent assistant for searching and analyzing long text documents.',
    files: [
      {
        name: 'React Documentation',
        description: 'Complete React.js documentation with hooks, components, and best practices',
        filepath: '/path/to/react-docs.md'
      },
      {
        name: 'API Design Guide',
        description: 'REST API design patterns and conventions',
        filepath: '/path/to/api-design.md'
      }
    ]
  }
});

Directory-based Configuration

Use a directory with config.json:

my-docs/
├── config.json
├── introduction.md
├── core-framework.md
└── ui-components.md

config.json:

{
  "intro": "introduction.md",
  "files": [
    {
      "name": "Core Framework",
      "description": "Modern application development framework",
      "filepath": "core-framework.md"
    },
    {
      "name": "UI Components",
      "description": "Comprehensive UI component library",
      "filepath": "ui-components.md"
    }
  ]
}

const agent = new LongTextRetrievalAgent({
  ltr: {
    dir: '/path/to/my-docs'
  }
});
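
To make the relationship between the two configuration styles concrete, here is a minimal sketch of how a directory-based config could be turned into the direct form. It assumes that the paths in config.json are relative to dir, as the API Reference states. The helper resolveLtrConfig and its return shape are hypothetical and not part of @manet/ltr; the package's own loader may behave differently.

```typescript
import { posix as path } from "node:path";

interface FileInfo {
  name: string;
  description: string;
  filepath: string;
}

interface LtrConfigFile {
  intro: string; // path to the intro file, relative to dir
  files: FileInfo[]; // filepath values are relative to dir
}

// Hypothetical helper (not part of @manet/ltr): resolves the relative paths
// from config.json against `dir`. The posix flavor of `path` is used only to
// keep the example output identical on every platform.
function resolveLtrConfig(
  dir: string,
  config: LtrConfigFile,
): { introPath: string; files: FileInfo[] } {
  return {
    introPath: path.resolve(dir, config.intro),
    files: config.files.map((file) => ({
      ...file,
      filepath: path.resolve(dir, file.filepath),
    })),
  };
}

const resolvedConfig = resolveLtrConfig("/path/to/my-docs", {
  intro: "introduction.md",
  files: [
    {
      name: "Core Framework",
      description: "Modern application development framework",
      filepath: "core-framework.md",
    },
  ],
});

// Prints the absolute path "/path/to/my-docs/core-framework.md" inside an array.
console.log(resolvedConfig.files.map((file) => file.filepath));
```

The resolved files array has the same shape as the files option shown under Direct Configuration.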

Markdown Import Support

With directory-based configuration, markdown files can use import statements to pull in content from other files:

# Technical Documentation

This framework is a development framework for building modern applications with complete application development, build, and deployment capabilities.

## How It Works

Based on modern rendering engines and component systems, providing application runtime environment. Communication between applications and hosts is achieved through bridge layers. Developers can quickly create, build, and publish applications using the CLI tools provided by the framework.

## Technology Stack

### Core Framework

import './core-framework.md'

### UI Components

import './ui-components.md'

### Development Language

import './dev-language.md'

Import Syntax:

import './filename.md' - Import from same directory
import '../filename.md' - Import from parent directory
import './subdir/filename.md' - Import from subdirectory
import "./filename.md" - Double quotes also supported

Features:

✅ Recursive import processing
✅ Circular import detection
✅ Missing file error handling
✅ Preserves file content formatting
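
The behaviors listed above (recursive expansion, circular import detection, missing file errors, verbatim content) can be illustrated with a short sketch. This is not the package's MarkdownImportProcessor: the function expandImports, the injected readFile callback, and the error messages are assumptions made for illustration, and the real implementation may differ.

```typescript
import { posix as path } from "node:path";

// Matches a line that consists only of: import './file.md' or import "./file.md"
const IMPORT_LINE = /^import\s+(['"])(.+?)\1$/;

// Hypothetical sketch: expands import lines recursively. `readFile` is injected
// so the example can run against an in-memory map instead of the file system.
function expandImports(
  filePath: string,
  readFile: (p: string) => string | undefined,
  stack: string[] = [],
): string {
  // A file that is already being expanded further up the chain is a cycle.
  if (stack.includes(filePath)) {
    throw new Error(`Circular import: ${[...stack, filePath].join(" -> ")}`);
  }
  const content = readFile(filePath);
  if (content === undefined) {
    throw new Error(`Imported file not found: ${filePath}`);
  }
  const baseDir = path.dirname(filePath);
  return content
    .split("\n")
    .map((line) => {
      const match = IMPORT_LINE.exec(line.trim());
      if (!match) return line; // ordinary content is kept verbatim
      // Import paths are resolved relative to the importing file.
      const target = path.join(baseDir, match[2]);
      return expandImports(target, readFile, [...stack, filePath]);
    })
    .join("\n");
}

const memoryFiles = new Map<string, string>([
  ["/docs/index.md", "# Docs\n\nimport './core.md'\n"],
  ["/docs/core.md", "## Core\n\nCore content."],
]);

// The import line is replaced by the content of /docs/core.md.
console.log(expandImports("/docs/index.md", (p) => memoryFiles.get(p)));
```

Tracking the chain of files currently being expanded, rather than every file seen so far, lets the same file be imported from two different places while still rejecting true cycles.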

Usage

import { LongTextRetrievalAgent } from '@manet/ltr';

// Direct configuration
const agent = new LongTextRetrievalAgent({
  ltr: {
    intro: 'I am an intelligent assistant for searching and analyzing long text documents.',
    files: [
      {
        name: 'React Documentation',
        description: 'Complete React.js documentation with hooks, components, and best practices',
        filepath: '/path/to/react-docs.md'
      },
      {
        name: 'API Design Guide',
        description: 'REST API design patterns and conventions',
        filepath: '/path/to/api-design.md'
      }
    ]
  }
});

// Directory-based configuration
const dirAgent = new LongTextRetrievalAgent({
  ltr: {
    dir: '/path/to/config-directory'  // Contains config.json and files
  }
});

// Use the agent to search documents
const response = await agent.run('How to implement user authentication?');
console.log(response);

Search Strategy

The agent uses a structured search approach:

Structure Analysis: First analyzes file outlines to understand document structure
Section Location: Uses the user query to determine the sections most likely to contain relevant information
Initial Search: Uses grep or other text search tools for a preliminary search
Precise Location: Narrows the initial matches to exact locations using more targeted tools
Content Extraction: Extracts and organizes the content that was found
Range Expansion: Widens the search scope if the results so far are insufficient
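
The first two steps can be sketched as follows. This is only an illustration under stated assumptions: the OutlineEntry shape, the keyword-overlap scoring, and the function names are hypothetical, and in the actual agent the choice of section is made by the model from the outlines in its system prompt rather than by a fixed scoring function.

```typescript
interface OutlineEntry {
  level: number; // 1 for "#", 2 for "##", and so on
  title: string;
  line: number; // 1-based line number of the heading
}

// Naive relevance score: how many query words appear in the heading title.
function scoreHeading(entry: OutlineEntry, query: string): number {
  const title = entry.title.toLowerCase();
  return query
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word.length > 2 && title.includes(word)).length;
}

// Returns the 1-based inclusive line range of the best-matching section, or
// undefined when no heading shares a word with the query.
// Assumes `outline` is sorted by line number.
function locateSection(
  outline: OutlineEntry[],
  query: string,
  totalLines: number,
): { title: string; start: number; end: number } | undefined {
  let best: OutlineEntry | undefined;
  let bestScore = 0;
  for (const entry of outline) {
    const score = scoreHeading(entry, query);
    if (score > bestScore) {
      best = entry;
      bestScore = score;
    }
  }
  if (!best) return undefined;
  const section = best;
  // The section ends right before the next heading at the same or a higher level.
  const next = outline.find(
    (entry) => entry.line > section.line && entry.level <= section.level,
  );
  return {
    title: section.title,
    start: section.line,
    end: next ? next.line - 1 : totalLines,
  };
}

const demoOutline: OutlineEntry[] = [
  { level: 1, title: "Guide", line: 1 },
  { level: 2, title: "User Authentication", line: 10 },
  { level: 3, title: "Tokens", line: 20 },
  { level: 2, title: "Deployment", line: 40 },
];

// Selects "User Authentication", lines 10 to 39 (its "Tokens" subsection included).
console.log(locateSection(demoOutline, "How to implement user authentication?", 60));
```

The returned line range is what a later extraction step would read, and widening it corresponds to the Range Expansion step.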

Available Tools

run_command: Execute system commands for file searching
run_script: Run scripts for complex search operations

Search Tips

Use grep for keyword search: grep -n "keyword" file_path
Use sed to extract a specific line range: sed -n 'START,ENDp' file_path (for example, sed -n '10,20p' file_path prints lines 10 through 20)
Use awk for pattern matching and extraction
Combine multiple commands for precise search
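
For readers who want to see what the grep and sed tips compute, here is a rough TypeScript equivalent of the two commands. The function names grepLines and extractRange are hypothetical and not part of @manet/ltr; the agent itself runs the shell commands through run_command.

```typescript
// Rough equivalent of `grep -n "keyword" file_path`: matching lines with their
// 1-based line numbers. Uses a plain substring match (like grep -F), not a regex.
function grepLines(text: string, keyword: string): { line: number; text: string }[] {
  return text
    .split("\n")
    .map((content, index) => ({ line: index + 1, text: content }))
    .filter((entry) => entry.text.includes(keyword));
}

// Rough equivalent of `sed -n 'START,ENDp' file_path`: lines START through END,
// inclusive, counted from 1.
function extractRange(text: string, start: number, end: number): string {
  return text.split("\n").slice(start - 1, end).join("\n");
}

const sampleDoc = [
  "# Title",
  "intro",
  "## Auth",
  "login flow",
  "token refresh",
  "## Deploy",
].join("\n");

// Finds the single match on line 3 ("## Auth").
console.log(grepLines(sampleDoc, "Auth"));
// Prints lines 3 through 5: the "## Auth" heading and the two lines below it.
console.log(extractRange(sampleDoc, 3, 5));
```

Combining the two mirrors the last tip: a keyword search yields a line number, and a range extraction around that line yields the surrounding content.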

API Reference

LongTextRetrievalAgentOptions

interface LongTextRetrievalAgentOptions {
  ltr: LtrOptions;         // LTR namespace options
}

interface LtrOptions {
  intro?: string;          // Basic introduction for search context
  files?: FileInfo[];      // Array of markdown files to search
  dir?: string;            // Directory containing config.json and files
}

interface FileInfo {
  name: string;            // File identifier
  description: string;     // Content description
  filepath: string;        // Absolute file path
}

// Directory-based config.json structure
interface LtrConfigFile {
  intro: string;           // Path to intro file (relative to dir)
  files: {
    name: string;
    description: string;
    filepath: string;      // Relative to dir
  }[];
}

// Markdown import processor
declare class MarkdownImportProcessor {
  static processImports(content: string, basePath: string): string;
  static loadMarkdownWithImports(filePath: string): string;
  static hasImports(content: string): boolean;
}

Methods

getFileOutlines(): Get current file outlines
refreshFileOutlines(files?): Refresh outlines from files

Architecture

The agent follows a structured approach:

Initialization: Extract outlines from all provided markdown files
System Prompt: Inject file outlines with line numbers into the system prompt
Search Strategy: Progressive search from structure to specific content
Tool Usage: Leverage command-line tools for efficient text search
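
The first two steps (outline extraction and prompt injection) might look roughly like the sketch below. It is an assumption-laden illustration, not the code in outline-extractor.ts or system-prompt.ts: the OutlineEntry shape, the function names, and the "L<line>: <title>" output format are all hypothetical.

```typescript
interface OutlineEntry {
  level: number; // number of leading "#" characters
  title: string;
  line: number; // 1-based line number of the heading
}

// Hypothetical sketch: collects ATX headings ("#" to "######") with their line
// numbers. Lines inside fenced code blocks are skipped so that shell comments
// such as "# install" are not mistaken for headings.
function extractOutline(markdown: string): OutlineEntry[] {
  const outline: OutlineEntry[] = [];
  const lines = markdown.split("\n");
  let inFence = false;
  for (let index = 0; index < lines.length; index++) {
    const text = lines[index];
    if (/^\s*(`{3}|~{3})/.test(text)) {
      inFence = !inFence; // opening or closing fence
      continue;
    }
    if (inFence) continue;
    const match = /^(#{1,6})\s+(.+?)\s*$/.exec(text);
    if (match) {
      outline.push({ level: match[1].length, title: match[2], line: index + 1 });
    }
  }
  return outline;
}

// Renders the outline as indented rows that could be embedded in a system prompt.
function formatOutline(fileLabel: string, outline: OutlineEntry[]): string {
  const rows = outline.map(
    (entry) => `${"  ".repeat(entry.level - 1)}L${entry.line}: ${entry.title}`,
  );
  return [`File: ${fileLabel}`, ...rows].join("\n");
}

const sampleMarkdown = [
  "# Technical Documentation",
  "",
  "## How It Works",
  "~~~",
  "# not a heading",
  "~~~",
  "### Core Framework",
].join("\n");

// Prints:
// File: intro.md
// L1: Technical Documentation
//   L3: How It Works
//     L7: Core Framework
console.log(formatOutline("intro.md", extractOutline(sampleMarkdown)));
```

Carrying line numbers in the outline is what lets the later search steps jump straight to a section with a line-range extraction instead of scanning the whole file.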

File Structure

agents/ltr/
├── src/
│   ├── index.ts              # Main entry file
│   ├── types.ts              # Type definitions
│   ├── outline-extractor.ts  # Outline extractor
│   ├── config-loader.ts      # Configuration loader
│   ├── markdown-import-processor.ts # Markdown import processor
│   └── system-prompt.ts      # System prompt generator
├── tests/                   # Test files
├── examples/                # Example code
└── dist/                    # Build output