@caplab/read
v0.1.0
Read-only workspace access package for LLM agents with deterministic path handling and budget-aware file operations
A read-only workspace access package for LLM agents and developer tooling with deterministic path handling, budget-aware file operations, and encoding-safe text processing.
Why This Package Exists
AI agents and developer tools need safe, predictable file system access. Traditional file APIs can leak outside workspace boundaries, produce unbounded output, or fail silently on encoding issues. @caplab/read provides a controlled, read-only interface with explicit budgets and deterministic behavior.
Problems It Solves
- Workspace boundary enforcement: Prevents reading files outside the designated workspace root
- Output budgeting: Limits bytes, lines, and file counts to prevent unbounded responses
- Encoding safety: Auto-detects and handles UTF-8, UTF-8 BOM, UTF-16LE, UTF-16BE without silent failures
- Binary detection: Identifies binary files and handles them predictably
- Deterministic ordering: Same inputs produce the same outputs across runs
- Symlink safety: Resolves symlinks to prevent boundary escape attacks
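As an illustration of the boundary enforcement described above, here is a minimal sketch of a lexical containment check. The helper name resolveInsideWorkspace is hypothetical and is not part of this package's documented API. The check is purely lexical: a symlink-safe version would also need to resolve real paths (for example with fs.realpath) before comparing.

```typescript
import * as path from "node:path";

// Hypothetical helper, not part of @caplab/read's documented API.
// Returns the workspace-relative path, or null when the candidate escapes the root.
function resolveInsideWorkspace(workspaceRoot: string, candidate: string): string | null {
  const root = path.resolve(workspaceRoot);
  const target = path.resolve(root, candidate);
  const relative = path.relative(root, target);
  // A relative path that climbs upward, or that is still absolute
  // (for example another drive on Windows), points outside the root.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) return null;
  // Normalize separators so the result is stable across platforms.
  return relative.split(path.sep).join("/");
}
```

Comparing on the relative path rather than on a string prefix avoids treating a sibling such as /ws-other as if it were inside /ws.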
What It Intentionally Does Not Do
- Write, modify, or otherwise mutate files (strictly read-only)
- Execute code or commands
- Support arbitrary path traversal outside workspace root
- Handle encodings outside UTF-8/UTF-16 family
Installation
npm install @caplab/read
ESM-Only Usage
This package is ESM-only. Use import syntax:
import { createWorkspaceReader } from "@caplab/read";
Quick Start
import path from "node:path";
import { createWorkspaceReader } from "@caplab/read";
const reader = await createWorkspaceReader({
  // workspaceRoot is documented as an absolute path, so resolve it first
  workspaceRoot: path.resolve("./my-project"),
});
// List directory
const dirResult = await reader.listDirectory("src");
console.log(dirResult);
// Read file
const fileResult = await reader.readFile("src/index.ts");
console.log(fileResult);
// Search files
for await (const match of reader.fileSearch("function")) {
  console.log(match);
}
API Overview
createWorkspaceReader(options)
Creates a configured workspace reader instance. Throws an error if workspaceRoot is invalid (does not exist or is not a directory).
interface WorkspaceReaderOptions {
  workspaceRoot: string; // Required; absolute path
  maxBytes?: number; // Default: 262144 (256 KiB)
  maxLines?: number; // Default: 2000
  maxFiles?: number; // Default: 20
  maxTotalBytes?: number; // Default: 1048576 (1 MiB)
  maxEntries?: number; // Default: 1000
  maxSearchResults?: number; // Default: 200
  maxDepth?: number; // Default: 10
}
listDirectory(path, options)
Lists files and directories with deterministic ordering. Returns absolute path and normalized relativePath for each entry.
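Deterministic ordering generally comes down to sorting on a key that does not depend on the machine's locale or on file system enumeration order. The sketch below is an assumption about one way to achieve that, not a description of this package's internals. The Entry shape and both function names are hypothetical; only relativePath mirrors a field named in this README.

```typescript
// Hypothetical entry shape; only `relativePath` mirrors a field named in this README.
interface Entry {
  relativePath: string;
}

// Compare by UTF-16 code units instead of locale rules, so the order
// is identical on every machine.
function compareCodeUnits(a: string, b: string): number {
  return a < b ? -1 : a > b ? 1 : 0;
}

function sortEntries(entries: Entry[]): Entry[] {
  // Copy first so the caller's array is left untouched.
  return [...entries].sort((a, b) => compareCodeUnits(a.relativePath, b.relativePath));
}
```

A locale-aware comparison such as localeCompare can order the same names differently across environments, which is why this sketch avoids it.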
interface ListDirectoryOptions {
  maxDepth?: number; // Default: unlimited
  includeHidden?: boolean; // Default: false
  include?: string[]; // Glob patterns
  exclude?: string[]; // Glob patterns
  maxEntries?: number; // Default: 1000
}
readFile(path, options)
Reads a single file with byte and line budgets. Returns absolute path and normalized relativePath. Supported encodings: UTF-8, UTF-8 BOM, UTF-16LE, UTF-16BE.
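To make the "auto" encoding mode more concrete, here is a minimal sketch of byte order mark (BOM) detection for the supported encodings. The function name and the DetectedEncoding type are hypothetical, and the fallback to UTF-8 when no BOM is present is an assumption of this sketch rather than documented behavior.

```typescript
type DetectedEncoding = "utf-8" | "utf-8-bom" | "utf-16le" | "utf-16be";

// Hypothetical helper: inspects the leading bytes for a byte order mark.
function detectEncodingFromBom(bytes: Uint8Array): DetectedEncoding {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8-bom";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) return "utf-16le";
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) return "utf-16be";
  // No BOM found: this sketch assumes UTF-8.
  return "utf-8";
}
```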
interface ReadFileOptions {
  maxBytes?: number; // Default: 262144
  maxLines?: number; // Default: 2000
  encoding?: "utf-8" | "utf-16le" | "utf-16be" | "auto"; // Default: "auto"
}
readMultipleFiles(paths, options)
Reads multiple files with bounded parallelism and deterministic admission. Returns results with absolute path and normalized relativePath for each file.
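Deterministic admission can be pictured as a single pass over the requested paths in input order, deciding for each file whether it still fits the budgets. The sketch below is an assumption about how such a pass might look. The Candidate and Admission shapes and the admitFiles name are hypothetical; the two skip codes are taken from the Error Model section of this README. It also assumes that a file which exceeds the remaining byte budget is skipped while later, smaller files are still considered.

```typescript
type SkipCode = "skipped_due_to_max_files" | "skipped_due_to_budget";

// Hypothetical shapes used only for this illustration.
interface Candidate {
  path: string;
  size: number; // size in bytes
}
interface Admission {
  path: string;
  admitted: boolean;
  code?: SkipCode;
}

function admitFiles(
  candidates: Candidate[],
  maxFiles = 20,
  maxTotalBytes = 1048576,
): Admission[] {
  let admittedCount = 0;
  let usedBytes = 0;
  // Decisions depend only on input order and sizes, never on I/O timing.
  return candidates.map(({ path, size }): Admission => {
    if (admittedCount >= maxFiles) {
      return { path, admitted: false, code: "skipped_due_to_max_files" };
    }
    if (usedBytes + size > maxTotalBytes) {
      return { path, admitted: false, code: "skipped_due_to_budget" };
    }
    admittedCount += 1;
    usedBytes += size;
    return { path, admitted: true };
  });
}
```

Because admission is decided before any parallel reads start, the set of admitted files stays the same regardless of which read finishes first.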
interface ReadMultipleFilesOptions {
  maxFiles?: number; // Default: 20
  maxBytes?: number; // Default: 262144 (per-file)
  maxTotalBytes?: number; // Default: 1048576
  maxConcurrency?: number; // Default: 5
}
fileSearch(query, options)
Searches files through an adapter over @caplab/grep-search. This is not a second search engine: it delegates search behavior to @caplab/grep-search and adds workspace boundary enforcement and budget limits.
interface FileSearchOptions {
  regex?: boolean;
  wholeWord?: boolean;
  caseSensitive?: boolean;
  multiline?: boolean;
  extensions?: string[];
  ignore?: string[];
  maxDepth?: number;
  beforeContext?: number;
  afterContext?: number;
  maxResults?: number; // Default: 200
}
Default Budgets and Limits
- readFile.maxBytes: 262144 (256 KiB)
- readFile.maxLines: 2000
- readMultipleFiles.maxFiles: 20
- readMultipleFiles.maxTotalBytes: 1048576 (1 MiB)
- listDirectory.maxEntries: 1000
- fileSearch.maxResults: 200
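To show what a line budget means in practice, here is a minimal sketch that caps text at a maximum number of lines and reports whether anything was cut. The applyLineBudget name and the BudgetedText shape are hypothetical. How this package actually reports truncation is not stated in this README, so treat the returned fields as assumptions.

```typescript
// Hypothetical result shape for this illustration only.
interface BudgetedText {
  content: string;
  truncated: boolean;
  totalLines: number;
}

function applyLineBudget(text: string, maxLines = 2000): BudgetedText {
  // Accept both LF and CRLF line endings.
  const lines = text.split(/\r?\n/);
  const truncated = lines.length > maxLines;
  return {
    // Return the original text untouched when it fits the budget.
    content: truncated ? lines.slice(0, maxLines).join("\n") : text,
    truncated,
    totalLines: lines.length,
  };
}
```

A byte budget would follow the same pattern, with the cut made on the encoded bytes rather than on lines.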
Error Model
Expected operational failures return structured result objects with explicit error codes:
- file_not_found: File does not exist
- permission_denied: Insufficient permissions
- binary_file: File detected as binary
- unsupported_encoding: Encoding not supported
- path_outside_workspace: Path outside workspace root
- skipped_due_to_budget: Skipped due to maxTotalBytes exhaustion
- skipped_due_to_max_files: Skipped due to maxFiles limit
Only invalid API usage or invalid configuration throws exceptions.
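Because operational failures come back as data rather than exceptions, callers can branch on the error code. The sketch below shows one way to do that with a discriminated union. The ReadResult shape, including the ok and code field names, is an assumption made for illustration; only the error code strings and relativePath come from this README.

```typescript
type ReadErrorCode =
  | "file_not_found"
  | "permission_denied"
  | "binary_file"
  | "unsupported_encoding"
  | "path_outside_workspace"
  | "skipped_due_to_budget"
  | "skipped_due_to_max_files";

// Hypothetical result shape; the real field names may differ.
type ReadResult =
  | { ok: true; relativePath: string; content: string }
  | { ok: false; relativePath: string; code: ReadErrorCode };

function describeResult(result: ReadResult): string {
  if (result.ok) {
    return `${result.relativePath}: ${result.content.length} characters`;
  }
  switch (result.code) {
    case "skipped_due_to_budget":
    case "skipped_due_to_max_files":
      return `${result.relativePath}: skipped, retry with a larger budget`;
    case "binary_file":
    case "unsupported_encoding":
      return `${result.relativePath}: not readable as text`;
    default:
      return `${result.relativePath}: failed (${result.code})`;
  }
}
```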
Hard Guarantees / Invariants
- Read-only guarantee: The package never mutates workspace state
- Deterministic ordering: Same input produces same output across runs
- Fail-closed boundaries: Symlink resolution prevents workspace escape
- Encoding safety: Text decoding is explicit and predictable
- Output budgeting: Large files and directory trees don't create unbounded output
- Binary safety: Binary files are detected and handled predictably
- Path consistency: All operations return an absolute path and a normalized relativePath
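The binary safety guarantee above depends on some detection heuristic, which this README does not specify. A common approach, shown here as a sketch under that assumption, is to scan a leading sample for NUL bytes. The looksBinary name and the sample size are hypothetical. UTF-16 text legitimately contains NUL bytes, so a check like this would only apply once UTF-16 has been ruled out.

```typescript
// Hypothetical heuristic: treat content as binary when a NUL byte
// appears in the first `sampleSize` bytes.
function looksBinary(bytes: Uint8Array, sampleSize = 8000): boolean {
  const limit = Math.min(bytes.length, sampleSize);
  for (let i = 0; i < limit; i++) {
    if (bytes[i] === 0) return true;
  }
  return false;
}
```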
Relationship to @caplab/grep-search
fileSearch is an adapter over @caplab/grep-search. It:
- Delegates search behavior to @caplab/grep-search
- Enforces workspace boundary with cwd parameter
- Applies default maxResults budget
- Forwards search options (regex, caseSensitive, etc.) via allowlist
- Does not implement a separate search engine
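The allowlist forwarding described above can be sketched as follows. This is an illustration under assumptions, not the adapter's actual code. The buildSearchOptions name is hypothetical, the forwarded keys are taken from FileSearchOptions in this README, and the exact option object that @caplab/grep-search accepts is not shown here.

```typescript
// Keys copied from FileSearchOptions; anything else is dropped.
const FORWARDED_KEYS = [
  "regex",
  "wholeWord",
  "caseSensitive",
  "multiline",
  "extensions",
  "ignore",
  "maxDepth",
  "beforeContext",
  "afterContext",
] as const;

// Hypothetical helper, not part of the documented API.
function buildSearchOptions(
  userOptions: Record<string, unknown>,
  workspaceRoot: string,
  defaultMaxResults = 200,
): Record<string, unknown> {
  const forwarded: Record<string, unknown> = {};
  for (const key of FORWARDED_KEYS) {
    if (userOptions[key] !== undefined) forwarded[key] = userOptions[key];
  }
  const requested = userOptions["maxResults"];
  const maxResults = typeof requested === "number" ? requested : defaultMaxResults;
  // cwd always comes from the workspace root, never from caller input.
  return { ...forwarded, cwd: workspaceRoot, maxResults };
}
```

Copying only known keys means a caller cannot smuggle in an option such as a different cwd, which is what keeps the search inside the workspace.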
Node Version Requirement
Node.js >= 20.0.0
License
MIT License - see LICENSE file for details
