agentmine

v0.2.0

Published

7 days ago

Queryable work corpus of coding-agent sessions

baranovxyz

coding-agent agent-sessions session-transcripts sqlite corpus claude-code codex opencode cursor cli

Agentmine

Agentmine turns local AI coding-agent session transcript archives into a queryable SQLite corpus.

It ingests session transcripts from tools such as Claude Code, Cursor, Codex, and opencode, normalizes them into a shared schema, extracts useful facts, and exposes the result through an agent-friendly JSON CLI.

Use it to answer questions like:

What files and commands do my agents touch most?
Which failed commands or tool errors repeat?
What corrections do I keep giving agents?
Have I solved a similar task before?
Which skills, MCP tools, or agent workflows are actually used?

Agentmine is local-first. It reads local transcript stores and writes local SQLite data under the user data directory by default. It does not call an LLM in the default sync -> normalize -> extract path.

Status

First public release. The core local corpus workflow works and is covered by tests. The README is intentionally concise for the initial release and will grow as the project sees public use.

Requirements

Node.js 22+
pnpm
SQLite support through better-sqlite3

Optional:

Ollama with nomic-embed-text for local semantic search

Install

pnpm install
pnpm build
alias agentmine="node $PWD/dist/cli.js"

Quick Start

agentmine backup
agentmine sync
agentmine normalize
agentmine extract
agentmine stats

Or run the default pipeline:

agentmine ingest

The pipeline is designed to be safe to rerun:

sync mirrors known local transcript stores into Agentmine's session data directory.
normalize parses transcripts into canonical sessions and skips unchanged content by hash.
extract rebuilds derived fact tables in transactions.
backup snapshots sessions.db before risky rebuilds.
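
The skip-by-hash behavior of normalize can be pictured with a minimal sketch. This is not Agentmine's implementation: the names `fnv1a` and `planNormalize`, the in-memory Map, and the FNV-1a stand-in hash are all assumptions made for illustration. A real implementation would more likely use a cryptographic digest such as SHA-256 and persist the hashes in sessions.db.

```javascript
// Stand-in content hash (32-bit FNV-1a over UTF-16 code units).
// Assumption: chosen only to keep the sketch dependency-free.
function fnv1a(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16).padStart(8, "0");
}

// Hypothetical helper: decide which transcripts need re-parsing.
// `seen` maps transcript path -> content hash recorded by the previous run.
function planNormalize(transcripts, seen) {
  const toParse = [];
  const skipped = [];
  for (const { path, content } of transcripts) {
    const hash = fnv1a(content);
    if (seen.get(path) === hash) {
      skipped.push(path); // unchanged since the last run
    } else {
      toParse.push(path);
      seen.set(path, hash); // remember the new content hash
    }
  }
  return { toParse, skipped };
}

const seen = new Map();
const firstRun = planNormalize([{ path: "a.jsonl", content: "hello" }], seen);
const secondRun = planNormalize([{ path: "a.jsonl", content: "hello" }], seen);
console.log(firstRun.toParse, secondRun.skipped);
```

Running the plan twice over identical content parses the file once and skips it the second time, which is the property that makes a rerun cheap.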

For incremental imports:

agentmine normalize --since 1d
agentmine normalize --since 2026-06-01
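
As a rough illustration of how the two --since forms above could map to a cutoff timestamp, here is a small sketch. The `parseSince` helper is hypothetical, it handles only the `<n>d` and `YYYY-MM-DD` forms shown above, and it assumes dates are interpreted as UTC midnight; Agentmine's actual parsing and any additional units it accepts may differ.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Hypothetical helper: turn a --since value into a cutoff timestamp (ms).
// Only the two forms shown above are handled; anything else is rejected.
function parseSince(value, now = Date.now()) {
  const relative = /^(\d+)d$/.exec(value);
  if (relative) {
    return now - Number(relative[1]) * DAY_MS;
  }
  if (/^\d{4}-\d{2}-\d{2}$/.test(value)) {
    const ms = Date.parse(value); // date-only ISO strings parse as UTC midnight
    if (!Number.isNaN(ms)) return ms;
  }
  throw new Error(`unsupported --since value: ${value}`);
}

const now = Date.UTC(2026, 5, 2); // 2026-06-02T00:00:00Z
console.log(new Date(parseSince("1d", now)).toISOString()); // 2026-06-01T00:00:00.000Z
console.log(new Date(parseSince("2026-06-01")).toISOString()); // 2026-06-01T00:00:00.000Z
```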

Common Commands

agentmine stats
agentmine sessions --limit 20
agentmine session <session-id> --md
agentmine fts "error text or phrase"
agentmine similar "task description"
agentmine top files --limit 20
agentmine top commands --failed --limit 20
agentmine top corrections --by kind
agentmine top skills
agentmine timeline --bucket week
agentmine schema

Ad-hoc SQL is read-only:

agentmine query "SELECT source, count(*) AS n FROM sessions GROUP BY source"

Only SELECT, WITH, and EXPLAIN queries are accepted.
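
A caller that wants to pre-validate SQL before handing it to agentmine query could apply a similar first-keyword check. The sketch below is an assumption about how such a check might look, not Agentmine's own guard; `looksReadOnly` is a hypothetical name.

```javascript
const READ_ONLY_KEYWORDS = new Set(["SELECT", "WITH", "EXPLAIN"]);

// Hypothetical pre-check: accept a statement only if its first keyword
// is SELECT, WITH, or EXPLAIN. Leading whitespace and SQL comments are skipped.
function looksReadOnly(sql) {
  const body = sql.replace(/^(?:\s+|--[^\n]*(?:\n|$)|\/\*[\s\S]*?\*\/)*/, "");
  const match = /^[A-Za-z]+/.exec(body);
  return match !== null && READ_ONLY_KEYWORDS.has(match[0].toUpperCase());
}

console.log(looksReadOnly("SELECT source, count(*) AS n FROM sessions GROUP BY source")); // true
console.log(looksReadOnly("  -- note\nwith s as (select 1) select * from s")); // true
console.log(looksReadOnly("DELETE FROM sessions")); // false
```

A keyword check alone is not a complete guard, because SQLite allows a WITH clause to prefix INSERT, UPDATE, and DELETE statements. Opening the database read-only (better-sqlite3 accepts a `readonly: true` option) is a stronger second layer; whether Agentmine does this is not stated here.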

Similarity Search

agentmine similar is the main entry point for finding prior work:

agentmine similar "React Router auth redirect loop" --limit 5
agentmine similar "schema migration" --source codex
agentmine similar "test flake timeout" --project /path/to/repo

By default, similar runs in auto mode:

It uses FTS when no local embedding index is available.
FTS here means SQLite full-text search.
It can use hybrid search when local embeddings exist and guardrails are satisfied.
It returns reconstruction commands such as agentmine session <id> --md.

Local embeddings are optional:

ollama pull nomic-embed-text
agentmine embed --provider ollama --model nomic-embed-text --dry-run
agentmine embed --provider ollama --model nomic-embed-text --limit 500
agentmine similar "agent first CLI JSON stdout stderr" --mode hybrid

Data Paths

Default paths live under Agentmine's user data directory:

macOS/Linux: $XDG_DATA_HOME/agentmine/sessions/ when set, otherwise ~/.local/share/agentmine/sessions/.
Windows: %APPDATA%\agentmine\sessions\.

| Path | Purpose |
|---|---|
| <sessions>/claude-code/ | mirrored Claude Code transcripts |
| <sessions>/cursor/ | mirrored Cursor transcripts |
| <sessions>/codex/ | mirrored Codex sessions |
| <sessions>/opencode/ | mirrored opencode sessions |
| <sessions>/sessions.db | SQLite corpus |
| <sessions>/backups/ | backup archives |

Override the database path with:

AGENTMINE_DB=/path/to/sessions.db agentmine stats
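
The default locations and the AGENTMINE_DB override described above can be sketched as a pair of pure functions. Both `resolveSessionsDir` and `resolveDbPath` are hypothetical names, and the sketch assumes APPDATA is set on Windows; it is an illustration of the documented rules rather than Agentmine's code.

```javascript
// Hypothetical resolver for the default sessions directory.
// `env`, `platform`, and `home` are passed in so the function stays pure;
// in a Node.js program they would come from process.env, process.platform,
// and os.homedir().
function resolveSessionsDir(env, platform, home) {
  if (platform === "win32") {
    return `${env.APPDATA}\\agentmine\\sessions`; // assumes APPDATA is set
  }
  const dataHome = env.XDG_DATA_HOME || `${home}/.local/share`;
  return `${dataHome}/agentmine/sessions`;
}

// AGENTMINE_DB, when set, overrides the database path.
function resolveDbPath(env, platform, home) {
  if (env.AGENTMINE_DB) return env.AGENTMINE_DB;
  const sep = platform === "win32" ? "\\" : "/";
  return resolveSessionsDir(env, platform, home) + sep + "sessions.db";
}

console.log(resolveDbPath({}, "linux", "/home/me"));
// /home/me/.local/share/agentmine/sessions/sessions.db
console.log(resolveDbPath({ AGENTMINE_DB: "/tmp/x.db" }, "linux", "/home/me"));
// /tmp/x.db
```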

Redaction

Normalization redacts high-confidence secret patterns before storing searchable text. Built-in patterns cover common API keys, bearer tokens, private keys, OAuth-style tokens, Slack tokens, AWS access key IDs, GitHub token prefixes, and secret-shaped environment values.

Use --no-redact only for a deliberate local audit where preserving exact text is required.
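
To make pattern-based redaction concrete, here is a minimal sketch that applies a list of named regular expressions to a string. The `redact` helper, the `[REDACTED:name]` placeholder format, and the two patterns are illustrative assumptions; Agentmine's built-in pattern list and its replacement text are not reproduced here.

```javascript
// Illustrative patterns only; Agentmine's built-in list is not reproduced here.
const patterns = [
  { name: "aws-access-key-id", pattern: /\bAKIA[0-9A-Z]{16}\b/g },
  { name: "custom-token", pattern: /CUSTOM_TOKEN_[A-Z0-9]+/g },
];

// Hypothetical helper: replace every match with a labeled placeholder.
function redact(text, rules) {
  return rules.reduce(
    (out, { name, pattern }) => out.replace(pattern, `[REDACTED:${name}]`),
    text,
  );
}

console.log(redact("export AWS_KEY=AKIAIOSFODNN7EXAMPLE and CUSTOM_TOKEN_AB12", patterns));
// export AWS_KEY=[REDACTED:aws-access-key-id] and [REDACTED:custom-token]
```

Each pattern carries the global flag so that every occurrence in the text is replaced, not just the first.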

Extensions

Create ~/.config/agentmine/extensions.js to add custom sources, redaction rules, or an LLM proxy without changing this repository.

export default {
  adapters: [],
  redactPatterns: [
    { name: "custom-token", pattern: /CUSTOM_TOKEN_[A-Z0-9]+/g },
  ],
  llmBaseUrl: "https://proxy.example.com",
};

Extension files are user-private and should not be committed.

Agent-Friendly CLI Contract

Agentmine is built for agents and automation:

stdout is one JSON envelope.
progress and warnings go to stderr.
errors include stable codes and retry guidance.
commands are non-interactive by default.
schema discovery is available through agentmine schema.
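
On the consuming side, the "one JSON envelope on stdout" rule means a caller can parse stdout in a single step and treat anything else as a failure. The sketch below assumes the envelope is a JSON object and deliberately assumes no field names; `parseEnvelope` is a hypothetical helper, and the real envelope shape should be taken from agentmine schema.

```javascript
// Hypothetical consumer-side check: stdout must be exactly one JSON object.
// No field names are assumed; the real envelope shape is not documented here.
function parseEnvelope(stdout) {
  let value;
  try {
    value = JSON.parse(stdout);
  } catch (err) {
    throw new Error(`stdout is not a single JSON document: ${err.message}`);
  }
  if (value === null || typeof value !== "object" || Array.isArray(value)) {
    throw new Error("expected a JSON object envelope");
  }
  return value;
}

const envelope = parseEnvelope('{"example": true}\n');
console.log(Object.keys(envelope)); // [ 'example' ]
```

Because progress and warnings go to stderr, a caller can capture the two streams separately and never needs to strip log lines out of stdout before parsing.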

Development

pnpm typecheck
pnpm test

License

MIT. See LICENSE.