skillrl

v0.1.0

Published

5 months ago

Automatic skill distillation, retrieval, and evolution from coding agent trajectories. Implements SkillRL (arXiv:2602.08234v1) for Kiro, Claude Code, Cursor, and OpenClaw. Supports Gemini, Amazon Bedrock, and Claude providers.

skillrl

Automatic skill distillation, retrieval, and evolution from AI coding agent trajectories

Turn your AI agent's coding sessions into a reusable skill library that grows smarter over time. Implements the SkillRL framework for extracting, storing, retrieving, and evolving coding skills from agent trajectories. Works with Claude Code, Kiro, Cursor, and OpenClaw.

The Problem

AI coding agents solve the same types of problems repeatedly but start from scratch every session. They don't learn from past successes or failures. When an agent figures out how to debug a tricky async race condition or set up JWT auth correctly, that knowledge disappears the moment the session ends.

skillrl fixes this by building a persistent, searchable skill memory that any agent can query.

Features

Skill Distillation - Automatically extract reusable patterns from successful agent sessions
Failure Learning - Synthesize corrective skills from failed trajectories so agents don't repeat mistakes
Semantic Retrieval - Find relevant skills using embedding-based vector search (one embedding call per query, no LLM call when an index exists)
Domain Filtering - Partition skills by domain (typescript, react, python, etc.) for precise retrieval
Skill Evolution - Continuously refine skills based on accumulated successes and failures
Multi-Format Ingestion - Parse Claude Code JSONL, Kiro logs, Cursor edits, or plain text trajectories
5 Export Formats - Kiro Power bundles, SKILL.md, .cursorrules, Markdown, or JSON
Multi-Provider Support - Gemini (default), Amazon Bedrock (50+ models), or Claude (Anthropic API)
MCP Integration - 8 MCP tools for seamless agent integration
SQLite Vector Index - Hardware-accelerated KNN search via sqlite-vec with WAL-mode concurrency
Deduplication - Automatic detection of duplicate or overlapping skills
Auto-Migration - Legacy JSON embedding indexes migrate to SQLite automatically

Installation

Global Installation (Recommended for CLI)

npm install -g skillrl

Local Installation (For programmatic use)

npm install skillrl

npx (No installation required)

npx skillrl stats

Quick Start

1. Configure Provider Credentials

Option A: Google Gemini (Default)

Get a free API key from Google AI Studio:

# Use the config command
skillrl config YOUR_GEMINI_API_KEY

# Or set environment variable
export GEMINI_API_KEY=your_api_key

# Or add it to a .env file (>> appends, so existing entries are kept)
echo "GEMINI_API_KEY=your_api_key" >> .env

Option B: Amazon Bedrock

# Install the AWS SDK (required for Bedrock)
npm install @aws-sdk/client-bedrock-runtime

# Bedrock API Key (simplest)
export AWS_BEARER_TOKEN_BEDROCK=your_bedrock_api_key
export AWS_REGION=us-east-1

# Or AWS Access Keys
export AWS_ACCESS_KEY_ID=your_access_key
export AWS_SECRET_ACCESS_KEY=your_secret_key
export AWS_REGION=us-east-1

Option C: Claude (Anthropic API)

export ANTHROPIC_API_KEY=your_api_key

2. Initialize and Ingest Your First Trajectory

# Initialize an empty skill bank
skillrl init

# Ingest a Claude Code session log
skillrl ingest conversation.jsonl --source claude-code

# Or pipe from stdin
cat trajectory.json | skillrl ingest --stdin

# Build the embedding index for fast retrieval
skillrl index

3. Retrieve Skills

# Find skills relevant to a task
skillrl retrieve "implement JWT authentication with refresh tokens"

# Filter by domain
skillrl retrieve "fix race condition" --domain typescript

# List all skills
skillrl list

4. Export for Your IDE

# Export as Kiro Power bundle
skillrl export kiro-power --output ./power

# Export as SKILL.md for Claude Code
skillrl export skill-md --output SKILL.md

# Export as .cursorrules for Cursor
skillrl export cursorrules --output .cursorrules

CLI Reference

Commands

| Command | Description |
|---------|-------------|
| `skillrl init` | Initialize an empty skill bank |
| `skillrl ingest <file>` | Parse a trajectory file and extract skills |
| `skillrl ingest --stdin` | Read trajectory from stdin/pipe |
| `skillrl retrieve "<task>"` | Find relevant skills for a task description |
| `skillrl evolve <file>` | Run evolution cycle on failed trajectories |
| `skillrl export <format>` | Export skill bank (5 formats) |
| `skillrl list` | Display all skills with metadata |
| `skillrl stats` | Show skill bank statistics |
| `skillrl index` | Build/rebuild the embedding index |
| `skillrl import <file>` | Merge skills from another bank JSON |
| `skillrl config [api-key]` | Show or configure settings |
| `skillrl test` | Test LLM provider connection |

Options

| Option | Description |
|--------|-------------|
| `--provider, -p <name>` | LLM provider: `gemini` (default), `bedrock`, or `claude` |
| `--model, -m <name>` | Model to use (see Model Configuration) |
| `--bank-path <path>` | Custom skill bank location (default: `.skillrl/bank.json`) |
| `--domain <domain>` | Filter by domain (e.g., typescript, react, python) |
| `--source <type>` | Trajectory source: `claude-code`, `kiro`, `cursor`, `openclaw`, `custom` |
| `--output, -o <path>` | Output path for exports |
| `--verbose, -v` | Show detailed output |
| `--help, -h` | Show help |
| `--version, -V` | Show version |

Examples

# Ingest a Kiro session log
skillrl ingest kiro-session.log --source kiro

# Ingest and specify the provider/model
skillrl ingest session.jsonl --provider bedrock --model smart

# Retrieve skills restricted to one domain
skillrl retrieve "optimize database queries" --domain python

# List skills filtered by domain
skillrl list --domain typescript

# Export only the skills for one domain
skillrl export markdown --output skills.md --domain react

# Evolve from failed trajectories
skillrl evolve failures.json --provider gemini

# Import skills from a teammate's bank
skillrl import teammate-bank.json

# Test provider connectivity
skillrl test --provider bedrock

Supported Trajectory Formats

| Format | Description | Auto-Detected |
|--------|-------------|---------------|
| JSONL | Claude Code conversation logs (newline-delimited JSON) | Yes |
| JSON | Structured trajectory object with task/steps/outcome | Yes |
| Kiro | Section-delimited logs with `---` separators | Yes |
| Text | Unstructured agent logs | Yes |
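The table says every format is auto-detected but does not say how. The following is a minimal sketch of one plausible heuristic, not skillrl's actual parser: the function names `detectTrajectoryFormat` and `parsesAsJsonObject`, and the order of the checks, are assumptions made for illustration.

```typescript
// Hypothetical heuristic for illustration; skillrl's real detection may differ.
type TrajectoryFormat = "jsonl" | "json" | "kiro" | "text";

// True when the text is a complete JSON object or array.
function parsesAsJsonObject(text: string): boolean {
  try {
    const value: unknown = JSON.parse(text);
    return typeof value === "object" && value !== null;
  } catch (_err) {
    return false;
  }
}

function detectTrajectoryFormat(raw: string): TrajectoryFormat {
  const lines = raw
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
  if (lines.length === 0) return "text";

  // JSONL: several lines, each one a complete JSON value on its own.
  if (lines.length > 1 && lines.every((line) => parsesAsJsonObject(line))) {
    return "jsonl";
  }

  // JSON: the whole input is a single JSON document.
  if (parsesAsJsonObject(raw)) return "json";

  // Kiro: sections separated by lines consisting only of ---.
  if (lines.some((line) => line === "---")) return "kiro";

  // Anything else is treated as unstructured text.
  return "text";
}
```

A single-line file holding one JSON object is ambiguous between JSON and JSONL; this sketch classifies it as JSON. Passing `--source` explicitly avoids relying on any heuristic.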

MCP Server Integration

skillrl includes an MCP (Model Context Protocol) server for integration with AI coding assistants.

Setup with Claude Code

Add to your Claude Code MCP configuration (~/.claude.json or project .mcp.json):

{
  "mcpServers": {
    "skillrl": {
      "command": "npx",
      "args": ["-y", "skillrl-mcp"],
      "env": {
        "GEMINI_API_KEY": "your_api_key"
      }
    }
  }
}

Using Amazon Bedrock

{
  "mcpServers": {
    "skillrl": {
      "command": "npx",
      "args": ["-y", "skillrl-mcp"],
      "env": {
        "RLM_PROVIDER": "bedrock",
        "AWS_BEARER_TOKEN_BEDROCK": "your_bedrock_api_key",
        "AWS_REGION": "us-east-1"
      }
    }
  }
}

Available MCP Tools

| Tool | Description |
|------|-------------|
| `skill_ingest` | Parse trajectory and extract skills |
| `skill_retrieve` | Find relevant skills for a task |
| `skill_evolve` | Run evolution cycle from failures |
| `skill_export` | Export skill bank to various formats |
| `skill_list` | List skills with filters |
| `skill_bank_stats` | Get skill bank statistics |
| `skill_config` | View current configuration |
| `skill_index` | Build/rebuild embedding index |

Example MCP Usage

Once configured, your agent can query skills automatically:

Use skill_retrieve to find skills for "implement WebSocket real-time updates"

The agent receives ranked skills with relevance scores, instructions, examples, and anti-patterns - all injected into its context for the current task.

Programmatic API

Factory Function (Recommended)

import { createSkillManager } from 'skillrl';

const manager = createSkillManager({
  provider: 'gemini',
  bankPath: '.skillrl/bank.json',
});

// Ingest a trajectory (`trajectory` is a Trajectory object you have already parsed)
const result = await manager.distiller.distill(trajectory);
console.log(`Extracted ${result.skills.length} skills`);

// Retrieve relevant skills
const retrieval = await manager.retriever.retrieve(
  'implement OAuth2 with PKCE flow',
  'typescript'
);
for (const { skill, relevanceScore } of retrieval.skills) {
  console.log(`${skill.name} (${(relevanceScore * 100).toFixed(0)}%)`);
}

// Evolve from failures
const evolution = await manager.evolver.evolve(failedTrajectories);
console.log(`New: ${evolution.newSkills.length}, Refined: ${evolution.refinedSkills.length}`);

Direct Class Usage

import {
  SkillDistiller,
  SkillBankManager,
  SkillRetriever,
  SkillEvolver,
  EmbeddingManager,
  getDefaultSkillConfig,
} from 'skillrl';

const config = getDefaultSkillConfig();

// Manage the skill bank
const bank = new SkillBankManager(config);
await bank.load();
console.log(`Skills: ${bank.getStats().totalSkills}`);

// Build embedding index
const embeddings = new EmbeddingManager(config);
const skills = bank.listSkills();
const { indexed, skipped } = await embeddings.indexSkills(skills);
console.log(`Indexed: ${indexed}, Skipped: ${skipped}`);

// Search by vector similarity
const results = await embeddings.search('handle authentication errors', {
  topK: 5,
  threshold: 0.3,
  domain: 'typescript',
});

// Clean up
embeddings.close();

Embedding Manager

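import { EmbeddingManager, getDefaultSkillConfig } from 'skillrl';

// Reuse the default configuration, as in Direct Class Usage
const config = getDefaultSkillConfig();
const em = new EmbeddingManager(config);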

// Check if index exists
if (await em.hasIndex()) {
  // Semantic search with domain filtering
  const results = await em.search('implement caching layer', {
    topK: 10,
    threshold: 0.4,
    domain: 'python',
  });

  for (const { skillId, score } of results) {
    console.log(`${skillId}: ${(score * 100).toFixed(1)}% match`);
  }
}

// Get index statistics
const stats = await em.getStats();
console.log(`${stats.totalEmbeddings} embeddings, ${(stats.indexSizeBytes / 1024).toFixed(1)} KB`);

em.close();

Export Skills

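import { getExporter, SkillBankManager, getDefaultSkillConfig } from 'skillrl';

// Load the bank to export (same setup as in Direct Class Usage)
const bank = new SkillBankManager(getDefaultSkillConfig());
await bank.load();

const exporter = getExporter('kiro-power');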
const result = await exporter.export(bank.getBank(), {
  format: 'kiro-power',
  outputPath: './power',
  domain: 'typescript',
  minConfidence: 0.6,
});

console.log(`Exported ${result.skillCount} skills to ${result.outputPath}`);

Model Configuration

Available Models

Gemini (Default Provider)

| Alias | Model ID | Description |
|-------|----------|-------------|
| `fast`, `default` | `gemini-3-flash-preview` | Fast and efficient (recommended) |
| `smart`, `pro` | `gemini-3-pro-preview` | Most capable |

Amazon Bedrock

| Alias | Model ID | Description |
|-------|----------|-------------|
| `fast`, `default` | `us.amazon.nova-2-lite-v1:0` | Nova 2 Lite |
| `smart` | `us.anthropic.claude-sonnet-4-5-*` | Claude 4.5 Sonnet |
| `claude-sonnet` | `us.anthropic.claude-sonnet-4-5-*` | Claude 4.5 Sonnet |
| `claude-opus` | `us.anthropic.claude-opus-4-5-*` | Claude 4.5 Opus |
| `qwen3-coder` | `qwen.qwen3-coder-30b-*` | Qwen3 Coder |
| `llama-4` | `us.meta.llama4-maverick-*` | Llama 4 |

Claude (Anthropic API)

| Alias | Model ID | Description |
|-------|----------|-------------|
| `fast`, `haiku` | `claude-haiku-4-5-20251001` | Fast and cost-effective |
| `smart`, `sonnet` | `claude-sonnet-4-5-20250929` | Balanced |
| `opus` | `claude-opus-4-6` | Most capable |

Using Model Aliases

# Fast distillation
skillrl ingest session.jsonl --model fast

# High-quality skill extraction
skillrl ingest session.jsonl --model smart

# Use Bedrock with Claude
skillrl ingest session.jsonl --provider bedrock --model claude-sonnet
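To make the alias tables concrete, here is a small sketch of how an alias could map to a model ID. It copies only the Gemini and Claude rows from the tables above (the Bedrock IDs are elided with `*` and are left out). The names `SketchProvider`, `ALIAS_TABLE`, and `lookupModelId` are hypothetical, and treating an unknown name as a literal model ID is an assumption. The package exports `resolveModelConfig` from `skillrl/models`, but its signature is not documented here, so the sketch does not call it.

```typescript
// Hypothetical lookup table built from the README's alias tables.
type SketchProvider = "gemini" | "claude";

const ALIAS_TABLE: Record<SketchProvider, Record<string, string>> = {
  gemini: {
    fast: "gemini-3-flash-preview",
    default: "gemini-3-flash-preview",
    smart: "gemini-3-pro-preview",
    pro: "gemini-3-pro-preview",
  },
  claude: {
    fast: "claude-haiku-4-5-20251001",
    haiku: "claude-haiku-4-5-20251001",
    smart: "claude-sonnet-4-5-20250929",
    sonnet: "claude-sonnet-4-5-20250929",
    opus: "claude-opus-4-6",
  },
};

// Returns the model ID for a known alias.
// Assumption: anything that is not a known alias is passed through unchanged.
function lookupModelId(provider: SketchProvider, aliasOrId: string): string {
  const table = ALIAS_TABLE[provider];
  return Object.prototype.hasOwnProperty.call(table, aliasOrId)
    ? table[aliasOrId]
    : aliasOrId;
}

console.log(lookupModelId("gemini", "smart"));
console.log(lookupModelId("claude", "haiku"));
```

Note that the same alias resolves differently per provider: `smart` is a Gemini Pro model under `gemini` and a Claude Sonnet model under `claude`.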

Environment Variables

# Default provider
export RLM_PROVIDER=gemini

# Gemini
export GEMINI_API_KEY=your_api_key

# Bedrock
export AWS_BEARER_TOKEN_BEDROCK=your_bedrock_api_key
export AWS_REGION=us-east-1

# Claude
export ANTHROPIC_API_KEY=your_api_key

How It Works

skillrl implements the three-phase SkillRL framework from the research paper:

                    Agent Trajectories
                   (Claude Code, Kiro, Cursor, etc.)
                           |
            +--------------+--------------+
            |                             |
     Success Trajectories          Failed Trajectories
            |                             |
            v                             v
   +------------------+        +------------------+
   |  Phase 1:        |        |  Phase 1:        |
   |  DISTILLATION    |        |  DISTILLATION    |
   |  Extract reusable|        |  Synthesize      |
   |  patterns        |        |  corrective      |
   |  (Section 3.1)   |        |  skills          |
   +--------+---------+        +--------+---------+
            |                             |
            +----------+    +-------------+
                       |    |
                       v    v
              +------------------+
              |  SKILL BANK      |
              |  Persistent JSON |
              |  Sg (general)    |
              |  Sk (domain)     |
              +--------+---------+
                       |
          +------------+------------+
          |                         |
          v                         v
 +------------------+     +------------------+
 |  Phase 2:        |     |  Phase 3:        |
 |  RETRIEVAL       |     |  EVOLUTION       |
 |  Semantic search |     |  Refine skills   |
 |  + LLM ranking   |     |  from failures   |
 |  (Section 3.2)   |     |  (Section 3.3)   |
 +--------+---------+     +--------+---------+
          |                         |
          v                         |
 +------------------+               |
 |  Agent Context   | <-------------+
 |  Skills injected |
 |  into the next   |
 |  coding session  |
 +------------------+

Phase 1: Skill Distillation (Section 3.1)

When you ingest a trajectory, the LLM analyzes it to extract reusable patterns:

Successful trajectories produce skills with step-by-step instructions, examples, and best practices
Failed trajectories produce corrective skills documenting what went wrong and how to avoid it
Auto-detection identifies the domain (typescript, react, python, etc.) from the trajectory content
Deduplication checks against existing skills to prevent redundancy
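The deduplication step is not specified in detail. As a rough illustration only, the sketch below uses a token-overlap (Jaccard) check on skill names, which is a stand-in technique and not necessarily what skillrl does. The function names and the 0.8 threshold are assumptions.

```typescript
// Illustrative duplicate check using Jaccard similarity over word tokens.
// This is a stand-in technique; skillrl's actual deduplication may differ.
function tokenSet(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean));
}

// |A ∩ B| / |A ∪ B|, defined as 1 when both sets are empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Flags a candidate whose tokens overlap heavily with any existing skill name.
function isLikelyDuplicate(
  candidate: string,
  existing: string[],
  threshold = 0.8, // hypothetical cut-off
): boolean {
  const candidateTokens = tokenSet(candidate);
  return existing.some(
    (name) => jaccard(candidateTokens, tokenSet(name)) >= threshold,
  );
}

console.log(isLikelyDuplicate("Hypothesis-Driven Debugging", ["hypothesis driven debugging"]));
console.log(isLikelyDuplicate("React State Management", ["SQL Query Optimization"]));
```

A token-overlap check only catches near-identical wording. Detecting skills that overlap in meaning but not in vocabulary would need embeddings or an LLM comparison.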

Phase 2: Skill Retrieval (Section 3.2)

When an agent needs skills for a task, retrieval uses a two-tier approach:

Fast path (embedding-based): Pre-built vector index using Gemini embeddings + sqlite-vec KNN search. One API call for the query embedding, then pure vector similarity. Domain partition keys enable pre-filtered search.
Fallback path (keyword + LLM): Keyword matching on skill name/description/tags, then LLM-based reranking for ambiguous cases. Used when no embedding index exists.
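The two paths can be sketched in a few lines. This is a simplified in-memory illustration, not the package's implementation: it replaces sqlite-vec with a linear scan, replaces LLM reranking with a plain keyword score, and takes the query embedding as an argument instead of calling an embedding API. The types `IndexedSkill` and `Match` and the function `retrieveSketch` are hypothetical.

```typescript
// Illustrative types; not skillrl's internal data model.
interface IndexedSkill {
  skillId: string;
  domain: string;
  tags: string[];
  embedding?: number[];
}

interface Match {
  skillId: string;
  score: number;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}

function retrieveSketch(
  queryText: string,
  queryEmbedding: number[] | null, // null simulates "no embedding index"
  skills: IndexedSkill[],
  opts: { topK: number; threshold: number; domain?: string },
): Match[] {
  // Pre-filter by domain, mirroring the partition-key idea.
  const pool = skills.filter((s) => !opts.domain || s.domain === opts.domain);
  const words = queryText.toLowerCase().split(/\W+/).filter(Boolean);

  const scored: Match[] = pool.map((s) => {
    if (queryEmbedding !== null) {
      // Fast path: vector similarity only.
      const score = s.embedding ? cosineSimilarity(queryEmbedding, s.embedding) : 0;
      return { skillId: s.skillId, score };
    }
    // Fallback path: fraction of query words that match a tag.
    const hits = words.filter((w) => s.tags.includes(w)).length;
    return { skillId: s.skillId, score: words.length ? hits / words.length : 0 };
  });

  return scored
    .filter((m) => m.score >= opts.threshold)
    .sort((a, b) => b.score - a.score)
    .slice(0, opts.topK);
}
```

The `topK`, `threshold`, and `domain` options mirror the ones passed to `embeddings.search()` in the Programmatic API section. The scoring itself is an assumption.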

Phase 3: Skill Evolution (Section 3.3)

After accumulating failures, the evolution cycle:

Analyzes patterns across failed trajectories to identify systemic issues
Creates new skills to address gaps in the current skill set
Refines existing skills with updated instructions, confidence scores, or anti-patterns
Deprecates ineffective skills that consistently lead to poor outcomes
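How confidence scores are updated and when a skill counts as ineffective is not spelled out above. The sketch below shows one simple possibility, a Laplace-smoothed success rate with a minimum-usage guard. The formula, the field names, and both thresholds are assumptions chosen for illustration, not the rule skillrl or the paper uses.

```typescript
// Hypothetical usage counters for a skill.
interface SkillOutcomeCounts {
  successes: number;
  failures: number;
}

// Laplace-smoothed success rate: starts at 0.5 with no data and
// moves toward the observed rate as evidence accumulates.
function confidenceOf(counts: SkillOutcomeCounts): number {
  return (counts.successes + 1) / (counts.successes + counts.failures + 2);
}

// Deprecate only when there is enough evidence AND confidence is low.
function shouldDeprecate(
  counts: SkillOutcomeCounts,
  minUses = 5, // hypothetical: do not judge a skill on too few uses
  minConfidence = 0.3, // hypothetical cut-off
): boolean {
  const uses = counts.successes + counts.failures;
  return uses >= minUses && confidenceOf(counts) < minConfidence;
}

console.log(confidenceOf({ successes: 0, failures: 0 }));
console.log(shouldDeprecate({ successes: 1, failures: 9 }));
console.log(shouldDeprecate({ successes: 0, failures: 2 }));
```

The minimum-usage guard keeps a skill from being deprecated after only one or two unlucky sessions.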

Skill Organization

Skills are organized using the paper's dual-pool architecture:

Sg (General Skills) - Broadly applicable patterns (e.g., "Incremental Verification Loop", "Hypothesis-Driven Debugging")
Sk (Task-Specific Skills) - Domain-bound techniques (e.g., "React State Management", "SQL Query Optimization")
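One way to picture the two pools is a general list plus a per-domain map, where a query for a domain sees the general skills and that domain's skills. The sketch below is illustrative only: `PooledSkill`, `DualPoolBank`, `buildBank`, and `candidatesFor` are hypothetical names, and the real `bank.json` layout may differ.

```typescript
// Illustrative model of the dual-pool layout.
interface PooledSkill {
  name: string;
  domain: string | null; // null marks a general (Sg) skill
}

interface DualPoolBank {
  general: PooledSkill[]; // Sg
  byDomain: Map<string, PooledSkill[]>; // Sk, keyed by domain
}

function buildBank(skills: PooledSkill[]): DualPoolBank {
  const bank: DualPoolBank = { general: [], byDomain: new Map() };
  for (const skill of skills) {
    if (skill.domain === null) {
      bank.general.push(skill);
    } else {
      const pool = bank.byDomain.get(skill.domain) ?? [];
      pool.push(skill);
      bank.byDomain.set(skill.domain, pool);
    }
  }
  return bank;
}

// Candidates for a task: all general skills plus the requested domain's skills.
function candidatesFor(bank: DualPoolBank, domain?: string): PooledSkill[] {
  const specific = domain ? bank.byDomain.get(domain) ?? [] : [];
  return [...bank.general, ...specific];
}

const demoBank = buildBank([
  { name: "Incremental Verification Loop", domain: null },
  { name: "Hypothesis-Driven Debugging", domain: null },
  { name: "React State Management", domain: "react" },
  { name: "SQL Query Optimization", domain: "sql" },
]);
console.log(candidatesFor(demoBank, "react").map((s) => s.name));
```

Keeping general skills in their own pool means they stay available for every task, while domain skills never leak into unrelated domains.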

Embedding Index

The embedding index uses SQLite + sqlite-vec for scalable vector search:

| Feature | Description |
|---------|-------------|
| Model | Gemini `gemini-embedding-001` (256 dimensions) |
| Storage | SQLite DB with WAL mode for concurrent access |
| Search | `vec0` virtual table with cosine distance metric |
| Filtering | Domain partition key for pre-filtered KNN |
| Migration | Auto-migrates from legacy JSON format |

Export Formats

Kiro Power (`kiro-power`)

Full Kiro IDE integration bundle:

skillrl export kiro-power --output ./power

Generates:

power/
  POWER.md           # Activation manifest
  mcp.json           # MCP server configuration
  steering/          # Domain-grouped skill files
  hooks/             # Agent lifecycle hooks

SKILL.md (`skill-md`)

Claude Code compatible format with YAML frontmatter:

skillrl export skill-md --output SKILL.md

.cursorrules (`cursorrules`)

Native Cursor IDE rules format:

skillrl export cursorrules --output .cursorrules

Markdown (`markdown`)

Human-readable documentation with table of contents:

skillrl export markdown --output skills-reference.md

JSON (`json`)

Raw skill bank for programmatic use:

skillrl export json --output skills-backup.json

Configuration

Credentials Storage

Your credentials are checked in this order:

Environment variables (GEMINI_API_KEY, AWS_BEARER_TOKEN_BEDROCK, ANTHROPIC_API_KEY)
.env file in current directory
.env.local file in current directory
~/.skillrl/.env
~/.config/skillrl/.env
~/.skillrl/config.json
~/.config/skillrl/config.json
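The lookup order amounts to "first source that yields a non-empty value wins". The sketch below illustrates that rule with injected sources, so it touches neither the real environment nor the filesystem. `KeySource`, `resolveApiKey`, and `readDotEnvValue` are hypothetical helpers, not part of the skillrl API (the package exposes `getApiKey` from `skillrl/config`, whose signature is not documented here).

```typescript
// A source is a label plus a function that may or may not produce a key.
type KeySource = { label: string; read: () => string | undefined };

// Walk the sources in priority order and return the first non-empty value.
function resolveApiKey(
  sources: KeySource[],
): { label: string; value: string } | undefined {
  for (const source of sources) {
    const value = source.read();
    if (value !== undefined && value.trim() !== "") {
      return { label: source.label, value };
    }
  }
  return undefined;
}

// Minimal KEY=value parser for .env-style text (no quoting or escapes).
function readDotEnvValue(contents: string, key: string): string | undefined {
  for (const line of contents.split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed === "" || trimmed.startsWith("#")) continue;
    const eq = trimmed.indexOf("=");
    if (eq > 0 && trimmed.slice(0, eq).trim() === key) {
      return trimmed.slice(eq + 1).trim();
    }
  }
  return undefined;
}

// Example: the environment is empty, so the project .env wins over ~/.skillrl/.env.
const resolved = resolveApiKey([
  { label: "environment", read: () => undefined },
  { label: ".env", read: () => readDotEnvValue("# local\nGEMINI_API_KEY=abc123\n", "GEMINI_API_KEY") },
  { label: "~/.skillrl/.env", read: () => "from-home-dir" },
]);
console.log(resolved);
```

Because environment variables come first, an exported `GEMINI_API_KEY` overrides any value stored in a `.env` or `config.json` file.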

Skill Bank Location

Default: .skillrl/bank.json in the current directory.

Override with --bank-path:

skillrl list --bank-path ~/shared-skills/bank.json

File Structure

.skillrl/
  bank.json              # Skill definitions and metadata
  embeddings.db          # SQLite vector index (auto-created)
  embeddings.db-wal      # WAL journal (auto-managed)
  embeddings.db-shm      # Shared memory (auto-managed)
  embeddings.json.migrated   # Legacy index backup (if migrated)

Troubleshooting

"API key not configured"

# Check current config
skillrl config

# Set your Gemini key
skillrl config YOUR_API_KEY

"No skills found" on retrieve

Make sure you've:

Ingested at least one trajectory (skillrl ingest)
Built the embedding index (skillrl index)

skillrl stats    # Check if skills exist
skillrl index    # Rebuild the index

"Amazon Bedrock provider requires @aws-sdk/client-bedrock-runtime"

npm install @aws-sdk/client-bedrock-runtime

Embedding index is large

The SQLite database grows with the number of skills. For a bank of 500 skills with 256-dimension embeddings, expect ~2-5 MB. The WAL and SHM files are temporary and auto-cleaned.
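As a rough sanity check on that estimate, the raw vector payload can be computed directly. The sketch assumes 4 bytes per dimension (32-bit floats), which is an assumption about the storage format. `rawVectorBytes` is a hypothetical helper.

```typescript
// Raw embedding payload, ignoring SQLite page, index, and metadata overhead.
// Assumption: each dimension is stored as a 32-bit float (4 bytes).
function rawVectorBytes(
  skillCount: number,
  dimensions: number,
  bytesPerFloat = 4,
): number {
  return skillCount * dimensions * bytesPerFloat;
}

const payloadBytes = rawVectorBytes(500, 256);
console.log(`${(payloadBytes / 1024).toFixed(1)} KB`); // prints "500.0 KB"
```

Under that assumption the vectors for 500 skills account for about 0.5 MB, so most of the quoted ~2-5 MB would come from database overhead and stored metadata, not from the embeddings.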

Slow first retrieval

The first call to search() opens the SQLite database and loads the sqlite-vec extension. Subsequent calls reuse the connection. Typical overhead: ~5-20ms.

Migration from JSON embeddings

If you have an existing embeddings.json file, it will be automatically migrated to SQLite on the next load(), search(), or index call. The original file is preserved as embeddings.json.migrated.

TypeScript Types

All types are exported for TypeScript users:

import type {
  // Core types
  Skill,
  SkillExample,
  SkillMetadata,
  SkillBank,
  SkillBankMetadata,
  SkillConfig,

  // Operation results
  DistillationResult,
  RetrievalResult,
  EvolutionResult,
  ExportResult,
  ScoredSkill,

  // Trajectories
  Trajectory,
  TrajectoryStep,
  ToolCall,

  // Configuration
  ExportOptions,

  // Embeddings
  SkillEmbedding,
  EmbeddingIndex,

  // Providers
  ProviderName,
  ResolvedModelConfig,
} from 'skillrl';

Sub-Path Exports

// Model configuration utilities
import { resolveModelConfig, getAvailableModelsForProvider } from 'skillrl/models';

// Configuration management
import { getApiKey, detectProvider, initializeProvider } from 'skillrl/config';

// Provider types
import type { LLMProvider, GenerateOptions, GenerateResponse } from 'skillrl/providers';

Security

API keys are stored locally and never transmitted except to configured LLM providers
Path traversal protection prevents skill bank reads/writes outside the sandbox directory
Input validation via Zod schemas on all MCP tool arguments with length limits
Output sanitization escapes markdown/YAML injection in exported skill content
Read-only analysis - trajectory ingestion never modifies your source code

License

MIT

Credits

Based on research:

SkillRL: Skill-Based Transferable Reinforcement Learning for LLM Agents (arXiv:2602.08234v1)

Part of the RLM project family.

Contributing

Contributions welcome! Please open an issue or submit a pull request.

Support

Issues: GitHub Issues
Discussions: GitHub Discussions
