@aman_asmuei/aeval

v0.1.0

Published

2 months ago

The portable evaluation layer for AI companions

aman_asmuei

ai companion evaluation metrics relationship trust

aeval

The portable evaluation layer for AI companions. Track relationship quality over time: trust trajectory, session count, key milestones, and satisfaction signals. Use that data to improve the relationship.

The Ecosystem

aman
├── acore   →  identity     →  who your AI IS
├── amem    →  memory       →  what your AI KNOWS
├── akit    →  tools        →  what your AI CAN DO
├── aflow   →  workflows    →  HOW your AI works
├── arules  →  guardrails   →  what your AI WON'T do
└── aeval   →  evaluation   →  how GOOD your AI is

| Layer | Package | What it does |
|:------|:--------|:-------------|
| Identity | acore | Personality, values, relationship memory |
| Memory | amem | Automated knowledge storage (MCP) |
| Tools | akit | 15 portable AI tools (MCP + manual fallback) |
| Workflows | aflow | Reusable AI workflows (code review, bug fix, etc.) |
| Guardrails | arules | Safety boundaries and permissions |
| Evaluation | aeval | Relationship tracking and session logging |
| Unified | aman | One command to set up everything |

Each works independently. aman is the front door.

Install

npm install -g @aman_asmuei/aeval

Quick start

aeval init              # Create ~/.aeval/eval.md
aeval log               # Log a session (interactive)
aeval                   # Show current metrics
aeval report            # Full relationship report
aeval milestone "text"  # Record a milestone
aeval doctor            # Health check

How it works

aeval maintains a single markdown file (~/.aeval/eval.md) that tracks your AI relationship over time.

eval.md format

# AI Relationship Metrics

## Overview
- Sessions: 0
- First session: [not started]
- Trust level: 3/5
- Trajectory: building

## Timeline
<!-- Entries added automatically, newest first -->

## Milestones
- [none yet — milestones appear as your relationship grows]

## Patterns
- [observations about what works and what doesn't]
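
Because eval.md is plain markdown, other tools can read it without going through the CLI. The following is a minimal sketch of reading the Overview bullets shown above. It is not aeval's own parser; the `Overview` interface, the `parseOverview` function, and the fallback defaults (taken from the template) are hypothetical names and assumptions made for illustration.

```typescript
// Hypothetical shape for the "## Overview" section; field names are assumptions.
interface Overview {
  sessions: number;
  firstSession: string;
  trustLevel: number; // the N in "N/5"
  trajectory: string;
}

// Collect the "- Key: value" bullets that appear under "## Overview".
function parseOverview(markdown: string): Overview {
  const fields = new Map<string, string>();
  let inOverview = false;
  for (const rawLine of markdown.split("\n")) {
    const line = rawLine.trim();
    if (line.startsWith("## ")) {
      // Any second-level heading opens or closes the Overview section.
      inOverview = line === "## Overview";
      continue;
    }
    const match = inOverview ? /^- ([^:]+):\s*(.*)$/.exec(line) : null;
    if (match) fields.set(match[1].trim(), match[2].trim());
  }
  // ASSUMPTION: missing fields fall back to the values in the template.
  return {
    sessions: Number(fields.get("Sessions") ?? "0"),
    firstSession: fields.get("First session") ?? "[not started]",
    trustLevel: Number((fields.get("Trust level") ?? "3/5").split("/")[0]),
    trajectory: fields.get("Trajectory") ?? "building",
  };
}
```

Bullets in other sections (Timeline, Milestones, Patterns) are ignored, so a key that happens to reappear further down the file does not overwrite the Overview values.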

Logging sessions

aeval log walks you through 4 quick questions:

How was this session? — great / good / okay / frustrating
What went well? — optional text
What could improve? — optional text
Trust change? — increased / same / decreased

Each log updates your session count, adds a timeline entry, and recalculates trust and trajectory.
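
The update step can be pictured as a small pure function. The sketch below is illustrative only: the README does not specify how trust is recalculated, so the one-step-per-session rule clamped to 1..5 is an assumption, and `SessionLog`, `EvalState`, and `applyLog` are hypothetical names. Trajectory is left out here; it follows the thresholds in the Trajectory section.

```typescript
type Rating = "great" | "good" | "okay" | "frustrating";
type TrustChange = "increased" | "same" | "decreased";

// One answered questionnaire (hypothetical shape mirroring the 4 questions).
interface SessionLog {
  date: string; // ISO date, e.g. "2026-03-22"
  rating: Rating;
  wentWell?: string; // optional text
  couldImprove?: string; // optional text
  trustChange: TrustChange;
}

// In-memory view of the parts of eval.md that a log touches (hypothetical).
interface EvalState {
  sessions: number;
  trustLevel: number; // 1..5
  timeline: SessionLog[]; // newest first, as in the Timeline section
}

// ASSUMPTION: trust moves by one step per session and is clamped to 1..5.
function applyLog(state: EvalState, log: SessionLog): EvalState {
  const delta =
    log.trustChange === "increased" ? 1 : log.trustChange === "decreased" ? -1 : 0;
  return {
    sessions: state.sessions + 1,
    trustLevel: Math.min(5, Math.max(1, state.trustLevel + delta)),
    timeline: [log, ...state.timeline], // prepend so the newest entry is first
  };
}
```

Returning a new state rather than mutating the old one keeps the step easy to test and makes it simple to write the file only once all fields are consistent.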

Relationship report

aeval report shows a summary of your AI relationship:

◆ aeval — relationship report

  Sessions:    12
  Since:       2026-03-15 (7 days)
  Trust:       4/5
  Trajectory:  building

  Recent sessions:
    2026-03-22  ★★★★★  great — productive debugging, AI caught edge case
    2026-03-21  ★★★★☆  good — solid feature work
    2026-03-20  ★★★☆☆  okay — some misunderstandings on requirements

  Milestones:
    2026-03-22  First time AI proactively suggested a better approach
    2026-03-18  Completed first full feature together

  Patterns:
    - AI works best when given clear requirements upfront
    - Debugging sessions build trust fastest

Rating scale

| Rating | Stars |
|-------------|---------|
| great | ★★★★★ |
| good | ★★★★☆ |
| okay | ★★★☆☆ |
| frustrating | ★★☆☆☆ |
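
The scale above is a fixed lookup from rating to a number of filled stars out of five. A minimal sketch of rendering it follows; `RatingLabel`, `STAR_COUNT`, and `renderStars` are hypothetical names and not necessarily what aeval uses internally.

```typescript
type RatingLabel = "great" | "good" | "okay" | "frustrating";

// Filled-star counts taken from the rating scale table.
const STAR_COUNT: Record<RatingLabel, number> = {
  great: 5,
  good: 4,
  okay: 3,
  frustrating: 2,
};

// Render a rating as five glyphs: filled stars first, then empty ones.
function renderStars(rating: RatingLabel): string {
  const filled = STAR_COUNT[rating];
  return "★".repeat(filled) + "☆".repeat(5 - filled);
}
```

Note that the lowest rating still maps to two stars, so no session renders below ★★☆☆☆.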

Trajectory

Trajectory is calculated from your recent session ratings:

building — average recent rating >= 3.5
stable — average recent rating >= 2.5
declining — average recent rating < 2.5
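
The thresholds above can be sketched as a short function. Two details are assumptions, because the README does not state them: each rating is scored by its star count from the rating scale (great = 5 down to frustrating = 2), and "recent" means the last 5 sessions. The names `SessionRating`, `RATING_SCORE`, and `computeTrajectory` are hypothetical.

```typescript
type SessionRating = "great" | "good" | "okay" | "frustrating";
type Trajectory = "building" | "stable" | "declining";

// ASSUMPTION: numeric score equals the star count in the rating scale.
const RATING_SCORE: Record<SessionRating, number> = {
  great: 5,
  good: 4,
  okay: 3,
  frustrating: 2,
};

// ASSUMPTION: "recent" is the newest `windowSize` sessions (default 5).
function computeTrajectory(
  ratingsNewestFirst: SessionRating[],
  windowSize = 5,
): Trajectory {
  const recent = ratingsNewestFirst.slice(0, windowSize);
  // With no sessions yet, fall back to the template default.
  if (recent.length === 0) return "building";
  const total = recent.reduce((sum, r) => sum + RATING_SCORE[r], 0);
  const average = total / recent.length;
  if (average >= 3.5) return "building";
  if (average >= 2.5) return "stable";
  return "declining";
}
```

Under these assumptions a run of "okay" sessions stays stable (average 3.0), and the trajectory only becomes declining once frustrating sessions outnumber everything else in the window.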

Philosophy

Single file — one markdown file, no database, no cloud
Portable — works anywhere, version-controllable
Honest — track what actually happens, not what you wish happened
Lightweight — 4 questions per session, done in 30 seconds

License

MIT