@more-ink/irt-core

v1.2.2

Published

8 days ago

Streaming IRT-like adaptive practice engine in TypeScript with online updates, heuristic SE, and CAT-ish selection.

paraself

IRT adaptive TypeScript streaming CAT

Streaming IRT-like Adaptive Practice Engine (TypeScript)

A stateless TypeScript library for adaptive practice. It tracks per-user ability (θ) and per-item parameters (discrimination a, difficulty b), supports partial credit, and updates all three online with configurable learning rates. It estimates a heuristic standard error (SE) from the information accumulated at the post-update θ, and it selects the next item using information, difficulty proximity, recency, and exposure controls.

Monorepo Layout

packages/irt-core: the TypeScript library (core math + async engine).
Root scripts are workspace-aware (npm); SPEC.md remains the authoritative behavioral doc.

What’s Inside

2PL-style logistic model with scores in [0,1] (soft labels).
Streaming gradient updates for θ, a, b, with clamping on discrimination.
Fisher information + heuristic SE (not a formal CI).
CAT-ish selector: difficulty penalty, recency gap, exposure cap, top-K randomization, exploration probability.
Two layers:
- Core (packages/irt-core/src/core.ts): pure sync math/types.
- Engine (packages/irt-core/src/engine.ts): async orchestration with host-provided repos and optional hooks.
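
To make the model concrete, here is a minimal, self-contained sketch of a 2PL-style streaming update with soft labels. It is not the library's source: the names `probCorrect`, `fisherInfo`, `heuristicSe`, and `sgdStep`, the exact gradient form, and the `1 / sqrt(infoSum)` SE are assumptions for illustration, and SPEC.md remains the authority on actual behavior.

```typescript
interface IrtParams {
  theta: number; // user ability
  a: number;     // item discrimination
  b: number;     // item difficulty
}

interface UpdateRates {
  thetaLR: number;
  aLR: number;
  bLR: number;
  minA: number;
  maxA: number;
}

// 2PL logistic curve: P = 1 / (1 + exp(-a * (theta - b))).
function probCorrect(theta: number, a: number, b: number): number {
  return 1 / (1 + Math.exp(-a * (theta - b)));
}

// Fisher information of a 2PL item at theta: a^2 * P * (1 - P).
function fisherInfo(theta: number, a: number, b: number): number {
  const p = probCorrect(theta, a, b);
  return a * a * p * (1 - p);
}

// Heuristic SE from accumulated information (assumed form: 1 / sqrt(infoSum)).
function heuristicSe(infoSum: number): number {
  return infoSum > 0 ? 1 / Math.sqrt(infoSum) : Number.POSITIVE_INFINITY;
}

// One gradient step on the cross-entropy log-likelihood with a soft label in [0, 1].
function sgdStep(p: IrtParams, score: number, r: UpdateRates): IrtParams {
  const err = score - probCorrect(p.theta, p.a, p.b); // dL/dz with z = a * (theta - b)
  const theta = p.theta + r.thetaLR * p.a * err;        // dz/dtheta = a
  const b = p.b - r.bLR * p.a * err;                    // dz/db = -a
  const rawA = p.a + r.aLR * (p.theta - p.b) * err;     // dz/da = theta - b
  const a = Math.min(r.maxA, Math.max(r.minA, rawA));   // clamp discrimination
  return { theta, a, b };
}

const rates: UpdateRates = { thetaLR: 0.05, aLR: 0.001, bLR: 0.01, minA: 0.2, maxA: 3.0 };
const after = sgdStep({ theta: 0, a: 1, b: 0 }, 0.8, rates);
// Information is evaluated at the post-update theta, as the description above states.
const infoSum = fisherInfo(after.theta, after.a, after.b);
console.log(after, heuristicSe(infoSum));
```

A score above the predicted probability pushes θ up and b down; a score below it does the opposite. Clamping only affects a.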

Install

npm install @more-ink/irt-core

Type-Safe Skill Identifiers (NEW in v0.1.1)

All interfaces and functions now support generic type parameters for skill identifiers, enabling compile-time type safety:

// Define your skill taxonomy
type MySkills = 'grammar' | 'vocabulary' | 'listening' | 'speaking';

// TypeScript enforces valid skill IDs
const user: UserSkillState<MySkills> = { 
  userId: 'u1', 
  skillId: 'grammar', // ✅ Type-safe
  theta: 0, 
  infoSum: 0 
};

// Compile error on invalid skills
// skillId: 'invalid' // ❌ Error: not assignable to MySkills

Backward compatible: All existing code works without changes (generic defaults to string).

See GENERICS.md for a migration guide and examples/typed-skills.ts for complete examples.

Core Usage (pure)

import {
  updateUserAndItemOnline,
  chooseNextItem,
  seFromInfo,
  // Type-only imports (inline `type` modifier, TypeScript 4.5+); erased at compile time.
  type ItemWithMetadata,
  type UserSkillState,
} from '@more-ink/irt-core';

// Basic usage with default string skillId
const user: UserSkillState = { userId: 'u1', skillId: 'grammar', theta: 0, infoSum: 0 };
const item: ItemWithMetadata = { id: 'i1', skillId: 'grammar', a: 1, b: 0 };

// Type-safe usage with custom skill type
type MySkills = 'grammar' | 'vocabulary' | 'listening' | 'speaking';
const typedUser: UserSkillState<MySkills> = { userId: 'u2', skillId: 'vocabulary', theta: 0, infoSum: 0 };
const typedItem: ItemWithMetadata<MySkills> = { id: 'i2', skillId: 'vocabulary', a: 1, b: 0 };

const { user: u2, item: i2 } = updateUserAndItemOnline(user, item, 0.8, {
  thetaLR: 0.05,
  aLR: 0.001,
  bLR: 0.01,
});

// With typed skills, TypeScript ensures skillId safety
const { user: u3, item: i3 } = updateUserAndItemOnline<MySkills>(typedUser, typedItem, 0.9);

const se = seFromInfo(u2.infoSum);
const next = chooseNextItem(u2, [i2], {
  minGapMs: 10 * 60_000,
  difficultyPenaltyWidth: 1,
  maxTimesSeen: 5,
  topKRandomize: 3,
  explorationChance: 0.1,
});

Engine Usage (async orchestration)

import { createStreamingIrtEngine } from '@more-ink/irt-core';

// Create engine with default string skillId
const engine = createStreamingIrtEngine({
  userRepo, // implements getUserSkillState/saveUserSkillState
  itemRepo, // implements getItem/saveItem/listCandidateItems
  config: {
    updateDefaults: { thetaLR: 0.05, aLR: 0.001, bLR: 0.01, minA: 0.2, maxA: 3.0 },
    selectionDefaults: { minGapMs: 5 * 60_000, topKRandomize: 5, explorationChance: 0.1 },
  },
});

const result = await engine.recordResponseAndSelectNext({
  userId: 'u1',
  skillId: 'grammar',
  itemId: 'i1',
  score: 0.8,
});
// => { theta, se, updatedUser, updatedItem, nextItem }

// Multi-skill items: one interaction that updates several skills at once
const multi = await engine.recordMultiSkillResponses({
  userId: 'u1',
  itemId: 'passage-5',
  skillScores: [
    { skillId: 'grammar', score: 0.85 },
    { skillId: 'writing', score: 0.55 },
  ],
});
// => [{ skillId: 'grammar', theta, se, updatedUser, updatedItem }, ...]

// Type-safe engine with custom skill identifiers
type AppSkills = 'math' | 'reading' | 'science';
const typedEngine = createStreamingIrtEngine<AppSkills>({
  userRepo, // UserSkillStateRepo<AppSkills>
  itemRepo, // ItemRepo<AppSkills>
});

const typedResult = await typedEngine.recordResponseAndSelectNext({
  userId: 'u1',
  skillId: 'math', // TypeScript enforces this must be 'math' | 'reading' | 'science'
  itemId: 'i1',
  score: 0.9,
});

// Retrieve user state with computed standard error
const userState = await engine.getUserState({ userId: 'u1', skillId: 'grammar' });
if (userState) {
  console.log(`User theta: ${userState.user.theta}, SE: ${userState.se}`);
}

// Retrieve item state
const itemState = await engine.getItemState({ itemId: 'i1', skillId: 'grammar' });
if (itemState) {
  console.log(`Item difficulty: ${itemState.b}, times seen: ${itemState.timesSeen}`);
}

// Retrieve all skill states for a user (batch operation)
const allUserSkills = await engine.getUserStates({ userId: 'u1' });
console.log(`User has ${allUserSkills.length} skills`);
allUserSkills.forEach(({ user, se }) => {
  console.log(`${user.skillId}: theta=${user.theta}, SE=${se}`);
});

// Retrieve all instances of an item across skills (batch operation)
const allItemInstances = await engine.getItemStates({ itemId: 'i1' });
console.log(`Item exists in ${allItemInstances.length} skills`);
allItemInstances.forEach((item) => {
  console.log(`${item.skillId}: a=${item.a}, b=${item.b}`);
});

API Methods

The engine provides these methods:

Single Entity Operations:

recordResponseAndSelectNext() - Ingest response, update user/item, and select next item
recordResponse() - Ingest response and update user/item (no selection)
recordMultiSkillResponses() - Ingest multi-skill scores for the same item and return per-skill updates
selectNextItem() - Select next item without recording a response
getUserState() - Retrieve a single user skill state with computed SE (returns null if it doesn't exist)
getItemState() - Retrieve a single item's parameters for a skill (returns null if it doesn't exist)

Batch Operations:

getUserStates() - Retrieve all skill states for a user with computed SEs (returns empty array if user has no skills)
getItemStates() - Retrieve all instances of an item across skills (returns empty array if item doesn't exist)

Key Options

Updates: thetaLR, aLR, bLR, minA, maxA (defaults ~0.05 / 0.001 / 0.01 / 0.2 / 3.0).
Selection: minGapMs, difficultyPenaltyWidth, excludeItemIds, maxTimesSeen, topKRandomize, explorationChance, now.
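
One way the selection options could combine is sketched below. This is an illustration only, not the behavior of chooseNextItem: the function `pickNextSketch`, the `Candidate` shape, the Gaussian-style difficulty penalty, `lastSeenAt` as epoch milliseconds, and the injected `rng` parameter are all assumptions made so the example is self-contained and deterministic under test.

```typescript
interface Candidate {
  id: string;
  a: number;
  b: number;
  lastSeenAt?: number; // assumed: epoch milliseconds
  timesSeen?: number;
}

interface SelectOpts {
  minGapMs?: number;
  difficultyPenaltyWidth?: number;
  excludeItemIds?: string[];
  maxTimesSeen?: number;
  topKRandomize?: number;
  explorationChance?: number;
  now?: number;
}

function pickNextSketch(
  theta: number,
  items: Candidate[],
  opts: SelectOpts = {},
  rng: () => number = Math.random,
): Candidate | null {
  const now = opts.now ?? Date.now();
  const excluded = new Set(opts.excludeItemIds ?? []);

  // Hard filters: explicit exclusions, exposure cap, recency gap.
  const eligible = items.filter((it) => {
    if (excluded.has(it.id)) return false;
    if (opts.maxTimesSeen !== undefined && (it.timesSeen ?? 0) >= opts.maxTimesSeen) return false;
    if (opts.minGapMs !== undefined && it.lastSeenAt !== undefined && now - it.lastSeenAt < opts.minGapMs) return false;
    return true;
  });
  if (eligible.length === 0) return null;

  const randomOf = (xs: Candidate[]): Candidate =>
    xs[Math.min(xs.length - 1, Math.floor(rng() * xs.length))];

  // Exploration: occasionally ignore the ranking entirely.
  if (rng() < (opts.explorationChance ?? 0)) return randomOf(eligible);

  // Score = Fisher information, damped by distance between b and theta (assumed penalty form).
  const width = opts.difficultyPenaltyWidth ?? 1;
  const score = (it: Candidate): number => {
    const p = 1 / (1 + Math.exp(-it.a * (theta - it.b)));
    const gap = (it.b - theta) / width;
    return it.a * it.a * p * (1 - p) * Math.exp(-gap * gap);
  };

  // Top-K randomization: pick uniformly among the K best-scoring items.
  const ranked = [...eligible].sort((x, y) => score(y) - score(x));
  const k = Math.max(1, Math.min(opts.topKRandomize ?? 1, ranked.length));
  return randomOf(ranked.slice(0, k));
}
```

Passing a fixed `now` and a seeded `rng` keeps selection reproducible in tests; in production both would normally be left at their defaults.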

Learning-rate guidance

thetaLR typically lives in [0.01, 0.10] – smaller for noisy scores, larger for cold start.
bLR works well in [0.001, 0.05] – keep it below thetaLR to avoid thrashing difficulty.
aLR is usually tiny, [1e-4, 5e-3], because discrimination should drift slowly.
Clamp a via minA/maxA (defaults 0.2/3.0) to prevent pathological slopes.
Document any overrides near the code that configures createStreamingIrtEngine so operators share context.
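
The guidance above can be turned into a small startup check. The helper below is a hypothetical sketch, not part of the library: `checkLearningRates`, `RateConfig`, and `suggestedRanges` are names invented here, and the ranges are copied from the list above.

```typescript
interface RateConfig {
  thetaLR: number;
  aLR: number;
  bLR: number;
  minA: number;
  maxA: number;
}

// Suggested ranges taken from the learning-rate guidance.
const suggestedRanges = {
  thetaLR: [0.01, 0.1],
  bLR: [0.001, 0.05],
  aLR: [1e-4, 5e-3],
} as const;

// Returns human-readable warnings; an empty array means the config follows the guidance.
function checkLearningRates(cfg: RateConfig): string[] {
  const warnings: string[] = [];
  for (const key of ['thetaLR', 'bLR', 'aLR'] as const) {
    const [lo, hi] = suggestedRanges[key];
    if (cfg[key] < lo || cfg[key] > hi) {
      warnings.push(`${key}=${cfg[key]} is outside the suggested range [${lo}, ${hi}]`);
    }
  }
  if (cfg.bLR >= cfg.thetaLR) {
    warnings.push('bLR should stay below thetaLR to avoid thrashing difficulty');
  }
  if (!(cfg.minA > 0 && cfg.minA < cfg.maxA)) {
    warnings.push('expected 0 < minA < maxA so discrimination stays clamped to a valid range');
  }
  return warnings;
}

const defaults: RateConfig = { thetaLR: 0.05, aLR: 0.001, bLR: 0.01, minA: 0.2, maxA: 3.0 };
console.log(checkLearningRates(defaults));
```

Logging the warnings next to the createStreamingIrtEngine configuration is one way to keep overrides visible to operators.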

Host Responsibilities & Concurrency

Persist user state (theta, infoSum, heuristic se) and item params (a, b, lastSeenAt, timesSeen); the library stays pure.
Repos must be concurrency-safe (atomic DB updates, transactions, or optimistic locking) to avoid lost updates for the same user/item.
Log responses if you need analytics or offline replays; the library does not handle storage, HTTP, or background jobs.
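
As an illustration of the optimistic-locking option, the sketch below guards writes with a version counter and retries on conflict. Everything in it is hypothetical: `VersionedStore` is an in-memory stand-in for a database table with a version column, and `updateWithRetry` is a helper a host might call from inside its own repo implementation. It does not reproduce the library's repo interfaces.

```typescript
interface Versioned<T> {
  value: T;
  version: number;
}

// In-memory stand-in for a table with a version column.
class VersionedStore<T> {
  private rows = new Map<string, Versioned<T>>();

  async read(key: string): Promise<Versioned<T> | undefined> {
    return this.rows.get(key);
  }

  // Writes only if the stored version still equals expectedVersion
  // (0 means "the row must not exist yet"). Returns false on conflict.
  async compareAndSet(key: string, expectedVersion: number, value: T): Promise<boolean> {
    const current = this.rows.get(key)?.version ?? 0;
    if (current !== expectedVersion) return false;
    this.rows.set(key, { value, version: current + 1 });
    return true;
  }
}

// Read, apply a pure update, and write back; retry when another writer got there first.
async function updateWithRetry<T>(
  store: VersionedStore<T>,
  key: string,
  initial: T,
  apply: (prev: T) => T,
  maxAttempts = 5,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const row = await store.read(key);
    const next = apply(row?.value ?? initial);
    if (await store.compareAndSet(key, row?.version ?? 0, next)) return next;
  }
  throw new Error(`gave up updating ${key} after ${maxAttempts} attempts`);
}
```

With a plain read-then-write, concurrent responses for the same user or item would overwrite each other. Here the losing writer re-reads the fresh row and reapplies its update.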

Development Notes

Entry point re-exports both core and engine.
Tests should cover logistic extremes, update gradients/clamping, info/SE behavior, and selector options (recency, exposure caps, top-K randomness, exploration).
Workspace commands (run from repo root): npm run build, npm test, npm run lint, npm run format. The synthetic plot script can be invoked via npm run --workspace @more-ink/irt-core plot:synthetic.

Synthetic temporal test harness

A Vitest suite (packages/irt-core/tests/syntheticTemporal.test.ts) generates a fresh, deterministic temporal dataset every run: 100 users × 20 skills (each user has all skills), 5 items per skill (100 total), and ~10k responses (one pass of each user/skill over that skill’s items).
RNG is seeded (20240601) for reproducibility; per-user skill thetas are nested in the saved fixture. Scores are continuous in [0,1] (probability plus bounded noise and occasional partial credit). Progress and reasoning logs describe why the test passes (bounded theta jumps, SE shrinkage, ability correlation).
The generated dataset is written to packages/irt-core/tests/fixtures/synthetic-run.json for inspection and is gitignored. Regenerate by re-running npm test.
To visualize the synthetic run, replay and emit an HTML dashboard via npm run --workspace @more-ink/irt-core plot:synthetic; open packages/irt-core/tests/fixtures/synthetic-run.html to see sampled theta trajectories and the true-vs-estimated ability scatter.
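
For readers who want to build a similar fixture, the sketch below shows one way to get reproducible continuous scores from a seed. It does not describe the actual harness: the PRNG (mulberry32), the uniform noise model, the 0.1 noise width, and the names `syntheticScore` and `generateScores` are all assumptions; only the seed value is taken from the description above.

```typescript
// mulberry32: a small, well-known 32-bit seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Score = 2PL probability plus bounded uniform noise, clipped to [0, 1].
function syntheticScore(
  rng: () => number,
  trueTheta: number,
  a: number,
  b: number,
  noiseWidth = 0.1, // assumed noise bound
): number {
  const p = 1 / (1 + Math.exp(-a * (trueTheta - b)));
  const noise = (rng() * 2 - 1) * noiseWidth;
  return Math.min(1, Math.max(0, p + noise));
}

// Draws n scores for one simulated user against items of random difficulty.
function generateScores(seed: number, n: number, trueTheta = 0.5): number[] {
  const rng = mulberry32(seed);
  return Array.from({ length: n }, () => {
    const b = rng() * 4 - 2; // item difficulty in [-2, 2)
    return syntheticScore(rng, trueTheta, 1, b);
  });
}

console.log(generateScores(20240601, 5));
```

Because all randomness flows through the seeded generator, two runs with the same seed produce identical scores, which is what makes a regenerated fixture comparable across runs.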