@kemeny/interview-game
v0.2.2
Published
Client and CLI for the Kemeny Studio evaluation sandbox.
Kemeny Interview Game
A 2D puzzle sandbox. Each turn, your agent receives a grid and chooses one action. Build an agent that explores the world and figures out how to advance through the levels.
The game runs on a remote server. Your agent is a small JavaScript file you write locally and run via the CLI; it talks to the server one turn at a time.
Setup
1. Install the package
npm install @kemeny/interview-game
2. Configure your API key
You should have received an API key from us. Set it in your environment:
export KEMENY_API_KEY=your_key_here
(If you'd rather pass it inline, prefix the command: KEMENY_API_KEY=... npx kemeny-run-agent ....)
3. Create your agent
// my-agent.js
function chooseAction(observation) {
  // Decide what to do this turn.
  // Return one of: 'up' | 'down' | 'left' | 'right' | 'reset' | null
  return 'right';
}
module.exports = { chooseAction };
4. Run your agent
npx kemeny-run-agent --agent my-agent.js --output scorecard.json
Typical output:
=== Agent runner ===
Agent: my-agent Server: https://interview-api.kemenylabs.com Cap: 80
▶ Level 1 (L1)
✓ WIN in 4 moves (320ms)
▶ Level 2 (L2)
✗ no progress in 9 moves (810ms)
...
=== Summary: 1/4 levels solved (3.2s) ===
Scorecard saved to scorecard.json
5. Submit your agent code
When you're satisfied with your agent, send the code to the evaluator:
npx kemeny-submit-agent --agent my-agent.js --notes "short note for the evaluator"
This packages your project (excluding node_modules, .git, files over 1 MB and archives) into a zip and uploads it via the same API key. The server stores the zip; it never executes your code.
Useful flags:
--root <dir>: directory to package (default: directory of --agent).
--dry-run: list the files that would be sent without uploading.
--output <path>: save the zip locally in addition to uploading.
Agent contract
chooseAction(observation) is called once per turn. It can be synchronous or return a Promise (handy if you call out to an external service like an LLM).
You return one of:
'up' | 'down' | 'left' | 'right' | 'reset' | null
'reset' restarts the current level (useful if you want to try a different approach). null skips the level (counts as a fail).
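Since chooseAction may return a Promise, an agent that depends on a slow or flaky external call probably wants a timeout and a check that the answer is a valid action. The following is a minimal sketch under stated assumptions: askExternalService, withTimeout, the 2000 ms limit and the 'right' fallback are all hypothetical choices for illustration, not part of this package.

```javascript
// Actions listed in the contract above; null (skip the level) is handled separately.
const VALID_ACTIONS = ['up', 'down', 'left', 'right', 'reset'];

// Hypothetical stand-in for an external call (for example an LLM request).
// Replace it with your own logic; it is not provided by this package.
async function askExternalService(observation) {
  return 'up';
}

// Resolve to `fallback` if `promise` has not settled within `ms` milliseconds.
function withTimeout(promise, ms, fallback) {
  let timer;
  const timeout = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

async function chooseAction(observation) {
  let suggestion;
  try {
    suggestion = await withTimeout(askExternalService(observation), 2000, 'right');
  } catch (err) {
    suggestion = 'right'; // the external call failed; fall back to a fixed move
  }
  // Never hand the runner a value outside the contract.
  return VALID_ACTIONS.includes(suggestion) ? suggestion : 'right';
}

module.exports = { chooseAction };
```

Falling back to a fixed move rather than null is a design choice here: null skips the level and counts as a fail, so a wasted move is usually cheaper.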
The observation looks like:
{
  rooms: { [roomId: string]: string[][] }, // 2D grids indexed [y][x]; '.' is empty
  moveCount: 0, // 0-indexed turn within the level
  availableActions: ['down','left','reset','right','up'],
  lastAction: string | null, // your previous action, if any
  lastResult: 'moved' | 'no_effect' | 'reset' | null,
}
Each rooms[id] is a 2D array of strings indexed [y][x]. Cells with '.' are empty; other cells contain tokens whose meaning you discover by playing.
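A first step in most agents is likely to be listing what is actually on the grid. The helper below is a sketch that walks every room in [y][x] order and collects the non-empty cells. The name findTokens and the sample grid are made up for illustration; 'A' and 'B' are placeholder tokens, not tokens the game is known to use.

```javascript
// Collect every non-empty cell as { roomId, x, y, token }.
function findTokens(rooms) {
  const found = [];
  for (const [roomId, grid] of Object.entries(rooms)) {
    grid.forEach((row, y) => {
      row.forEach((token, x) => {
        if (token !== '.') found.push({ roomId, x, y, token });
      });
    });
  }
  return found;
}

// Made-up grid for illustration only.
const sampleRooms = {
  room1: [
    ['.', 'A', '.'],
    ['.', '.', 'B'],
  ],
};

console.log(findTokens(sampleRooms));
```

Inside chooseAction you would call findTokens(observation.rooms) and compare the result between turns to see which tokens moved or disappeared.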
lastResult is the engine's feedback after each action, which lets you confirm whether what you tried had any effect.
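One simple way to use this feedback is to keep moving in a direction until the engine reports 'no_effect', then rotate to the next one. This is only a sketch of that idea, not a strategy known to solve any level; the DIRECTIONS order and the nextAction helper are arbitrary choices made for the example.

```javascript
const DIRECTIONS = ['right', 'down', 'left', 'up'];

// Pure helper: given the previous action and its result, pick the next action.
function nextAction(lastAction, lastResult) {
  // First turn, or the previous action was 'reset': start from the first direction.
  if (!DIRECTIONS.includes(lastAction)) return DIRECTIONS[0];
  if (lastResult === 'no_effect') {
    // Nothing happened: rotate to the next direction in the list.
    const i = DIRECTIONS.indexOf(lastAction);
    return DIRECTIONS[(i + 1) % DIRECTIONS.length];
  }
  return lastAction; // the move had an effect, so keep going
}

function chooseAction(observation) {
  return nextAction(observation.lastAction, observation.lastResult);
}

module.exports = { chooseAction };
```

Keeping the decision in a pure function makes it easy to unit-test locally without spending turns against the server.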
Programmatic API
If you'd rather drive the run yourself instead of using the CLI:
const { runAgent } = require('@kemeny/interview-game');
(async () => {
  const scorecard = await runAgent(
    (observation) => 'right',
    { cap: 80 } // apiKey and apiBase are optional; read from env by default
  );
  console.log(scorecard);
})();
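When driving the run yourself, it can help to record what your agent did on each turn. The wrapper below is a sketch; withHistory is a hypothetical helper, not part of the package. It assumes that runAgent accepts a function returning a Promise, as the CLI contract for chooseAction does, which this README does not state explicitly for the programmatic API.

```javascript
// Wrap any agent function so each (moveCount, action) pair is recorded.
function withHistory(agent) {
  const history = [];
  const wrapped = async (observation) => {
    const action = await agent(observation); // works for sync and async agents
    history.push({ moveCount: observation.moveCount, action });
    return action;
  };
  return { wrapped, history };
}

// Usage sketch (needs the package and an API key, so it is left commented out):
// const { runAgent } = require('@kemeny/interview-game');
// const { wrapped, history } = withHistory((observation) => 'right');
// runAgent(wrapped, { cap: 80 }).then(() => console.log(history));
```

The history array is filled in as the run progresses, so you can inspect it after runAgent resolves or write it to disk alongside the scorecard.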