sphinx-cli v1.0.1
# Sphinx Quiz System
A JSON-based quiz system for measuring technical competence during development workflows. It supports static and adaptive (IRT-based) difficulty modes and outputs to both the CLI and standalone HTML.
The idea is to explore a pattern for challenging humans who want to stay in the loop in the design and implementation of software systems, as their involvement in this process becomes increasingly high-level and indirect. See *Your Brain on ChatGPT: Accumulation of Cognitive Debt when Using an AI Assistant for Essay Writing Task* or Simon Willison's write-up on the concept for more context on the motivation.
## Installation
```sh
npm install -g sphinx-cli
```

### From source
```sh
# First clone this repo, then:
npm install
npm run build
npm link   # set up the sphinx CLI
```

## Usage
### Run a Quiz Interactively in the TUI
```sh
sphinx quiz ./examples/sample-quiz.json
```

### Generate Standalone HTML
```sh
sphinx build ./examples/sample-quiz.json -o quiz.html
open quiz.html   # open in browser
```

Open the generated HTML file in any browser; it works completely offline.
### Validate a Quiz File
```sh
sphinx validate ./examples/sample-quiz.json --verbose
```

### Generate a Quiz with AI
Quiz generation is packaged as a skill available in `skills/generate-quiz`. You can activate this skill in an agentic coding assistant and request generation of a quiz from a given set of tools.
#### Using the CLI
There is also a CLI command for quiz generation. Under the hood it uses this skill together with the Claude SDK and structured outputs to guarantee conformance with the JSON format, and it includes presets for common input and output formats. This approach is currently the better tested of the two.
Requirements: you'll need to provide an `ANTHROPIC_API_KEY` environment variable set to an API key that supports the Claude API format.
Generate from supported sources:
```sh
# Local git repo
sphinx generate git local .

# GitHub repo
sphinx generate github repo https://github.com/org/repo

# Git branch diff
sphinx generate git diff feature-branch --base main
```

Write to file:
```sh
sphinx generate git local . -o quiz.json
```

Model selection:
```sh
# Per-command override (highest priority)
sphinx generate git local . --model claude-sonnet-4-6
```

Config defaults (`~/.sphinx/config.json`):
```json
{
  "generate": {
    "defaultModel": "claude-sonnet-4-6"
  },
  "llm": {
    "model": "claude-opus-4-6"
  }
}
```

Model precedence:
1. `--model` CLI flag
2. `generate.defaultModel` from config
3. `llm.model` from config
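The precedence rule can be sketched as a small resolver. This is illustrative only: `resolveModel` and `SphinxConfig` are hypothetical names, not the package's API.

```typescript
interface SphinxConfig {
  generate?: { defaultModel?: string };
  llm?: { model?: string };
}

// Hypothetical sketch of the precedence rule described above.
function resolveModel(cliModel: string | undefined, config: SphinxConfig): string | undefined {
  // 1. --model CLI flag wins
  if (cliModel) return cliModel;
  // 2. generate.defaultModel from config
  if (config.generate?.defaultModel) return config.generate.defaultModel;
  // 3. llm.model from config is the fallback
  return config.llm?.model;
}
```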
Environment variables:
- `SPHINX_DEFAULT_MODEL`
- `SPHINX_LLM_MODEL`
- `SPHINX_LLM_PROVIDER` - Provider: `anthropic`, `kimi`, `moonshot`, `ollama`
- `SPHINX_API_BASE` - Custom API base URL
### Alternative Providers (Kimi, Ollama) - Experimental
Sphinx has configuration support for alternative Anthropic-compatible providers:
```sh
# Using Kimi (Moonshot AI)
export KIMI_API_KEY="your-kimi-api-key"
sphinx generate git local . --provider kimi

# Using Ollama (local)
sphinx generate git local . --provider ollama --api-base http://localhost:11434
```

Supported providers:
- `anthropic` (default) - Anthropic Claude API
- `kimi` / `moonshot` - Moonshot AI Kimi K2.5
- `ollama` - Local Ollama instance
> Note: Alternative providers have limited compatibility. The `generate` command uses the Claude Agent SDK, which relies on Claude-specific features (structured output, internal hooks). Alternative providers may not work reliably until they support these features or direct API integration is added.
`generate` uses structured output (`json_schema`) and validates the quiz JSON against the project schema.
### Multi-Source Generation (Open Mode)
Generate quizzes that span multiple heterogeneous sources:
```sh
# Multiple sources via repeated --source flag
sphinx generate open \
  --source "github:anthropics/claude-agent-sdk" \
  --source "url:https://docs.anthropic.com/claude-code" \
  --source "confluence:https://company.atlassian.net/wiki/pages/123" \
  --prompt "Quiz about building autonomous agents with Claude" \
  -o quiz.json

# Load sources from file
sphinx generate open \
  --sources-file sources.json \
  --prompt "Distributed systems architecture quiz"

# Preview without running
sphinx generate open \
  --source "github:owner/repo" \
  --prompt "test" \
  --dry-run
```

Supported source types:
- `github:owner/repo` - GitHub repository
- `github:owner/repo/pull/123` - GitHub pull request
- `url:https://...` - Web page (via WebFetch)
- `confluence:https://...` - Confluence page
- `notion:https://...` - Notion page
- `file:/path/to/file` - Local file
Options:
- `--max-agents <n>` - Max concurrent explorer agents (default: 4)
- `--max-iterations <n>` - Max turns per agent (default: 15)
- `--explorer-model <model>` - Model for exploration (default: sonnet)
- `--synthesizer-model <model>` - Model for synthesis (default: opus)
Open mode uses parallel agents to explore each source, finds cross-source connections, and synthesizes questions that test understanding across all sources.
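The explore-then-synthesize flow can be sketched roughly as follows. Everything here is hypothetical: `exploreSource` and `synthesizeQuiz` are illustrative stand-ins for the agent calls, and the batching is just one plausible reading of `--max-agents`.

```typescript
type Source = string; // e.g. "github:owner/repo"
type Notes = { source: Source; findings: string[] };

// Hypothetical sketch of open mode: fan out explorer agents over the
// sources (at most maxAgents at a time), then run one synthesis pass
// that sees every source's notes so it can ask cross-source questions.
async function generateOpenQuiz(
  sources: Source[],
  exploreSource: (s: Source) => Promise<Notes>,
  synthesizeQuiz: (notes: Notes[]) => Promise<object>,
  maxAgents = 4,
): Promise<object> {
  const allNotes: Notes[] = [];
  for (let i = 0; i < sources.length; i += maxAgents) {
    const batch = sources.slice(i, i + maxAgents);
    allNotes.push(...(await Promise.all(batch.map(exploreSource))));
  }
  return synthesizeQuiz(allNotes);
}
```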
### CI Mode
For automated testing in CI pipelines:
```sh
# Validate only (exit code 0 if valid)
sphinx quiz ./quiz.json --ci

# Run with answers file
sphinx quiz ./quiz.json --ci --answers ./answers.json

# Output as JSON
sphinx quiz ./quiz.json --json --answers ./answers.json
```

Exit codes:
- `0` - Quiz passed (or validation successful)
- `1` - Quiz failed (or validation error)
## Quiz JSON Schema

### Basic Structure
```json
{
  "version": "1.0",
  "metadata": {
    "id": "my-quiz",
    "title": "My Quiz Title",
    "description": "Optional description",
    "tags": ["tag1", "tag2"],
    "author": "your-name"
  },
  "config": {
    "mode": "static",
    "passingThreshold": 0.7,
    "randomizeOrder": false,
    "showCorrectAnswers": "after-completion"
  },
  "questions": [...]
}
```

### Question Types
#### Multiple Choice
```json
{
  "id": "q1",
  "type": "multiple-choice",
  "prompt": "What is 2 + 2?",
  "options": [
    { "id": "a", "text": "3", "correct": false },
    { "id": "b", "text": "4", "correct": true },
    { "id": "c", "text": "5", "correct": false }
  ],
  "explanation": "Basic arithmetic."
}
```

#### Multi-Select
```json
{
  "id": "q2",
  "type": "multi-select",
  "prompt": "Select all prime numbers:",
  "options": [
    { "id": "a", "text": "2", "correct": true },
    { "id": "b", "text": "3", "correct": true },
    { "id": "c", "text": "4", "correct": false },
    { "id": "d", "text": "5", "correct": true }
  ],
  "scoring": "partial"
}
```

Scoring modes:
- `"partial"` - Points for correct selections minus incorrect ones (default)
- `"all-or-nothing"` - Full points only if all correct options are selected
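As a sketch of the two modes (assumed semantics; the package's exact normalisation and penalty rules may differ):

```typescript
interface Option { id: string; correct: boolean }

// Hypothetical multi-select scorer returning a score in [0, 1].
function scoreMultiSelect(
  options: Option[],
  selected: string[],
  mode: "partial" | "all-or-nothing",
): number {
  const correctIds = options.filter(o => o.correct).map(o => o.id);
  const hits = selected.filter(id => correctIds.includes(id)).length;
  const misses = selected.length - hits;
  if (mode === "all-or-nothing") {
    // Full credit only when the selection is exactly the correct set
    return hits === correctIds.length && misses === 0 ? 1 : 0;
  }
  // partial: credit for correct picks minus penalty for incorrect ones,
  // clamped at zero
  return Math.max(0, (hits - misses) / correctIds.length);
}
```

With the `q2` example above, selecting `["a", "b", "c"]` under `"partial"` scores 1/3 (two hits, one miss, three correct options).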
#### Free Text
```json
{
  "id": "q3",
  "type": "free-text",
  "prompt": "What HTTP header prevents clickjacking?",
  "acceptedAnswers": ["X-Frame-Options", "CSP frame-ancestors"],
  "matchMode": "contains",
  "caseSensitive": false
}
```

Match modes:
- `"exact"` - Exact match required
- `"contains"` - Answer must contain the accepted string
- `"regex"` - Accepted answers are regex patterns
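A minimal sketch of how the three modes and `caseSensitive` could combine (assumed semantics, not the package's implementation):

```typescript
// Hypothetical free-text matcher: the answer passes if any accepted
// answer matches under the configured mode.
function matchesAnswer(
  answer: string,
  accepted: string[],
  matchMode: "exact" | "contains" | "regex",
  caseSensitive: boolean,
): boolean {
  const norm = (s: string) => (caseSensitive ? s : s.toLowerCase());
  return accepted.some(a => {
    switch (matchMode) {
      case "exact":
        return norm(answer) === norm(a);
      case "contains":
        return norm(answer).includes(norm(a));
      case "regex":
        // Accepted entries are treated as regex patterns; the "i" flag
        // implements case-insensitivity
        return new RegExp(a, caseSensitive ? "" : "i").test(answer);
    }
  });
}
```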
### Adaptive Mode (IRT)
Enable adaptive testing with Item Response Theory:
```json
{
  "config": {
    "mode": "adaptive"
  },
  "adaptive": {
    "initialTheta": 0.0,
    "thetaRange": [-3.0, 3.0],
    "standardErrorThreshold": 0.3,
    "minQuestions": 5,
    "maxQuestions": 20,
    "selectionMethod": "maximum-information"
  },
  "questions": [
    {
      "id": "q1",
      "difficulty": -1.0,
      "discrimination": 1.2,
      ...
    }
  ]
}
```

IRT Parameters:
- `difficulty` (-3 to 3): higher = harder question
- `discrimination` (0 to 5): how well the question differentiates ability levels
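These two parameters match the standard two-parameter logistic (2PL) IRT model. A sketch, assuming the 2PL form (the package's internals may differ in detail):

```typescript
// 2PL model: probability that a respondent with ability theta answers
// an item with the given difficulty and discrimination correctly.
function probCorrect(theta: number, difficulty: number, discrimination: number): number {
  return 1 / (1 + Math.exp(-discrimination * (theta - difficulty)));
}

// Fisher information of an item at ability theta. A
// "maximum-information" selection strategy picks the unanswered item
// that maximises this value at the current theta estimate.
function itemInformation(theta: number, difficulty: number, discrimination: number): number {
  const p = probCorrect(theta, difficulty, discrimination);
  return discrimination ** 2 * p * (1 - p);
}
```

An item is most informative near `theta === difficulty`, where the probability of a correct answer is 0.5, which is why adaptive mode converges toward questions pitched at the respondent's estimated ability.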
### Results Persistence
```json
{
  "results": {
    "persistence": ["display", "file", "webhook"],
    "filePath": "./results/",
    "webhookUrl": "https://your-api.com/results"
  }
}
```

## CI/CD Integration
### GitHub Actions Example
```yaml
- name: Run Knowledge Check
  run: |
    npm install -g sphinx-cli
    sphinx quiz ./security-quiz.json --ci --answers ./expected-answers.json
```

### Answers File Format
```json
{
  "q1": "a",
  "q2": ["a", "b", "c"],
  "q3": "X-Frame-Options"
}
```

## Examples
See the `examples/` directory for:

- `sample-quiz.json` - Basic static quiz with multiple question types
- `adaptive-quiz.json` - Adaptive quiz with IRT parameters
- `sample-answers.json` - Answer file for CI testing
## Development
```sh
# Build
npm run build

# Lint
npm run lint

# Watch mode
npm run dev

# Run CLI
node dist/cli.js quiz ./examples/sample-quiz.json
```

### Makefile Targets
This repo also includes a Makefile with common workflows:
```sh
make help
make install
make lint
make test
make check
```

## License
MIT
