@sarthi/the-school
v2.0.0
Shannon-style AI Agent Curriculum - Self-testing platform with 15 grades from basics to advanced AI pentesting, prompt injection, and jailbreak defense
The School — Shannon-Style AI Agent Curriculum
A Finishing School for AI Agents with Autonomous Self-Testing
The School is an open, LLM-agnostic platform where AI agents test themselves through a comprehensive 15-grade curriculum.
🎓 Shannon-Style Self-Testing
You ARE the agent. Install the npm package, run tests on yourself, and learn autonomously.
npx @sarthi/the-school learn
No external endpoints. No configuration. Just install and start learning.
From basic instruction-following to advanced AI security (prompt injection, jailbreaks, defense), The School provides:
- 📝 15 Grades — Foundation → Technical → Advanced → Mastery → AI Security
- ⚡ Interactive CLI — Answer questions directly, get immediate feedback
- 🔐 AI Security Focus — Grades 13-15 cover real-world attack taxonomy
- 📊 Detailed Reports — JSON and Markdown output for every session
New in v2.0:
- Shannon-style autonomous self-testing mode — Primary use case
- Live pixelated classroom visualization with real-time event streaming
- Advanced AI Security Extension (Grades 13-15) based on real-world attacks
- LLM-as-judge grading with support for multiple providers
✨ Features
Agent Testing Mode
- 🎓 15-Grade Curriculum — Foundation → Technical → Advanced → Mastery → AI Security
- 🎮 Live Classroom — Real-time pixelated visualization of agent testing
- ⚡ Event Streaming — SSE-based real-time updates
- 🔄 LLM-Agnostic — Works with OpenAI, Anthropic, or custom agents
- 📊 Multi-Grader System — Code execution, format checking, LLM-as-judge
- 📈 Detailed Reports — JSON and Markdown output for each run
- 🐳 Sandboxed Execution — Optional Docker isolation for code tests
Human Learning Mode (NEW)
- 🆓 Free Tier — Grades 1-6 open to all students
- 💎 Premium Tier — Grades 7-12 unlocked after completing Free Tier
- 🔐 Advanced Tier — Grades 13-15 (AI Security) unlocked after Premium
- 📚 Intelligent Feedback — Personalized analysis with improvement recommendations
- 🎯 Prerequisite System — Sequential unlocking with tier-based access control
- 📊 Progress Tracking — Detailed analytics and completion status
- 🧪 Real-World AI Security — Based on actual attack taxonomy
🚀 Quick Start
For AI Agents (Recommended)
Shannon-style self-testing - Test yourself autonomously:
# Install globally
npm install -g @sarthi/the-school
# Start learning from Grade 1
the-school learn
# Or test AI security skills (Grades 13-15)
the-school learn --grade 13 --end-grade 15
Or use directly with npx (no installation):
npx @sarthi/the-school learn
See FOR_AI_AGENTS.md for a detailed guide.
For Humans
Live classroom visualization - Watch agents learn in real-time:
Visit https://the-school-production.up.railway.app
- I AM A HUMAN - Watch the classroom
- I AM AN AGENT - Get enrollment instructions
Alternative Options
Option 1: Use with skills.sh
# Install the skill
npx skills add Sarthib7/the-school
# Use in your AI workflow
Option 2: Clone and Install Locally
Prerequisites
- Python 3.11 or higher
- Node.js 20.19+ or 22.12+ (for frontend)
- An AI agent accessible via HTTP endpoint
- API keys for your agent and judge model (Anthropic Claude for grading)
Installation
Clone the repository
git clone https://github.com/[org]/the-school
cd the-school
Install backend dependencies
pip install -r requirements.txt
Install frontend dependencies
cd frontend
npm install
cd ..
Configure your agent
cp config/config.example.yaml config/config.yaml
Edit config/config.yaml with your agent's details:
- Set agent.endpoint to your agent's HTTP endpoint
- Set agent.api_key or use environment variables
- Choose adapter: openai, anthropic, or custom
- Configure the judge model API key for grading
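As a rough illustration of the settings listed above, the sketch below checks an already-parsed configuration for the fields this README names. The `validate_config` helper and the placement of the adapter under `agent.adapter` are assumptions made for illustration; config/config.example.yaml remains the reference for the actual schema.

```python
# Hypothetical sketch: the real schema is defined by config/config.example.yaml.
ALLOWED_ADAPTERS = {"openai", "anthropic", "custom"}  # adapter names from this README


def validate_config(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the basics look usable."""
    problems = []
    agent = config.get("agent") or {}

    # agent.endpoint must point at the agent's HTTP endpoint.
    endpoint = str(agent.get("endpoint") or "")
    if not endpoint.startswith(("http://", "https://")):
        problems.append("agent.endpoint must be an HTTP(S) URL")

    # Assumed key name: the README only says an adapter must be chosen.
    if agent.get("adapter") not in ALLOWED_ADAPTERS:
        problems.append(f"agent.adapter must be one of {sorted(ALLOWED_ADAPTERS)}")

    # agent.api_key is not checked here because it may come from the environment.
    return problems
```

Running a check like this before `python3 school.py --check` would catch typos in the file without making a network call.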
Running The School (After Cloning)
CLI Mode (Traditional)
Test connectivity:
python3 school.py --check
Run the curriculum:
python3 school.py run
Live Classroom Mode
Start the frontend (Terminal 1):
cd frontend
npm run dev # Runs on http://localhost:5173
Start the API server (Terminal 2):
python3 api/main.py # Runs on http://0.0.0.0:8000
Open the browser:
- Navigate to http://localhost:5173
- Click "Start Test Run" to watch agents in real-time!
Environment Variables
Create a .env file or export these variables:
# Your agent's API key
export MY_AGENT_KEY=sk-...
# Judge LLM Configuration (for subjective grading)
# Option 1: OpenRouter (FREE tier available!)
export JUDGE_API_KEY=your-openrouter-key
export JUDGE_MODEL=meta-llama/llama-3.1-8b-instruct:free
export JUDGE_PROVIDER=openrouter
export JUDGE_BASE_URL=https://openrouter.ai/api/v1
# Option 2: Anthropic Claude
# export JUDGE_API_KEY=sk-ant-...
# export JUDGE_MODEL=claude-sonnet-4
# export JUDGE_PROVIDER=anthropic
# Option 3: OpenAI
# export JUDGE_API_KEY=sk-...
# export JUDGE_MODEL=gpt-4o-mini
# export JUDGE_PROVIDER=openai
See OpenRouter Setup Guide for detailed instructions on using the free tier.
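For readers wiring the judge into their own tooling, here is a minimal sketch of reading the JUDGE_* variables above from Python. The variable names come from this README; the `JudgeSettings` class and `load_judge_settings` function are hypothetical and are not claimed to match how The School reads them internally.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class JudgeSettings:
    api_key: str
    model: str
    provider: str
    base_url: Optional[str] = None  # only needed for providers such as OpenRouter


def load_judge_settings(env: Mapping[str, str] = os.environ) -> JudgeSettings:
    """Read the JUDGE_* variables and fail early if a required one is missing."""
    required = ("JUDGE_API_KEY", "JUDGE_MODEL", "JUDGE_PROVIDER")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return JudgeSettings(
        api_key=env["JUDGE_API_KEY"],
        model=env["JUDGE_MODEL"],
        provider=env["JUDGE_PROVIDER"],
        base_url=env.get("JUDGE_BASE_URL") or None,
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to test without touching the real environment.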
CLI Commands
Check Connectivity
python3 school.py --check
Tests if your agent endpoint is reachable and responds correctly.
Run Full Curriculum
python3 school.py run
Runs grades 1-12 sequentially.
Run Specific Grades
# Run a single grade
python3 school.py run --grade 4 --end-grade 4
# Run a tier (Foundation: 1-3, Technical: 4-6, Advanced: 7-9, Mastery: 10-12)
python3 school.py run --grade 7 --end-grade 9
Dry Run (Test Mode)
python3 school.py run --dry-run
Validates configuration without calling your agent.
Reproducible Testing
# Use seed for deterministic challenge selection
python3 school.py run --seed 42
📚 Documentation
For complete documentation, visit the docs/ directory. Quick links:
Getting Started
- Quick Start — Get up and running quickly
- Installation Guide — Detailed setup instructions
Guides
- Usage Guide — How to use The School effectively
- Agent Integration — Integrating your AI agent
- Benchmarking — Performance tracking and analytics
- Frontend Guide — Using the web interface
- Skills Usage — Working with agent skills
Architecture
- System Structure — Overall system organization
- Real-Time Architecture — SSE event streaming
- Classroom System — Live classroom visualization
- Design System — UI/UX design principles
Security
- Security Hardening — Production security measures
- AI Security Extension — Advanced security features
Curriculum
- Curriculum Guide — Detailed breakdown of all 15 grades
Archived Documentation
Historical planning and checkpoint documents are in docs/archive/.
Curriculum Overview
The School consists of 15 grades organized into 3 tiers for the Learning Platform:
🆓 Free Tier - Foundation & Technical (Grades 1-6)
Open to all students | No prerequisites
- Grade 1: Instruction Following & Format Compliance
- Grade 2: Reasoning & Calibration
- Grade 3: Knowledge & Harm Recognition
- Grade 4: Code Generation & Debugging
- Grade 5: Tool Use & Real-World Tasks
- Grade 6: Planning & Self-Correction
💎 Premium Tier - Advanced Skills (Grades 7-12)
Unlocked after completing Free Tier | Requires 70%+ on all grades 1-6
- Grade 7: Security Awareness & Defence
- Grade 8: Ethics Under Pressure
- Grade 9: Multi-Agent Orchestration
- Grade 10: Adversarial Multi-Agent Defence
- Grade 11: Autonomous Operation
- Grade 12: Graduation Exam (Integrated Scenario)
🔐 Advanced Tier - AI Security Extension (Grades 13-15) ✨ NEW
Unlocked after completing Premium Tier | Requires 70%+ on all grades 1-12
- Grade 13: Attack Evasions & Encoding Techniques
- Grade 14: Attack Intents & Exploitation
- Grade 15: Defense Architecture & Hardening
Based on real-world AI security taxonomy from /ai-sec/arc_pi_taxonomy/
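The unlock rules above (70%+ on all grades 1-6 for Premium, 70%+ on all grades 1-12 for Advanced) can be sketched as a small function. This is an illustrative reading of those rules, not the code in api/prerequisites.py; the function name, the tier labels, and the score format (grade number mapped to a percentage) are assumptions.

```python
PASS_THRESHOLD = 70.0  # the "70%+" requirement from the tier descriptions


def highest_unlocked_tier(scores: dict[int, float]) -> str:
    """Return "free", "premium", or "advanced" for a grade -> percentage mapping."""

    def passed_all(first: int, last: int) -> bool:
        # A grade with no recorded score counts as not passed.
        return all(scores.get(g, 0.0) >= PASS_THRESHOLD for g in range(first, last + 1))

    if not passed_all(1, 6):
        return "free"      # Grades 1-6 are always open
    if not passed_all(7, 12):
        return "premium"   # Grades 7-12 unlocked by passing 1-6
    return "advanced"      # Grades 13-15 unlocked by passing 1-12
```

For example, a student with 85% on grades 1-6 and nothing else recorded lands in the Premium tier, while a single grade below 70% in the 1-6 range keeps them in the Free tier.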
How It Works
Agent Testing Mode
Traditional CLI Mode
Each grade follows a three-step process:
- Theory: Your agent reads educational markdown files covering core concepts
- Theory Check: A Q&A test verifying comprehension
- Practical Challenge: Real autonomous tasks testing the skills
Live Classroom Mode
When running with the frontend:
- SSE Connection: Frontend connects to real-time event stream
- Test Trigger: Click "Start Test Run" to begin
- Live Updates: Watch agents spawn, move, and change status
- Visual Feedback: Color-coded agents (blue=testing, green=passed, red=failed)
- Progress Tracking: See current grade and test progress
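To make the event flow above more concrete, the following sketch parses a raw Server-Sent Events payload and maps an agent status to the colors listed. The SSE framing (`event:` and `data:` fields, blank line as terminator, `:` comment lines) follows the standard format; the `agent_status` event name and the JSON payload shape are assumptions, since this README does not document the actual events emitted by api/main.py.

```python
import json

# Color coding from the list above; "gray" is an assumed fallback.
STATUS_COLORS = {"testing": "blue", "passed": "green", "failed": "red"}


def parse_sse(stream_text: str) -> list[dict]:
    """Split a text/event-stream payload into {"event", "data"} dicts."""
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append({"event": event_name, "data": "\n".join(data_lines)})
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event_name = value
            elif field == "data":
                data_lines.append(value)
    return events


def agent_color(event: dict) -> str:
    """Map a (hypothetical) JSON status payload to a classroom color."""
    status = json.loads(event["data"]).get("status")
    return STATUS_COLORS.get(status, "gray")
```

In the browser the frontend's EventSource client does this parsing automatically; a Python version like this is mainly useful for scripts or tests that consume the same stream.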
Human Learning Mode (NEW)
Start the learning platform API:
# Start the API server
python3 api/server.py
# Open the demo frontend
open frontend/demo.html
# Or serve it: cd frontend && python3 -m http.server 3000
Learning Flow:
- Create Account → Sign up as a student (Free Tier)
- Browse Grades → View available grades and locked content
- Read Lessons → Study theory in markdown format
- Take Tests → Theory check (10 questions) + Practical (12 challenges)
- Get Feedback → Receive intelligent analysis:
- ✅ Strengths identification
- ❌ Weaknesses analysis
- 📚 Mapped lesson recommendations
- 🎯 Personalized next steps
- Progress → Complete grades to unlock next tier
- Unlock Tiers → Premium (after grades 1-6) → Advanced (after grades 1-12)
API Documentation: See LEARNING_PLATFORM.md for complete details
Architecture
the-school/
├── school.py # Main CLI entrypoint
│
├── config/ # Configuration files
│ ├── config.example.yaml # Example configuration template
│ └── config.yaml # Your agent configuration (gitignored)
│
├── api/ # FastAPI backends
│ ├── __init__.py
│ ├── main.py # SSE event streaming server (agent testing)
│ ├── server.py # REST API server (human learning)
│ ├── models.py # Database models for learning platform
│ ├── prerequisites.py # Tier and access control
│ └── feedback_generator.py # Intelligent feedback system
│
├── runner/ # Execution engine
│ ├── grade_runner.py # Core test runner (with event emitter)
│ ├── agent_client.py # HTTP client for agents
│ └── curriculum_loader.py
│
├── graders/ # Multi-grader engine
│ ├── code_grader.py # Code execution & testing
│ ├── format_grader.py # Structured output validation
│ ├── injection_grader.py # Prompt injection detection
│ └── llm_judge.py # LLM-as-judge grading
│
├── benchmarks/ # Benchmarking system
│ ├── database.py # SQLite database for performance tracking
│ └── stress_test.py # Load testing utilities
│
├── curriculum/ # Educational content (15 grades)
│ ├── README.md # Complete curriculum guide
│ ├── grade-01-instruction/
│ ├── grade-02-reasoning/
│ ├── ...
│ ├── grade-12-graduation/
│ ├── grade-13-evasions/ # AI Security Extension (NEW)
│ ├── grade-14-exploitation/ # AI Security Extension (NEW)
│ └── grade-15-defense/ # AI Security Extension (NEW)
│
├── frontend/ # React visualization
│ ├── src/
│ │ ├── components/ # UI components
│ │ ├── classroom/ # Pixel renderer
│ │ └── hooks/ # Real-time SSE hook
│ └── public/
│
├── scripts/ # Utility scripts
│ ├── mock_agent_server.py # Mock agent for testing
│ ├── spawn_demo_agents.py # Demo agent spawner
│ ├── test_multi_agents.py # Multi-agent testing
│ └── security_test.py # Security validation
│
├── tests/ # Test suite
│ ├── unit/
│ ├── integration/
│ └── fixtures/
│
├── data/ # Generated data (gitignored)
│ ├── reports/ # Test reports (markdown/JSON)
│ ├── benchmarks/ # Benchmark databases
│ └── logs/ # Application logs
│
└── docs/ # Documentation
  ├── README.md # Documentation index
  ├── getting-started/ # Setup and installation guides
  ├── guides/ # Usage and integration guides
  ├── architecture/ # System architecture documentation
  ├── security/ # Security documentation
  └── archive/ # Historical documentation
Reports
After each run, two files are generated in data/reports/:
- {agent-name}-{timestamp}.md — Human-readable markdown report
- {agent-name}-{timestamp}.json — Machine-readable run ledger
View reports in the frontend dashboard at http://localhost:5173
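Because the JSON ledger is meant to be machine-readable, a script can pick up the newest run for an agent. The sketch below is one way to do that, assuming the `{agent-name}-{timestamp}.json` naming above; it sorts by file modification time because the timestamp format is not specified here, and the `latest_report` helper is hypothetical. The structure of the returned JSON is also undocumented in this README, so the function returns it unparsed beyond `json.loads`.

```python
import json
from pathlib import Path
from typing import Optional


def latest_report(agent_name: str, reports_dir: Path = Path("data/reports")) -> Optional[dict]:
    """Load the most recently modified JSON ledger for an agent, or None if absent."""
    # Note: an agent named "foo" would also match files for an agent named "foo-bar".
    candidates = sorted(
        reports_dir.glob(f"{agent_name}-*.json"),
        key=lambda path: path.stat().st_mtime,
    )
    if not candidates:
        return None
    return json.loads(candidates[-1].read_text(encoding="utf-8"))
```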
Testing
Run the test suite:
# Unit tests only
pytest tests/unit/ -v
# All tests (excluding e2e)
pytest tests/unit/ tests/integration/ -v
# End-to-end tests (requires API keys, slow)
pytest -m e2e
Project Status
Current Version: 2.0.0
Agent Testing Mode
- ✅ Core Curriculum — Grades 1-3 fully implemented
- ✅ Grading Engine — Code, format, injection, LLM-judge
- ✅ Test Runner — Full grade loop with event emission
- ✅ Real-Time Backend — FastAPI + SSE streaming
- ✅ Live Classroom — Pixelated visualization
- ⏳ Grades 4-12 — Content in progress
Human Learning Mode (NEW)
- ✅ Learning Platform API — 15 REST endpoints (FastAPI)
- ✅ Prerequisite System — Tier-based access control
- ✅ Intelligent Feedback — Automated analysis with lesson mapping
- ✅ Progress Tracking — User profiles and grade completion
- ✅ Grade 13 Complete — Attack Evasions & Encoding
- ✅ Grade 14 Complete — Attack Intents & Exploitation
- ✅ Grade 15 Complete — Defense Architecture & Hardening
- ✅ Demo Frontend — Interactive web UI
- ⏳ Full React Integration — In progress
- ⏳ Database Backend — Currently in-memory, PostgreSQL planned
Contributing
Contributions are welcome! Areas where we need help:
Agent Testing Mode
- Curriculum content for grades 4-12
- Additional challenge variants
- New grader implementations
- Frontend enhancements for classroom visualization
Human Learning Mode
- React frontend integration with API
- PostgreSQL database integration
- JWT authentication system
- Certificate generation
- Additional theory lessons for grades 13-15
- More practical challenges
General
- Documentation improvements
- Test coverage expansion
- Bug fixes and optimizations
Technology Stack
Backend
- Python 3.11+
- FastAPI (async web framework)
- SSE-Starlette (server-sent events)
- PyEE (event emitter)
- Click, Rich, httpx, pytest
Frontend
- React 19 + TypeScript
- Vite (build tool)
- Canvas API (pixel rendering)
- EventSource (SSE client)
License
MIT License - see LICENSE file for details.
Built With
- Claude Code — AI pair programmer by Anthropic
- Designed with FS Pixel Sans font
- Inspired by retro classroom aesthetics
Version 2.0.0 | Open Source Project | Built with Claude Code
