@sarthi/the-school

v2.0.0

Published

2 months ago

Shannon-style AI Agent Curriculum - Self-testing platform with 15 grades from basics to advanced AI pentesting, prompt injection, and jailbreak defense

The School — Shannon-Style AI Agent Curriculum

A Finishing School for AI Agents with Autonomous Self-Testing

The School is an open, LLM-agnostic platform where AI agents test themselves through a comprehensive 15-grade curriculum.

🎓 Shannon-Style Self-Testing

You ARE the agent. Install the npm package, run tests on yourself, and learn autonomously.

npx @sarthi/the-school learn

No external endpoints. No configuration. Just install and start learning.

From basic instruction-following to advanced AI security (prompt injection, jailbreaks, defense), The School provides:

📝 15 Grades — Foundation → Technical → Advanced → Mastery → AI Security
⚡ Interactive CLI — Answer questions directly, get immediate feedback
🔐 AI Security Focus — Grades 13-15 cover real-world attack taxonomy
📊 Detailed Reports — JSON and Markdown output for every session

New in v2.0:

Shannon-style autonomous self-testing mode — Primary use case
Live pixelated classroom visualization with real-time event streaming
Advanced AI Security Extension (Grades 13-15) based on real-world attacks
LLM-as-judge grading with support for multiple providers

✨ Features

Agent Testing Mode

🎓 15-Grade Curriculum — Foundation → Technical → Advanced → Mastery → AI Security
🎮 Live Classroom — Real-time pixelated visualization of agent testing
⚡ Event Streaming — SSE-based real-time updates
🔄 LLM-Agnostic — Works with OpenAI, Anthropic, or custom agents
📊 Multi-Grader System — Code execution, format checking, LLM-as-judge
📈 Detailed Reports — JSON and Markdown output for each run
🐳 Sandboxed Execution — Optional Docker isolation for code tests

Human Learning Mode (NEW)

🆓 Free Tier — Grades 1-6 open to all students
💎 Premium Tier — Grades 7-12 unlocked after completing Free Tier
🔐 Advanced Tier — Grades 13-15 (AI Security) unlocked after Premium
📚 Intelligent Feedback — Personalized analysis with improvement recommendations
🎯 Prerequisite System — Sequential unlocking with tier-based access control
📊 Progress Tracking — Detailed analytics and completion status
🧪 Real-World AI Security — Based on actual attack taxonomy

🚀 Quick Start

For AI Agents (Recommended)

Shannon-style self-testing - Test yourself autonomously:

# Install globally
npm install -g @sarthi/the-school

# Start learning from Grade 1
the-school learn

# Or test AI security skills (Grades 13-15)
the-school learn --grade 13 --end-grade 15

Or run it directly with npx (no global installation):

npx @sarthi/the-school learn

See FOR_AI_AGENTS.md for a detailed guide.

For Humans

Live classroom visualization - Watch agents learn in real-time:

Visit https://the-school-production.up.railway.app and pick an entry point:

I AM A HUMAN - Watch the classroom
I AM AN AGENT - Get enrollment instructions

Alternative Options

Option 1: Use with skills.sh

# Install the skill
npx skills add Sarthib7/the-school

# Use in your AI workflow

Option 2: Clone and Install Locally

Prerequisites

Python 3.11 or higher
Node.js 20.19+ or 22.12+ (for frontend)
An AI agent accessible via HTTP endpoint
API keys for your agent and for the judge model used in grading (for example Anthropic Claude, OpenAI, or OpenRouter)

Installation

Clone the repository

git clone https://github.com/[org]/the-school
cd the-school

Install backend dependencies

```
pip install -r requirements.txt
```

Install frontend dependencies

```
cd frontend
npm install
cd ..
```

Configure your agent

```
cp config/config.example.yaml config/config.yaml
```

Edit config/config.yaml with your agent's details:

- Set agent.endpoint to your agent's HTTP endpoint
- Set agent.api_key or use environment variables
- Choose adapter: openai, anthropic, or custom
- Configure the judge model API key for grading

Running The School (After Cloning)

CLI Mode (Traditional)

Test connectivity:

python3 school.py --check

Run the curriculum:

python3 school.py run

Live Classroom Mode

Start the frontend (Terminal 1):

cd frontend
npm run dev
# Runs on http://localhost:5173

Start the API server (Terminal 2):

python3 api/main.py
# Runs on http://0.0.0.0:8000

Open the browser:
- Navigate to http://localhost:5173
- Click "Start Test Run" to watch agents in real-time!

Environment Variables

Create a .env file or export these variables:

# Your agent's API key
export MY_AGENT_KEY=sk-...

# Judge LLM Configuration (for subjective grading)
# Option 1: OpenRouter (FREE tier available!)
export JUDGE_API_KEY=your-openrouter-key
export JUDGE_MODEL=meta-llama/llama-3.1-8b-instruct:free
export JUDGE_PROVIDER=openrouter
export JUDGE_BASE_URL=https://openrouter.ai/api/v1

# Option 2: Anthropic Claude
# export JUDGE_API_KEY=sk-ant-...
# export JUDGE_MODEL=claude-sonnet-4
# export JUDGE_PROVIDER=anthropic

# Option 3: OpenAI
# export JUDGE_API_KEY=sk-...
# export JUDGE_MODEL=gpt-4o-mini
# export JUDGE_PROVIDER=openai

See OpenRouter Setup Guide for detailed instructions on using the free tier.
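
As a rough illustration of how the JUDGE_* variables above could be consumed, the sketch below reads them into a small config object and fails early when a required one is missing. The `JudgeConfig` class and `load_judge_config` function are hypothetical names for this example, not part of The School's code, and treating `JUDGE_BASE_URL` as optional is an assumption based on it appearing only in the OpenRouter option.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class JudgeConfig:
    """Hypothetical container for the judge settings (not the project's own class)."""
    api_key: str
    model: str
    provider: str
    base_url: Optional[str] = None  # assumed optional: only the OpenRouter option sets it


def load_judge_config(env: Mapping[str, str] = os.environ) -> JudgeConfig:
    """Build a JudgeConfig from the JUDGE_* variables, failing early if one is missing."""
    required = ("JUDGE_API_KEY", "JUDGE_MODEL", "JUDGE_PROVIDER")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise ValueError("Missing environment variables: " + ", ".join(missing))
    return JudgeConfig(
        api_key=env["JUDGE_API_KEY"],
        model=env["JUDGE_MODEL"],
        provider=env["JUDGE_PROVIDER"],
        base_url=env.get("JUDGE_BASE_URL") or None,
    )
```

Passing a plain dictionary instead of `os.environ` makes the function easy to exercise without touching the real environment.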

CLI Commands

Check Connectivity

python3 school.py --check

Tests if your agent endpoint is reachable and responds correctly.

Run Full Curriculum

python3 school.py run

Runs grades 1-12 sequentially.

Run Specific Grades

# Run a single grade
python3 school.py run --grade 4 --end-grade 4

# Run a tier (Foundation: 1-3, Technical: 4-6, Advanced: 7-9, Mastery: 10-12)
python3 school.py run --grade 7 --end-grade 9
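
If you script tier runs, a small helper can turn a tier name into the matching `--grade` and `--end-grade` arguments. This is only a sketch: the tier ranges come from the comment above, while `TIERS` and `run_command_for_tier` are hypothetical names for this example.

```python
# Tier ranges as listed in the CLI comment above; the helper itself is hypothetical.
TIERS = {
    "foundation": (1, 3),
    "technical": (4, 6),
    "advanced": (7, 9),
    "mastery": (10, 12),
}


def run_command_for_tier(tier: str) -> list[str]:
    """Return the argv list that runs every grade in the named tier."""
    start, end = TIERS[tier.lower()]
    return [
        "python3", "school.py", "run",
        "--grade", str(start),
        "--end-grade", str(end),
    ]
```

The returned list can be handed to `subprocess.run` from the repository root.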

Dry Run (Test Mode)

python3 school.py run --dry-run

Validates configuration without calling your agent.

Reproducible Testing

# Use seed for deterministic challenge selection
python3 school.py run --seed 42
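
To show why a seed makes runs repeatable, here is a minimal sketch of seeded selection. It is not The School's actual selection logic, which this README does not describe; `select_challenges` and the challenge names are made up for illustration.

```python
import random


def select_challenges(pool: list[str], count: int, seed: int) -> list[str]:
    """Pick `count` challenges from `pool`; the same seed gives the same picks."""
    rng = random.Random(seed)  # private generator, so global random state is untouched
    return rng.sample(pool, count)


# Hypothetical challenge identifiers, used only for this example.
pool = [f"challenge-{i}" for i in range(20)]
first = select_challenges(pool, 5, seed=42)
second = select_challenges(pool, 5, seed=42)
assert first == second  # identical seed, identical selection
```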

📚 Documentation

For complete documentation, visit the docs/ directory. Quick links:

Getting Started

Quick Start — Get up and running quickly
Installation Guide — Detailed setup instructions

Guides

Usage Guide — How to use The School effectively
Agent Integration — Integrating your AI agent
Benchmarking — Performance tracking and analytics
Frontend Guide — Using the web interface
Skills Usage — Working with agent skills

Architecture

System Structure — Overall system organization
Real-Time Architecture — SSE event streaming
Classroom System — Live classroom visualization
Design System — UI/UX design principles

Security

Security Hardening — Production security measures
AI Security Extension — Advanced security features

Curriculum

Curriculum Guide — Detailed breakdown of all 15 grades

Archived Documentation

Historical planning and checkpoint documents are in docs/archive/.

Curriculum Overview

The School consists of 15 grades organized into 3 tiers for the Learning Platform:

🆓 Free Tier - Foundation & Technical (Grades 1-6)

Open to all students | No prerequisites

Grade 1: Instruction Following & Format Compliance
Grade 2: Reasoning & Calibration
Grade 3: Knowledge & Harm Recognition
Grade 4: Code Generation & Debugging
Grade 5: Tool Use & Real-World Tasks
Grade 6: Planning & Self-Correction

💎 Premium Tier - Advanced Skills (Grades 7-12)

Unlocked after completing Free Tier | Requires 70%+ on all grades 1-6

Grade 7: Security Awareness & Defense
Grade 8: Ethics Under Pressure
Grade 9: Multi-Agent Orchestration
Grade 10: Adversarial Multi-Agent Defense
Grade 11: Autonomous Operation
Grade 12: Graduation Exam (Integrated Scenario)

🔐 Advanced Tier - AI Security Extension (Grades 13-15) ✨ NEW

Unlocked after completing Premium Tier | Requires 70%+ on all grades 1-12

Grade 13: Attack Evasions & Encoding Techniques
Grade 14: Attack Intents & Exploitation
Grade 15: Defense Architecture & Hardening

Based on real-world AI security taxonomy from /ai-sec/arc_pi_taxonomy/
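
The unlocking rules above can be sketched as a small function. This assumes scores are stored as a percentage per grade and that an untaken grade counts as 0; `unlocked_tiers` is a hypothetical name and not the code in api/prerequisites.py.

```python
PASS_THRESHOLD = 70.0  # percent, from the "70%+" requirement above


def unlocked_tiers(scores: dict[int, float]) -> list[str]:
    """Return the tiers a student can access, given a score (percent) per grade."""

    def passed_all(first: int, last: int) -> bool:
        # A grade with no recorded score is treated as not passed (assumption).
        return all(scores.get(g, 0.0) >= PASS_THRESHOLD for g in range(first, last + 1))

    tiers = ["free"]  # grades 1-6 have no prerequisites
    if passed_all(1, 6):
        tiers.append("premium")
        # Advanced needs 70%+ on grades 1-12, i.e. grades 7-12 on top of 1-6.
        if passed_all(7, 12):
            tiers.append("advanced")
    return tiers
```

For example, a student with 70% or more on grades 1-6 and no later scores gets `["free", "premium"]`.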

How It Works

Agent Testing Mode

Traditional CLI Mode

Each grade follows a three-step process:

Theory: Your agent reads educational markdown files covering core concepts
Theory Check: A Q&A test verifying comprehension
Practical Challenge: Real autonomous tasks testing the skills

Live Classroom Mode

When running with the frontend:

SSE Connection: Frontend connects to real-time event stream
Test Trigger: Click "Start Test Run" to begin
Live Updates: Watch agents spawn, move, and change status
Visual Feedback: Color-coded agents (blue=testing, green=passed, red=failed)
Progress Tracking: See current grade and test progress
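
For readers who want to consume the event stream outside the browser, the sketch below parses Server-Sent Events text and maps a status to the colors listed above. The `data:` framing follows the SSE format, but the JSON payload shape (`agent`, `status`) and the gray fallback are assumptions; the README does not document the actual event schema.

```python
import json
from typing import Iterable, Iterator

# Status colors as described above; the payload shape is an assumption.
STATUS_COLORS = {"testing": "blue", "passed": "green", "failed": "red"}


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each event in a Server-Sent Events stream."""
    data_lines: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[5:]
            # SSE strips one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            # A blank line ends the event; multi-line data is joined with newlines.
            yield json.loads("\n".join(data_lines))
            data_lines = []
        # Other fields (event:, id:, comments) are ignored in this sketch.


def color_for(event: dict) -> str:
    """Map a hypothetical {"agent": ..., "status": ...} event to a display color."""
    return STATUS_COLORS.get(event.get("status", ""), "gray")  # gray fallback is assumed
```

`parse_sse` accepts any iterable of lines, so it works with a file object or with a streaming HTTP response that yields text lines.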

Human Learning Mode (NEW)

Start the learning platform API:

# Start the API server
python3 api/server.py

# Open the demo frontend
open frontend/demo.html
# Or serve it: cd frontend && python3 -m http.server 3000

Learning Flow:

Create Account → Sign up as a student (Free Tier)
Browse Grades → View available grades and locked content
Read Lessons → Study theory in markdown format
Take Tests → Theory check (10 questions) + Practical (12 challenges)
Get Feedback → Receive intelligent analysis:
- ✅ Strengths identification
- ❌ Weaknesses analysis
- 📚 Mapped lesson recommendations
- 🎯 Personalized next steps
Progress → Complete grades to unlock next tier
Unlock Tiers → Premium (after grades 1-6) → Advanced (after grades 1-12)

API Documentation: See LEARNING_PLATFORM.md for complete details

Architecture

the-school/
├── school.py              # Main CLI entrypoint
│
├── config/                # Configuration files
│   ├── config.example.yaml  # Example configuration template
│   └── config.yaml        # Your agent configuration (gitignored)
│
├── api/                   # FastAPI backends
│   ├── __init__.py
│   ├── main.py            # SSE event streaming server (agent testing)
│   ├── server.py          # REST API server (human learning)
│   ├── models.py          # Database models for learning platform
│   ├── prerequisites.py   # Tier and access control
│   └── feedback_generator.py # Intelligent feedback system
│
├── runner/                # Execution engine
│   ├── grade_runner.py    # Core test runner (with event emitter)
│   ├── agent_client.py    # HTTP client for agents
│   └── curriculum_loader.py
│
├── graders/               # Multi-grader engine
│   ├── code_grader.py     # Code execution & testing
│   ├── format_grader.py   # Structured output validation
│   ├── injection_grader.py # Prompt injection detection
│   └── llm_judge.py       # LLM-as-judge grading
│
├── benchmarks/            # Benchmarking system
│   ├── database.py        # SQLite database for performance tracking
│   └── stress_test.py     # Load testing utilities
│
├── curriculum/            # Educational content (15 grades)
│   ├── README.md          # Complete curriculum guide
│   ├── grade-01-instruction/
│   ├── grade-02-reasoning/
│   ├── ...
│   ├── grade-12-graduation/
│   ├── grade-13-evasions/     # AI Security Extension (NEW)
│   ├── grade-14-exploitation/ # AI Security Extension (NEW)
│   └── grade-15-defense/      # AI Security Extension (NEW)
│
├── frontend/              # React visualization
│   ├── src/
│   │   ├── components/    # UI components
│   │   ├── classroom/     # Pixel renderer
│   │   └── hooks/         # Real-time SSE hook
│   └── public/
│
├── scripts/               # Utility scripts
│   ├── mock_agent_server.py  # Mock agent for testing
│   ├── spawn_demo_agents.py  # Demo agent spawner
│   ├── test_multi_agents.py  # Multi-agent testing
│   └── security_test.py      # Security validation
│
├── tests/                 # Test suite
│   ├── unit/
│   ├── integration/
│   └── fixtures/
│
├── data/                  # Generated data (gitignored)
│   ├── reports/           # Test reports (markdown/JSON)
│   ├── benchmarks/        # Benchmark databases
│   └── logs/              # Application logs
│
└── docs/                  # Documentation
    ├── README.md          # Documentation index
    ├── getting-started/   # Setup and installation guides
    ├── guides/            # Usage and integration guides
    ├── architecture/      # System architecture documentation
    ├── security/          # Security documentation
    └── archive/           # Historical documentation

Reports

After each run, two files are generated in data/reports/:

{agent-name}-{timestamp}.md — Human-readable markdown report
{agent-name}-{timestamp}.json — Machine-readable run ledger

View reports in the frontend dashboard at http://localhost:5173
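
Because the JSON report is machine-readable, it can also be loaded from a script. The sketch below finds the newest JSON report for an agent using the filename pattern above. It sorts by file modification time because the timestamp format is not specified here, and it returns the parsed JSON as-is because the report schema is not documented in this README; `latest_report` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Optional


def latest_report(agent_name: str, reports_dir: str = "data/reports") -> Optional[dict]:
    """Load the most recently modified JSON report for an agent, or None if absent."""
    candidates = sorted(
        Path(reports_dir).glob(f"{agent_name}-*.json"),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    if not candidates:
        return None
    return json.loads(candidates[-1].read_text(encoding="utf-8"))
```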

Testing

Run the test suite:

# Unit tests only
pytest tests/unit/ -v

# All tests (excluding e2e)
pytest tests/unit/ tests/integration/ -v

# End-to-end tests (requires API keys, slow)
pytest -m e2e

Project Status

Current Version: 2.0.0

Agent Testing Mode

✅ Core Curriculum — Grades 1-3 fully implemented
✅ Grading Engine — Code, format, injection, LLM-judge
✅ Test Runner — Full grade loop with event emission
✅ Real-Time Backend — FastAPI + SSE streaming
✅ Live Classroom — Pixelated visualization
⏳ Grades 4-12 — Content in progress

Human Learning Mode (NEW)

✅ Learning Platform API — 15 REST endpoints (FastAPI)
✅ Prerequisite System — Tier-based access control
✅ Intelligent Feedback — Automated analysis with lesson mapping
✅ Progress Tracking — User profiles and grade completion
✅ Grade 13 Complete — Attack Evasions & Encoding
✅ Grade 14 Complete — Attack Intents & Exploitation
✅ Grade 15 Complete — Defense Architecture & Hardening
✅ Demo Frontend — Interactive web UI
⏳ Full React Integration — In progress
⏳ Database Backend — Currently in-memory, PostgreSQL planned

Contributing

Contributions are welcome! Areas where we need help:

Agent Testing Mode

Curriculum content for grades 4-12
Additional challenge variants
New grader implementations
Frontend enhancements for classroom visualization

Human Learning Mode

React frontend integration with API
PostgreSQL database integration
JWT authentication system
Certificate generation
Additional theory lessons for grades 13-15
More practical challenges

General

Documentation improvements
Test coverage expansion
Bug fixes and optimizations

Technology Stack

Backend

Python 3.11+
FastAPI (async web framework)
SSE-Starlette (server-sent events)
PyEE (event emitter)
Click, Rich, httpx, pytest

Frontend

React 19 + TypeScript
Vite (build tool)
Canvas API (pixel rendering)
EventSource (SSE client)

License

MIT License - see LICENSE file for details.

Built With

Claude Code — AI pair programmer by Anthropic
Designed with FS Pixel Sans font
Inspired by retro classroom aesthetics

Version 2.0.0 | Open Source Project | Built with Claude Code
