kai-k8s
v1.3.1
Kubernetes AI - Your intelligent K8s assistant powered by GitHub Copilot SDK
Overview
kai is a command-line AI assistant that helps you manage, troubleshoot, and understand your Kubernetes clusters using natural language. Instead of memorizing kubectl commands, just ask questions like "why is my pod crashing?" and kai figures out what to do.
Powered by the GitHub Copilot SDK, kai combines the intelligence of large language models with direct access to your cluster through kubectl and helm.
Features
- 🗣️ Natural Language Interface — Ask questions in plain English, get answers instantly
- 🤖 Model Selection — Choose your AI model (GPT-5, Claude, etc.) with the /model command
- 🔄 Context Switching — Switch K8s contexts and namespaces interactively
- 💾 Session Persistence — Save and resume conversations across sessions
- 📎 File Attachments — Attach YAML manifests directly in your prompts
- 🔑 BYOK Mode — Use your own OpenAI, Azure OpenAI, or Anthropic API key
- ⚡ Direct Command Execution — Runs kubectl and helm commands automatically
- 🔍 Smart Troubleshooting — Follows systematic debugging workflows
- 📡 Streaming Responses — See answers in real time as they're generated
- 🎨 Beautiful CLI — Animated banners, colored output, and styled command boxes
- ⚡ Quick Actions — AI-suggested follow-up actions you can execute with a single keystroke
- 👁️ Live Watch — Monitor resources in real time with the /watch command
- 🎓 Learning Mode — Educational annotations that teach K8s concepts as you work
- 🧩 Skills System — Extend kai with custom domain expertise and commands
Prerequisites
Before installing kai, ensure you have the following:
| Requirement | Version | Notes |
|-------------|---------|-------|
| Node.js | 18.0+ | Download from nodejs.org |
| kubectl | Any | Must be configured and connected to your cluster |
| GitHub CLI | 2.0+ | For authentication (gh auth login) |
| GitHub Copilot | — | Active subscription (Individual, Business, or Enterprise) |
| Copilot CLI | Latest | npm install -g @github/copilot (auto-installed with VS Code Copilot Chat) |
| helm | 3.0+ | Optional, for Helm-related queries |
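The minimum versions in the table can be checked with a small comparison routine. The following is a minimal sketch, assuming version strings shaped like the output of `node --version` (for example `v20.11.0`); `parseVersion` and `meetsMinimum` are hypothetical helpers for illustration and are not part of kai.

```typescript
// Hypothetical helpers (not part of kai): compare dotted version strings numerically.
function parseVersion(raw: string): number[] {
  // Strip a leading "v" (as in "v18.17.0") and split on dots.
  return raw.trim().replace(/^v/, "").split(".").map((part) => {
    const n = parseInt(part, 10);
    return isNaN(n) ? 0 : n; // non-numeric parts count as 0
  });
}

function meetsMinimum(installed: string, minimum: string): boolean {
  const a = parseVersion(installed);
  const b = parseVersion(minimum);
  const len = Math.max(a.length, b.length);
  for (let i = 0; i < len; i++) {
    const x = i < a.length ? a[i] : 0; // missing parts count as 0
    const y = i < b.length ? b[i] : 0;
    if (x !== y) return x > y;
  }
  return true; // equal versions satisfy the minimum
}
```

For instance, `meetsMinimum("v16.20.2", "18.0")` is false, so a Node.js 16 install would not satisfy the requirement above.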
Authentication
kai uses GitHub Copilot for AI capabilities. Authenticate using one of these methods:
# Recommended: Use GitHub CLI
gh auth login
# Alternative: Set environment variable
export COPILOT_GITHUB_TOKEN=your_token_here
# Or use standard GitHub tokens
export GH_TOKEN=your_token_here
Installation
From npm (Recommended)
# Install globally
npm install -g kai-k8s
# Run kai
kai
From Source
# Clone the repository
git clone https://github.com/asaf5767/kai.git
cd kai
# Install dependencies
npm install
# Run directly
npm start
# Or build and install globally
npm run build
npm link
kai # Now available globally
Quick Start
# Via npm (easiest)
npm install -g kai-k8s && kai
# Or from source
git clone https://github.com/asaf5767/kai.git && cd kai && npm install && npm start
Usage
Starting kai
# Default: Start with GPT-5
npm start
# Choose a specific model
kai --model claude-sonnet-4.5
# Resume a previous session
kai --session my-debug-session
# Use your own API key (BYOK)
export OPENAI_API_KEY=sk-...
kai --byok
CLI Options
| Option | Description |
|--------|-------------|
| -h, --help | Show help message |
| -v, --version | Show version number |
| -m, --model <name> | Use specific AI model (e.g., gpt-5, claude-sonnet-4.5) |
| -s, --session <id> | Resume a specific session |
| -l, --learn | Enable learning mode (educational annotations) |
| --byok | Use your own API key (requires a provider key such as the OPENAI_API_KEY env var) |
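To make the option table concrete, here is a minimal sketch of how these flags could be parsed into a typed options object. It is not kai's actual parser, which this README does not show; `CliOptions` and `parseCliArgs` are hypothetical names, and rejecting unknown options is an assumption.

```typescript
// Hypothetical sketch (not kai's real parser) covering the options in the table above.
interface CliOptions {
  help: boolean;
  version: boolean;
  learn: boolean;
  byok: boolean;
  model?: string;
  session?: string;
}

function parseCliArgs(argv: string[]): CliOptions {
  const opts: CliOptions = { help: false, version: false, learn: false, byok: false };
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i];
    switch (arg) {
      case "-h": case "--help": opts.help = true; break;
      case "-v": case "--version": opts.version = true; break;
      case "-l": case "--learn": opts.learn = true; break;
      case "--byok": opts.byok = true; break;
      case "-m": case "--model":
      case "-s": case "--session": {
        // These options take a value in the next position.
        const value = argv[i + 1];
        if (value === undefined || value.charAt(0) === "-") {
          throw new Error(`Option ${arg} expects a value`);
        }
        if (arg === "-m" || arg === "--model") opts.model = value;
        else opts.session = value;
        i++; // skip the consumed value
        break;
      }
      default:
        // Assumption: unknown options are an error rather than ignored.
        throw new Error(`Unknown option: ${arg}`);
    }
  }
  return opts;
}
```

With this sketch, `parseCliArgs(["--model", "claude-sonnet-4.5", "-l"])` yields an object with `model` set and `learn` enabled, mirroring `kai --model claude-sonnet-4.5 --learn`.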
Commands
Interactive Commands
| Command | Description |
|---------|-------------|
| /model [name] | Switch AI model or list available models |
| /context [name] | Switch Kubernetes context or list available |
| /ns [name] | Switch namespace or list available |
| /watch [resource] | Live resource monitoring (Ctrl+C to stop) |
| /learn [on\|off] | Toggle learning mode (educational annotations) |
| /skills | List installed skills and commands |
| /skills reload | Reload skills from disk |
| /sessions | List all saved sessions |
| /resume <id> | Resume a previous session |
| /forget <id> | Delete a saved session |
| /new | Start a fresh session |
| /history [n] | View command history, re-run by number, or search |
| /help | Show all commands |
| clear | Clear screen |
| exit | Quit kai |
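The table mixes slash commands with two bare keywords (clear and exit), and anything else is a natural-language prompt. A minimal sketch of that three-way split follows; the real dispatcher lives in src/cli/commands.ts and is not shown in this README, so `ReplInput` and `classifyInput` are hypothetical names.

```typescript
// Hypothetical sketch: classify one line of REPL input.
type ReplInput =
  | { kind: "slash"; command: string; args: string[] }
  | { kind: "builtin"; command: "clear" | "exit" }
  | { kind: "prompt"; text: string };

function classifyInput(line: string): ReplInput {
  const text = line.trim();
  if (text === "clear") return { kind: "builtin", command: "clear" };
  if (text === "exit") return { kind: "builtin", command: "exit" };
  if (text.charAt(0) === "/") {
    // "/model claude-sonnet-4.5" -> command "model", args ["claude-sonnet-4.5"]
    const parts = text.slice(1).split(/\s+/);
    return { kind: "slash", command: parts[0], args: parts.slice(1) };
  }
  // Everything else is sent to the AI model as a question.
  return { kind: "prompt", text: text };
}
```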
File Attachments
Include files directly in your prompts using bracket syntax:
> apply this [./deployment.yaml]
> compare [./staging.yaml] with [./prod.yaml]
> what's wrong with [./service.yaml]?
Supported formats: .yaml, .yml, .json, .txt, .log
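The bracket syntax can be recognized with a single regular expression. The sketch below is illustrative only: `extractAttachments` is a hypothetical function, and both the case-insensitive extension check and the separate list of rejected paths are assumptions about behavior this README does not specify.

```typescript
// Hypothetical sketch: pull [path] attachments out of a prompt.
const SUPPORTED_EXTENSIONS = [".yaml", ".yml", ".json", ".txt", ".log"];

function extractAttachments(prompt: string): { supported: string[]; rejected: string[] } {
  const supported: string[] = [];
  const rejected: string[] = [];
  const pattern = /\[([^\[\]]+)\]/g; // text between a pair of square brackets
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(prompt)) !== null) {
    const path = match[1].trim();
    const dot = path.lastIndexOf(".");
    // Assumption: extensions are compared case-insensitively.
    const ext = dot >= 0 ? path.slice(dot).toLowerCase() : "";
    if (SUPPORTED_EXTENSIONS.indexOf(ext) !== -1) supported.push(path);
    else rejected.push(path);
  }
  return { supported, rejected };
}
```

Applied to `compare [./staging.yaml] with [./prod.yaml]`, the sketch returns both paths as supported attachments.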
Example Session
╭─────────────────────────────────────╮
│ kai - Kubernetes AI Assistant │
│ Model: gpt-5 │
╰─────────────────────────────────────╯
✓ Connected to GitHub Copilot
ℹ Ask me anything about your Kubernetes cluster!
> /context
ℹ Available Kubernetes Contexts:
● prod-cluster
staging-cluster
dev-cluster
Use /context <name> to switch
> /context staging-cluster
✅ Switched to context: staging-cluster
> what pods are crashing in the api namespace?
┌─ Step 1 ──────────────────────────────┐
│ kubectl get pods -n api │
└───────────────────────────────────────┘
┌─ Step 2 ──────────────────────────────┐
│ kubectl describe pod api-server-xyz │
└───────────────────────────────────────┘
│ Found 2 failing pods:
│
│ 1. **api-server-xyz** - OOMKilled
│ The container is running out of memory.
│ Current limit: 256Mi
│
│ Recommendation: Increase memory limit to 512Mi
│
│ 2. **worker-abc** - ImagePullBackOff
│ Cannot pull image `myregistry/worker:v2.1`
│
│ Recommendation: Check registry credentials
> /model claude-sonnet-4.5
✅ Switched to model: claude-sonnet-4.5
> apply this [./fix-memory.yaml]
ℹ Attached 1 file(s): fix-memory.yaml
│ I'll apply the memory fix...
Quick Actions
After identifying a problem, kai suggests numbered quick actions that you can execute with a single keystroke:
> why is my pod crashing?
│ Found pod nginx-pod in CrashLoopBackOff state.
│ The container is running out of memory (OOMKilled).
│ Current limit: 256Mi
╭─ ⚡ Quick Actions ───────────────────────────╮
│ 1) View logs for nginx-pod │
│ 2) View previous container logs │
│ 3) Describe pod │
│ 4) Increase memory limit │
╰───────────────────────────────────────────────╯
> 1
│ Executing: View logs for nginx-pod
│
│ [pod logs displayed...]
Simply type a number (1-9) to execute the corresponding action. Quick actions are context-aware: they suggest relevant next steps based on the problem kai identified.
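The single-keystroke selection can be pictured as a lookup from a digit to the list of suggested actions. This is a minimal sketch under stated assumptions: `QuickAction` and `selectQuickAction` are hypothetical names (the real logic is in src/cli/quick-actions.ts, which is not shown here), and input that is not a lone digit from 1 to 9 is assumed to fall through to normal prompt handling.

```typescript
// Hypothetical sketch: map a typed digit to one of the suggested quick actions.
interface QuickAction {
  label: string;  // text shown in the Quick Actions box
  prompt: string; // what is executed when the action is chosen
}

function selectQuickAction(input: string, actions: QuickAction[]): QuickAction | null {
  const text = input.trim();
  if (!/^[1-9]$/.test(text)) return null; // only a lone digit 1-9 selects an action
  const index = parseInt(text, 10) - 1;   // actions are numbered from 1
  return index < actions.length ? actions[index] : null;
}
```

Returning `null` rather than throwing lets the caller treat unmatched input, such as `10` or a full sentence, as an ordinary question.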
Live Watch
Monitor Kubernetes resources in real time with the /watch command:
> /watch pods -n default
🔄 Watching pods in default
Press Ctrl+C to stop watching
NAME READY STATUS RESTARTS AGE
nginx-abc 1/1 Running 0 2d
api-xyz 1/1 Running 0 1d
worker-123 0/1 Pending 0 5m
Watch stopped
Watch Options
| Command | Description |
|---------|-------------|
| /watch pods | Watch pods in current namespace |
| /watch pods -n kube-system | Watch pods in specific namespace |
| /watch deploy | Watch deployments |
| /watch svc | Watch services |
| /watch nodes | Watch cluster nodes |
| /watch pods -A | Watch across all namespaces |
| /watch pods -l app=nginx | Watch pods matching label selector |
Press Ctrl+C to stop watching and return to the normal REPL.
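The /watch arguments look like ordinary kubectl flags, so one plausible implementation forwards them to `kubectl get ... --watch`. That mapping is an assumption, not something this README confirms; `buildWatchArgs` and `selectsNamespace` are hypothetical helpers, and the sketch requires a resource argument even though the command table marks it as optional.

```typescript
// Hypothetical sketch: translate a /watch line into kubectl arguments.
// Assumption: /watch is backed by `kubectl get <resource> --watch`.
function selectsNamespace(token: string): boolean {
  // Flags that already pick a namespace scope.
  return token === "-n" || token === "--namespace" || token === "-A" ||
    token === "--all-namespaces" || token.indexOf("--namespace=") === 0;
}

function buildWatchArgs(input: string, currentNamespace: string): string[] {
  const tokens = input.trim().split(/\s+/);
  if (tokens[0] !== "/watch" || tokens.length < 2) {
    throw new Error("Usage: /watch <resource> [kubectl flags]");
  }
  const rest = tokens.slice(1);
  const args = ["get"].concat(rest);
  // Fall back to the session's namespace when none was given.
  if (!rest.some(selectsNamespace)) args.push("-n", currentNamespace);
  args.push("--watch");
  return args;
}
```

Under these assumptions, `/watch pods -l app=nginx` in the default namespace becomes `kubectl get pods -l app=nginx -n default --watch`.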
🎓 Learning Mode
Learning Mode transforms kai from a "do it for me" assistant into a "teach me while helping" mentor. When enabled, kai adds educational annotations after every command, explaining what it did and why.
Why Learning Mode?
A common criticism of AI assistants is that they become "crutches" — users get things done but never learn the underlying skills. Learning mode counters this by teaching K8s concepts as you work.
Enabling Learning Mode
# Start kai with learning mode
kai --learn
# Or toggle during a session
/learn on
/learn off
/learn # Toggle
Note: Changes take effect on the next session (restart kai or use /new).
What You'll Learn
In learning mode, after every kubectl or helm command, kai adds:
- 📚 Learn: What each part of the command does
- 🔍 Notice: What to look for in the output
- 💡 Why: Reasoning behind the approach (when relevant)
Example: Normal vs Learning Mode
Normal Mode:
> why is my pod crashing?
┌─ Step 1 ──────────────────────────────────┐
│ kubectl logs nginx-abc --previous │
└────────────────────────────────────────────┘
Error: Config file not found at /app/config.yaml
Your pod is crashing because it can't find the config file.
Learning Mode (kai --learn):
> why is my pod crashing?
┌─ Step 1 ──────────────────────────────────┐
│ kubectl logs nginx-abc --previous │
└────────────────────────────────────────────┘
Error: Config file not found at /app/config.yaml
📚 Learn:
• logs → View container stdout/stderr
• --previous → CRITICAL for CrashLoopBackOff! Gets logs from crashed container
(without it, you'd see empty logs from the fresh restart)
🔍 Notice: The error message shows the exact file path that's missing
💡 Next time you see CrashLoopBackOff, always use --previous to see why it crashed.
Your pod is crashing because it can't find the config file.
Key Concepts You'll Learn
- Flag explanations: --previous, -o wide, -l selectors, --dry-run=client
- When to use which command: get vs describe vs logs
- Patterns to recognize: Error messages, status codes, event types
- Pro tips: Advanced techniques and shortcuts
Best Practices
- Start with learning mode ON when you're new to K8s
- Turn it OFF when you're in a hurry or already know the commands
- Toggle as needed — it's designed to be low-friction
🧩 Skills System
Skills are the most powerful feature of kai — they transform kai from a generic Kubernetes assistant into a domain expert for YOUR specific infrastructure.
A skill can provide:
- Custom commands — Shortcuts like /myteam:deploy prod
- Domain knowledge — AI expertise specific to your architecture
- Quick action templates — Context-aware suggestions
- Environment variables — Team-specific configurations
Why Skills?
Without skills, kai is a general K8s expert. With skills, kai becomes YOUR team's expert:
| Without Skills | With Skills |
|----------------|-------------|
| Generic K8s knowledge | Deep understanding of YOUR architecture |
| Manual kubectl commands | /myteam:status shortcuts |
| Generic troubleshooting | "I know your pod naming convention..." |
| Generic suggestions | Team-specific quick actions |
Installing Skills
Skills live in two locations (higher priority first):
1. Project-local: .kai/skills/<skill-name>/
2. User global: ~/.config/kai/skills/<skill-name>/
Install from GitHub:
# Clone to user skills directory
git clone https://github.com/myteam/kai-skills ~/.config/kai/skills/myteam
# Or as project submodule
git submodule add https://github.com/myteam/kai-skills .kai/skills/myteam
# Reload skills in kai
/skills reload
Managing Skills
> /skills
🧩 kai Skills
──────────────────────────────────────
📁 Project Skills
● myteam v1.0.0
My Team Kubernetes Expert · 8 commands
👤 User Skills
● helm-ops v1.0.0
Helm Operations Expert · 8 commands
⚡ Available Skill Commands:
myteam:
/myteam:status - Show cluster status
/myteam:deploy - Deploy to environment
...
> /skills info myteam
[Detailed skill information]
> /skills reload
✅ Reloaded 2 skill(s)
Using Skill Commands
Skill commands use the format /<skill>:<command>:
> /helm-ops:releases
NAME NAMESPACE STATUS
nginx-ingress ingress deployed
redis default deployed
> /myteam:deploy prod --version=2.0
⚠️ This command requires confirmation:
helm upgrade myapp ./charts/myapp -n production --set version=2.0
Proceed? [y/N] y
✅ myteam:deploy completed
Creating Your Own Skill
Create a skill.yaml file:
apiVersion: kai.ms/v1
kind: Skill
metadata:
  name: my-skill
  displayName: My Custom Skill
  description: Custom K8s operations for my team
  version: 1.0.0
spec:
  commands:
    - name: status
      description: Show cluster health
      script: kubectl get pods -A | grep -v Running
    - name: restart
      description: Restart a deployment
      args:
        - name: deployment
          description: Deployment name
          required: true
      script: kubectl rollout restart deployment/${deployment}
      confirm: true # Requires Y/N confirmation
  prompts:
    inline: |
      ## My Team Expertise
      When troubleshooting this cluster:
      - Our apps run in the 'production' namespace
      - Deployments use prefix 'myapp-'
      - Check ConfigMaps first for config issues
See examples/skills/helm-ops/ for a complete example.
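To illustrate how a `/<skill>:<command>` line and a `script` template such as `deployment/${deployment}` might fit together, here is a minimal sketch. It is not kai's implementation: `parseSkillCommand` and `renderScript` are hypothetical names, mapping values to declared args by position is an assumption, and the character allow-list is an added safeguard that the README does not describe.

```typescript
// Hypothetical sketch: parse a skill command line and fill its script template.
interface SkillArg { name: string; required?: boolean; }
interface SkillCommand { name: string; script: string; args?: SkillArg[]; confirm?: boolean; }

function parseSkillCommand(input: string): { skill: string; command: string; args: string[] } | null {
  // "/myteam:deploy prod" -> skill "myteam", command "deploy", args ["prod"]
  const m = /^\/([\w-]+):([\w-]+)(?:\s+(.*))?$/.exec(input.trim());
  if (m === null) return null;
  const rest = m[3] ? m[3].trim().split(/\s+/) : [];
  return { skill: m[1], command: m[2], args: rest };
}

function renderScript(cmd: SkillCommand, values: string[]): string {
  const declared = cmd.args || [];
  const byName: { [key: string]: string } = {};
  declared.forEach((arg, i) => {
    const value = values[i]; // assumption: values map to declared args by position
    if (value === undefined) {
      if (arg.required) throw new Error(`Missing required argument: ${arg.name}`);
      return;
    }
    // Assumption: reject shell metacharacters so a value cannot inject commands.
    if (!/^[A-Za-z0-9._\/=:-]+$/.test(value)) throw new Error(`Unsafe value for ${arg.name}`);
    byName[arg.name] = value;
  });
  // Replace each ${name} placeholder; unknown placeholders are left untouched.
  return cmd.script.replace(/\$\{(\w+)\}/g, (whole: string, name: string) =>
    Object.prototype.hasOwnProperty.call(byName, name) ? byName[name] : whole);
}
```

Given the `restart` command from the skill.yaml above and the value `myapp-api`, the sketch renders `kubectl rollout restart deployment/myapp-api`, which would then pass through the confirmation prompt because `confirm` is true.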
Skill Capabilities
| Feature | Description |
|---------|-------------|
| commands | Custom /skill:cmd slash commands |
| prompts | Domain knowledge injected into AI context |
| quickActions | Pre-defined action templates |
| env | Environment variables for scripts |
| contextPatterns | Auto-activate for matching K8s contexts |
| confirm | Require Y/N for destructive commands |
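The table lists contextPatterns without specifying the pattern syntax. As a sketch only, the check below assumes glob-style patterns where `*` matches any run of characters, and assumes a skill with no patterns is always active; `globToRegExp` and `isSkillActive` are hypothetical names, not kai APIs.

```typescript
// Hypothetical sketch: decide whether a skill auto-activates for the current context.
// Assumption: patterns are globs in which "*" matches any run of characters.
function globToRegExp(pattern: string): RegExp {
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, "\\$&") // escape regex metacharacters except "*"
    .replace(/\*/g, ".*");                  // "*" becomes "match anything"
  return new RegExp("^" + escaped + "$");
}

function isSkillActive(contextPatterns: string[] | undefined, currentContext: string): boolean {
  // Assumption: a skill without patterns applies to every context.
  if (!contextPatterns || contextPatterns.length === 0) return true;
  return contextPatterns.some((p) => globToRegExp(p).test(currentContext));
}
```

Under these assumptions, a skill declaring `prod-*` would activate for the `prod-cluster` context from the example session but not for `staging-cluster`.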
Security
Skills are secure by design:
- ✅ YAML-only — Skills are declarative configuration; the only things they run are the shell commands declared in their script fields
- ✅ Visible execution — All commands shown before running
- ✅ Confirmation gates — Destructive commands require explicit approval
- ✅ User-controlled — You install and enable skills
How It Works
kai leverages the GitHub Copilot SDK's built-in tools for shell command execution:
┌─────────────────────────────────────────────────────────┐
│ kai │
├─────────────────────────────────────────────────────────┤
│ User Question │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────┐ │
│ │ GitHub Copilot SDK │ │
│ │ ┌─────────────┐ ┌──────────────────────┐ │ │
│ │ │ AI Model │───▶│ Built-in Shell Tool │ │ │
│ │ └─────────────┘ └──────────────────────┘ │ │
│ └─────────────────────────────────────────────────┘ │
│ │ │ │
│ │ ▼ │
│ │ kubectl / helm │
│ │ │ │
│ │ ▼ │
│ │ Kubernetes Cluster │
│ │ │ │
│ ▼ ▼ │
│ Intelligent Response + Command Output │
└─────────────────────────────────────────────────────────┘
Project Structure
kai/
├── src/
│ ├── index.ts # Main entry: CLI args, REPL loop
│ ├── cli/
│ │ ├── commands.ts # Slash command handlers (incl. skill commands)
│ │ ├── ui.ts # Colors, spinners, status bar
│ │ ├── frames.ts # Boxed UI components
│ │ ├── banner.ts # Animated ASCII banner
│ │ ├── quick-actions.ts # Quick action parsing and selection
│ │ ├── watch.ts # Live resource monitoring
│ │ └── autocomplete.ts # Tab completion
│ ├── config/
│ │ ├── system-prompt.ts # K8s expert system prompt + skill injection
│ │ └── preferences.ts # User preferences and history
│ └── skills/ # 🧩 Skills System
│ ├── index.ts # Skills module exports
│ ├── types.ts # TypeScript interfaces
│ ├── loader.ts # Skill discovery and loading
│ └── registry.ts # Command registration, prompt composition
├── examples/
│ └── skills/ # Example skills
│ └── helm-ops/ # Helm operations skill
├── plans/
│ └── kai-skills-architecture.md # Skills system design doc
├── package.json
├── tsconfig.json
└── README.md
Development
# Run with hot reload
npm run dev
# Build for production
npm run build
# Type checking
npx tsc --noEmit
Troubleshooting
Protocol Version Mismatch
If you see an "SDK protocol version mismatch" error:
# Update the Copilot CLI to the latest version
npm update -g @github/copilot
# Verify the update
copilot --version
This happens when the Copilot CLI is outdated and doesn't support the SDK's protocol version.
403 Error on Startup
You're using a GitHub account without Copilot access.
# Switch to an account with a Copilot subscription
gh auth login
kubectl Not Found
Ensure kubectl is installed and in your PATH:
kubectl version --client
BYOK (Bring Your Own Key)
Use --byok to use your own API key instead of GitHub Copilot. The SDK supports:
OpenAI:
export OPENAI_API_KEY=sk-your-key-here
# Optional: custom endpoint (for proxies, local models)
export OPENAI_API_BASE=https://api.openai.com/v1
kai --byok
Azure OpenAI:
export AZURE_OPENAI_API_KEY=your-azure-key
export AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com
# Optional: API version (default: 2024-10-21)
export AZURE_OPENAI_API_VERSION=2024-10-21
kai --byok
Anthropic:
export ANTHROPIC_API_KEY=sk-ant-your-key
# Optional: custom endpoint
export ANTHROPIC_API_BASE=https://api.anthropic.com
kai --byok
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
License
This project is licensed under the MIT License - see the LICENSE file for details.
