# codex‑litellm
An unofficial, Apache‑2.0‑licensed patch set and distribution of the OpenAI Codex CLI with native LiteLLM support, multi‑layer caching, and production‑grade observability.
Upstream base: `openai/codex` (Apache‑2.0). This project maintains a reproducible patch on top and ships binaries for convenience.
## Highlights
- Direct LiteLLM Integration – Talk to LiteLLM backends natively; no extra proxy needed
- LiteLLM‑side Caching Support – Works with LiteLLM’s literal (Redis), semantic, and provider KV caching, all implemented on the LiteLLM side; see the Wiki for setup
- Provider‑Agnostic – Works with OpenAI, Vercel AI Gateway, xAI, Google Vertex, and more through LiteLLM
- Serious Observability – Debug telemetry, session analytics, JSON logs, and a `/status` command
- Cost Controls – Canonicalization + hashing, prompt segmentation, provider discounts, usage tracking
- Drop‑in Friendly – Fully compatible with the upstream `codex` UX; ships as a `codex-litellm` binary
## Quick Start

### Prerequisites
- At least one LLM provider API key
- A LiteLLM endpoint (self‑hosted or managed)
- CLI access
Note: The project is written in Rust but distributed as an npm package. A full Node.js dev setup is not required for install/use.
### 1) LiteLLM Backend Setup (LiteLLM only)
This fork focuses on native LiteLLM integration. Configure LiteLLM first; robust example configs are maintained in the Wiki.
#### Self‑hosted example
```bash
docker run -d -p 4000:4000 \
  --name litellm-server \
  litellm/litellm:latest \
  --port 4000
```
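To route to specific providers, the proxy reads a YAML config listing the models it should expose. A minimal sketch extending the container above, assuming LiteLLM's standard `model_list` format and the image's `--config` flag (model aliases, keys, and paths here are placeholders; production configs live in the Wiki):

```bash
# Hypothetical minimal proxy config; adjust model aliases and keys for your providers.
cat > litellm_config.yaml << 'EOF'
model_list:
  - model_name: gpt-4o                      # alias that codex-litellm will request
    litellm_params:
      model: openai/gpt-4o                  # provider/model as LiteLLM understands it
      api_key: os.environ/OPENAI_API_KEY    # read from the container environment
EOF

docker run -d -p 4000:4000 \
  --name litellm-server \
  -e OPENAI_API_KEY="sk-..." \
  -v "$(pwd)/litellm_config.yaml:/app/config.yaml" \
  litellm/litellm:latest \
  --config /app/config.yaml --port 4000
```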
### 2) Install codex‑litellm

```bash
npm install -g @avikalpa/codex-litellm

# verify
codex-litellm --version
```

#### OpenWrt

Download the `.ipk` for your architecture from the Releases page:

```bash
opkg install codex-litellm_<version>_<arch>.ipk
```

#### Termux (Android)

Use the provided `.deb` artifacts:

```bash
dpkg -i codex-litellm_<version>_aarch64.deb  # or _x86_64
```

Installed at `$PREFIX/bin/codex-litellm`.
#### Optional: Alias

```bash
# shell profile (~/.bashrc, ~/.zshrc)
alias cdxl='codex-litellm'

# To keep config isolated from upstream codex:
alias cdxl='CODEX_HOME=~/.codex-litellm codex-litellm'

source ~/.bashrc  # or ~/.zshrc
```

### 3) Configure
Set the LiteLLM API base and key for codex‑litellm to talk to your LiteLLM instance. For robust, production‑style examples, see the Wiki.
```bash
export LITELLM_BASE_URL="http://localhost:4000"
export LITELLM_API_KEY="your-litellm-api-key"

mkdir -p ~/.codex-litellm
cat > ~/.codex-litellm/config.toml << 'EOF'
[general]
model_provider = "litellm"
api_base = "http://localhost:4000"

[litellm]
api_key = "your-litellm-api-key"
EOF
```

### 4) Smoke Test
```bash
codex-litellm exec "What is the capital of France?"
codex-litellm exec "List files in current directory"

# Interactive mode
codex-litellm
```

More guides: Wiki (Quick Start, full configs, and routing recipes).
## Architecture
- Patch Philosophy – Reproducible diff against upstream `openai/codex` (inspired by the GrapheneOS approach)
- Dual Binary Strategy – Ships a separate `codex-litellm` binary; does not disturb the stock `codex` workflow
- LiteLLM‑Native – Direct REST integration; graceful fallback for non‑streaming providers
## Caching Strategy
- Tier‑0 (Exact‑Match) – Canonicalization + SHA‑256 on prompts
- Tier‑1 (Literal) – Redis byte‑identical request caching via LiteLLM (server‑side sketch after this list)
- Tier‑2 (Semantic) – Embedding‑based similarity caching with tunable thresholds
- Tier‑3 (Provider) – Provider KV/prompt cache utilization when available
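The literal and semantic tiers are configured in LiteLLM itself, not in this CLI. A hedged sketch of the server‑side piece, assuming LiteLLM's `litellm_settings.cache` options and a local Redis (key names vary across LiteLLM versions, so confirm against the LiteLLM docs and the project Wiki):

```bash
# Append to the proxy config from the Quick Start sketch (or your own LiteLLM config).
cat >> litellm_config.yaml << 'EOF'
litellm_settings:
  cache: true
  cache_params:
    type: redis        # Tier-1: byte-identical request caching
    host: localhost
    port: 6379
    ttl: 3600          # seconds to keep cached responses
EOF
# Tier-2 (semantic) caching is also a LiteLLM-side feature: an embedding-based cache
# type with a tunable similarity threshold; see the LiteLLM docs for the exact keys.
```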
## Observability
- Debug Telemetry – Onboarding, model routing, network calls
- Session Analytics – Token usage, cache hit‑rates, per‑model stats
- Structured Logs – JSON logs with size‑based rotation (see the query sketch after this list)
- Live Status – the `/status` command shows health, usage, and routing
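The session log is plain JSONL, so per‑model rollups can be done with standard tools. A sketch assuming the default telemetry settings shown later (`dir = "logs"` under `CODEX_HOME`) and hypothetical field names such as `model` and `total_tokens`; inspect a few lines first to see the real schema:

```bash
# Path assumes CODEX_HOME=~/.codex-litellm and the [telemetry] dir/file names shown below.
LOG=~/.codex-litellm/logs/codex-litellm-session.jsonl

# Look at the actual schema before relying on any field names.
head -n 3 "$LOG" | jq .

# Hypothetical rollup: total tokens per model (field names are assumptions).
jq -s 'group_by(.model) | map({model: .[0].model, tokens: (map(.total_tokens // 0) | add)})' "$LOG"
```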
## Repository Layout
```text
├── build.sh           # Reproducible patch+build pipeline
├── stable-tag.patch   # Patchset against upstream
├── config.toml        # Sample configuration
├── docs/              # Project docs
│   ├── PROJECT_SUMMARY.md
│   ├── TODOS.md
│   ├── EXCLUSIVE_FEATURES.md
│   └── TELEMETRY.md
├── scripts/           # npm installer utilities
├── bin/               # launcher shim
├── litellm/           # LiteLLM integration modules
├── codex/             # Upstream checkout (excluded from VCS in releases unless noted)
└── dist/              # Built artifacts (gitignored)
```

## Configuration Notes
### LiteLLM (minimal)

```toml
[general]
api_base = "http://your-litellm-proxy:4000"
model_provider = "litellm"

[litellm]
api_key = "<your-litellm-api-key>"
# Provider endpoints (base_url, etc.) are configured in your LiteLLM server;
# this CLI just talks to LiteLLM.
```

### Telemetry
```toml
[telemetry]
dir = "logs"
max_total_bytes = 104857600  # 100 MB

[telemetry.logs.debug]
enabled = true

[telemetry.logs.session]
file = "codex-litellm-session.jsonl"
```

### Context Window
```toml
[general]
context_length = 130000
```

## Local Development
Requirements: Rust 1.70+, Node 18+, Redis (for caching features)
```bash
# clone & build
./build.sh

# Android/Termux cross-compile
USE_CROSS=1 TARGET=aarch64-linux-android ./build.sh

# Dev loop (no full build)
export CODEX_HOME=$(pwd)/test-workspace/.codex
./setup-test-env.sh
cd codex
git checkout rust-v0.53.0
git apply ../stable-tag.patch
cargo build --bin codex
./codex-rs/target/debug/codex exec "test prompt"
```

## CI/CD
GitHub Actions builds artifacts for:
- Linux (x64, arm64)
- macOS (x64, arm64)
- Windows (x64, arm64)
- FreeBSD (x64)
- Illumos (x64)
- Android (arm64)
Releases attach binaries and publish to npm when a GitHub Release is created.
## Performance & Cost

### Best Practices
- Tier‑0 exact‑match canonicalization + hashing
- Prompt segmentation – keep boilerplate separable (illustrated after this list)
- Semantic thresholds – code: ~0.90; NL: 0.86–0.88
- Provider selection – prefer providers with KV/prompt cache discounts
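One way to act on the prompt‑segmentation point: keep stable boilerplate in a file that is reused byte‑for‑byte, so provider prompt/KV caches see an identical prefix and semantic‑cache comparisons are not dominated by boilerplate; only the task line changes between runs. A sketch with hypothetical file names and prompt text:

```bash
# Stable, reusable boilerplate (identical bytes on every run helps prefix-based caches).
cat > project-brief.md << 'EOF'
You are working in the acme-api repository. Follow docs/STYLE.md and keep diffs minimal.
EOF

# Only the trailing task text varies between runs.
codex-litellm exec "$(cat project-brief.md)

Task: add input validation to the /users endpoint"
```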
### Monitoring
```bash
codex-litellm /status

export RUST_LOG=debug
codex-litellm exec "your prompt" 2> debug.log
```

Tracks: tokens/session, cache hit‑rates per tier, latency/rate limits, and cost per provider.
## Troubleshooting
- "No assistant message" – Check LiteLLM connectivity, API permissions, inspect debug logs
- Low cache hit‑rate – Enable canonicalization; tune thresholds; segment prompts
- Context pressure – Reduce
context_length; use/compactto prune
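For the connectivity check, the LiteLLM proxy can be probed directly; a sketch assuming the standard proxy routes (`/health`, `/v1/models`) and the env vars from the Quick Start:

```bash
# Verify the proxy is reachable and the key is accepted before blaming the CLI.
curl -sS "$LITELLM_BASE_URL/health" \
  -H "Authorization: Bearer $LITELLM_API_KEY"

# List the models the proxy will actually serve (aliases must match your config).
curl -sS "$LITELLM_BASE_URL/v1/models" \
  -H "Authorization: Bearer $LITELLM_API_KEY"
```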
```bash
# Deep debug mode
export RUST_LOG=debug
export CODEX_HOME=./debug-workspace
codex-litellm exec "debug test" 2> debug.log
```

## Documentation
### Project Docs
- `docs/EXCLUSIVE_FEATURES.md` – LiteLLM‑only extras
- `docs/PROJECT_SUMMARY.md` – Current state & internals
- `docs/TODOS.md` – Roadmap
- `AGENTS.md` – Agent workflows
- `TASK.md` – Daily dev notes
### Wiki
- An Example of LiteLLM Configuration – production setup
- Model Routing Recipes – cost/latency trade‑offs
- Embedding Geometry Shootout – semantic cache tuning
- Agentic CLI Cost Playbook – budgeting patterns
## Contributing
- Fork and create a feature branch
- Check `docs/TODOS.md`
- Add/update telemetry where relevant
- Keep patches focused; regenerate `stable-tag.patch`
- Open a PR with clear tests and notes
## Licensing
- Repository contents: Apache License 2.0; see `LICENSE`.
- Upstream base: `openai/codex` (Apache‑2.0).
Releases built from upstream sources bundle both LICENSE and NOTICE so downstream redistributors have the required notices.
Unofficial fork, no affiliation or endorsement implied.
Disclaimer: This project is provided “as‑is” with no warranty. Nothing here is legal advice. For edge‑case licensing questions, consult an attorney.
