# Ollama Proxy
A lightweight, high-performance proxy server that restores the original HTTP methods of the Ollama API. Developed primarily for RunPod environments where the built-in proxy strips original HTTP methods, but can be used with any hosting service.
## 🚀 Features

- Full Ollama API Support: Proxies all native Ollama endpoints (`/api/*`)
- OpenAI Compatibility: Supports OpenAI-compatible endpoints (`/v1/*`)
- Streaming Support: Handles streaming responses for chat and generation endpoints
- CORS Enabled: Built-in CORS support for cross-origin requests
- Configurable Timeouts: Extended timeouts for long-running operations
- Error Handling: Robust error handling with detailed logging
- Environment Configuration: Flexible configuration via environment variables
- TypeScript: Written in TypeScript with full type safety
## 📋 Supported Endpoints

### Native Ollama API Endpoints

- `POST /api/chat` - Chat completions
- `POST /api/generate` - Text generation
- `POST /api/embeddings` - Text embeddings
- `POST /api/pull` - Pull models
- `POST /api/push` - Push models
- `POST /api/create` - Create models
- `POST /api/copy` - Copy models
- `POST /api/delete` - Delete models
- `POST /api/show` - Show model info
- `GET /api/tags` - List models
- `GET /api/ls` - List models
- `POST /api/stop` - Stop operations
- `GET /api/version` - Get version
- `POST /api/serve` - Serve models
- `POST /api/unload` - Unload models

### OpenAI-Compatible Endpoints

- `POST /v1/chat/completions` - Chat completions
- `POST /v1/completions` - Text completions
- `GET /v1/models` - List models
- `POST /v1/embeddings` - Text embeddings
## 🛠️ Installation

### Using npx (Recommended)

```bash
npx ollama-proxy-fix
```

### Using npm

```bash
npm install -g ollama-proxy-fix
ollama-proxy-fix
```

### From Source

```bash
git clone https://github.com/Jassu225/ollama-proxy.git
cd ollama-proxy
npm install
npm run build
npm start
```

## ⚙️ Configuration
Pass environment variables on the command line or via the hosting environment to customize the proxy settings:
```bash
# Proxy Configuration
OLLAMA_PROXY_PORT=4000
OLLAMA_PROXY_REQUEST_TIMEOUT=120000 # long-running requests use 3x this value (360000 ms)
OLLAMA_PROXY_REQUEST_BODY_LIMIT=50mb

# Ollama Server Configuration
OLLAMA_HOST=localhost
OLLAMA_PORT=11434
```

### Environment Variables

| Variable | Default | Description |
| --------------------------------- | ----------- | ------------------------------------------- |
| OLLAMA_PROXY_PORT | 4000 | Port for the proxy server |
| OLLAMA_PROXY_REQUEST_TIMEOUT | 120000 | Request timeout in milliseconds (2 minutes) |
| OLLAMA_PROXY_REQUEST_BODY_LIMIT | 50mb | Maximum request body size |
| OLLAMA_HOST | localhost | Ollama server hostname |
| OLLAMA_PORT | 11434 | Ollama server port |
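If the proxy is launched from a Node.js script instead of a shell, the same variables can be passed through the child process environment. A minimal TypeScript sketch (the overridden values below are only examples):

```typescript
import { spawn } from "node:child_process";

// Launch the proxy with overridden settings; any variable omitted here
// falls back to the defaults listed in the table above.
const proxy = spawn("npx", ["ollama-proxy-fix"], {
  env: {
    ...process.env,
    OLLAMA_PROXY_PORT: "8080", // listen on 8080 instead of 4000
    OLLAMA_PROXY_REQUEST_TIMEOUT: "300000", // 5-minute base timeout
    OLLAMA_HOST: "127.0.0.1",
    OLLAMA_PORT: "11434",
  },
  stdio: "inherit", // forward the proxy's logs to this process
});

proxy.on("exit", (code) => console.log(`proxy exited with code ${code}`));
```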
## 🚀 Usage

### Basic Usage

```bash
# Start the proxy server
npx ollama-proxy-fix

# The server will start on port 4000 (or your configured port)
# Proxying requests to Ollama at localhost:11434
```

### Development Mode
```bash
# Clone the repository
git clone https://github.com/Jassu225/ollama-proxy.git
cd ollama-proxy

# Install dependencies
npm install

# Start in development mode with hot reload
npm run dev

# Build for production
npm run build

# Start production server
npm start
```

### Testing the Proxy
Once the proxy is running, you can test it:
```bash
# Check if the proxy is running
curl http://localhost:4000

# Response:
# {
#   "status": "running",
#   "message": "Ollama Proxy Server is running!",
#   "timestamp": "2025-07-28T06:37:21.249Z"
# }
```
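The same check can be done programmatically, for example to wait until the proxy is ready before sending requests. A minimal TypeScript sketch using the built-in `fetch` (Node 18+); the retry count and delay are arbitrary:

```typescript
// Poll the proxy's root endpoint until it reports "running" (see the
// response shape above), or give up after a few attempts.
async function waitForProxy(url = "http://localhost:4000", attempts = 10): Promise<void> {
  for (let i = 0; i < attempts; i++) {
    try {
      const res = await fetch(url);
      const body = (await res.json()) as { status?: string };
      if (body.status === "running") return;
    } catch {
      // proxy not reachable yet; retry below
    }
    await new Promise((resolve) => setTimeout(resolve, 1000)); // wait 1s between attempts
  }
  throw new Error("Ollama proxy did not become ready in time");
}

await waitForProxy();
```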
### Example API Calls

#### Native Ollama API
```bash
# Chat completion
curl -X POST http://localhost:4000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# List models
curl http://localhost:4000/api/tags
```
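The same endpoints can be called from TypeScript/JavaScript with the built-in `fetch` (Node 18+ or the browser). A minimal sketch against the chat endpoint, assuming Ollama's standard non-streaming `/api/chat` response shape:

```typescript
// Non-streaming chat request through the proxy.
const response = await fetch("http://localhost:4000/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama2",
    messages: [{ role: "user", content: "Hello!" }],
    stream: false, // ask for a single JSON response instead of a stream
  }),
});

const data = (await response.json()) as { message?: { content: string } };
console.log(data.message?.content); // assistant reply
```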
#### OpenAI-Compatible API

```bash
# Chat completion
curl -X POST http://localhost:4000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'

# List models
curl http://localhost:4000/v1/models
```
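Because these endpoints follow the OpenAI wire format, existing OpenAI client libraries can simply be pointed at the proxy. A sketch using the official `openai` npm package; the API key is a placeholder that a local Ollama setup typically ignores:

```typescript
import OpenAI from "openai";

// Point the OpenAI client at the proxy instead of api.openai.com.
const client = new OpenAI({
  baseURL: "http://localhost:4000/v1",
  apiKey: "ollama", // placeholder; required by the client library, not the proxy
});

const completion = await client.chat.completions.create({
  model: "llama2",
  messages: [{ role: "user", content: "Hello!" }],
});

console.log(completion.choices[0].message.content);
```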
## 🔧 Advanced Features

### Streaming Support
The proxy automatically handles streaming responses when `stream: true` is set in the request body:
```bash
curl -X POST http://localhost:4000/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama2",
    "messages": [{"role": "user", "content": "Tell me a story"}],
    "stream": true
  }'
```
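From TypeScript, the streamed response can be consumed with `fetch` and a stream reader. A minimal sketch (Node 18+), assuming Ollama's usual chat-stream format of one JSON object per line with a `message.content` fragment and a final `done: true`:

```typescript
// Stream a chat response through the proxy and print tokens as they arrive.
const res = await fetch("http://localhost:4000/api/chat", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    model: "llama2",
    messages: [{ role: "user", content: "Tell me a story" }],
    stream: true,
  }),
});

const reader = res.body!.getReader();
const decoder = new TextDecoder();
let buffered = "";

while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  buffered += decoder.decode(value, { stream: true });

  // Each complete line is one JSON chunk; keep any partial line for the next read.
  const lines = buffered.split("\n");
  buffered = lines.pop() ?? "";
  for (const line of lines) {
    if (!line.trim()) continue;
    const chunk = JSON.parse(line) as { message?: { content?: string }; done?: boolean };
    process.stdout.write(chunk.message?.content ?? "");
    if (chunk.done) console.log(); // finish with a newline
  }
}
```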
### Extended Timeouts

Long-running operations (`pull`, `push`, `create`, `show`) automatically get extended timeouts (3x the normal timeout) to handle large model operations.
### CORS Support
The proxy includes built-in CORS headers for cross-origin requests:
- `Access-Control-Allow-Origin: *`
- `Access-Control-Allow-Methods: GET, POST, OPTIONS`
- `Access-Control-Allow-Headers: Content-Type, Authorization`
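This means the proxy can be called directly from front-end code served on a different origin, with no extra CORS configuration. A small sketch, assuming Ollama's usual `/api/tags` response shape:

```typescript
// Runs in the browser on any origin; the headers above allow the request.
const res = await fetch("http://localhost:4000/api/tags");
const { models } = await res.json();
console.log(models.map((m: { name: string }) => m.name));
```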
## 🐛 Troubleshooting

### Common Issues

#### Connection Refused (503)

- Ensure Ollama is running on the configured host and port
- Check that `OLLAMA_HOST` and `OLLAMA_PORT` are correct

#### Request Timeout (504)

- Increase `OLLAMA_PROXY_REQUEST_TIMEOUT` for long-running operations
- Check network connectivity to the Ollama server

#### Invalid JSON Body (400)

- Ensure the request body is valid JSON
- Check that the `Content-Type` header is set to `application/json`
## 📊 Performance
- Low Latency: Direct proxy with minimal overhead
- Memory Efficient: Streams responses without buffering
- Scalable: Handles multiple concurrent requests
- Reliable: Robust error handling and recovery
## 🤝 Contributing

- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
## 📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
## 🙏 Acknowledgments
- Built for RunPod environments where HTTP methods are stripped
- Compatible with any Ollama hosting service
- Inspired by the need for proper HTTP method preservation in proxy environments
## 📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
Made with ❤️ for the Ollama community
