@o-lang/llm-ollama
O-Lang resolver for local LLM inference via Ollama.
Run any Ollama-compatible model — Llama 3, Mistral, Gemma, Phi, DeepSeek — inside governed O-Lang workflows. Zero data leaves your infrastructure.
Why this resolver?
Every other O-Lang LLM resolver (@o-lang/llm-groq, @o-lang/llm-openai) sends data to external APIs. For teams with data residency requirements, air-gapped environments, or a preference for local inference, this resolver closes that gap — with the same O-Lang governance guarantees: allowlists, intent scope, PII redaction, cryptographic audit chain.
File structure
resolver-llm-ollama/
├── .gitignore
├── .npmignore
├── badges/
│ └── llm-ollama-badge.svg
├── capability.js ← core LLM logic (buildMessages, executeLLM, MODELS)
├── capability.test.js ← unit tests (mocked fetch — no Ollama required)
├── conformance.json ← R-005 → R-012 conformance declarations
├── index.js ← O-Lang kernel interface (resolver function + metadata)
├── package.json
├── README.md
├── resolver.js ← O-Lang resolver declaration (inputs, outputs, failures)
└── test-local-resolver.js ← integration tests (requires live Ollama)

Prerequisites
Install Ollama and pull a model:
# macOS
brew install ollama

# Linux
curl -fsSL https://ollama.ai/install.sh | sh

ollama pull llama3.2   # fast — good for bank/ICU style demos
ollama pull llama3.1   # smart — good for RAG, multi-turn chat
ollama serve           # starts local API on :11434

Install the kernel and this resolver:
npm install @o-lang/olang
npm install @o-lang/llm-ollama
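Before wiring the resolver into a workflow, it helps to confirm that Ollama is reachable and the model is pulled. A minimal sketch (save as check-ollama.mjs, Node 18+) using Ollama's standard /api/tags endpoint; the OLLAMA_HOST fallback mirrors the default documented below:

// check-ollama.mjs: sanity-check that Ollama is up and llama3.2 is pulled
const host = process.env.OLLAMA_HOST || 'http://localhost:11434';

const res = await fetch(`${host}/api/tags`); // lists locally pulled models
if (!res.ok) throw new Error(`Ollama responded with HTTP ${res.status}`);

const { models } = await res.json();
const names = models.map((m) => m.name);
console.log('Pulled models:', names.join(', ') || '(none)');

if (!names.some((n) => n.startsWith('llama3.2'))) {
  console.warn('llama3.2 not found. Run: ollama pull llama3.2');
}

If the connection is refused, start the server with ollama serve first.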
Usage in .ol workflows
v1 — simple prompt (bank / ICU style)
Workflow "Balance Check" with customer_id, user_question
Allow resolvers:
- @o-lang/bank-account-lookup
- @o-lang/llm-ollama
Step 1: Ask @o-lang/bank-account-lookup "{customer_id}"
Save as account_info
Step 2: Ask @o-lang/llm-ollama "{user_question}. Balance: {account_info.balance}"
Save as response
Return response

v2 — multi-turn RAG chatbot
Workflow "Private RAG" with user_question, document_path
Allow resolvers:
- @o-lang/semantic-doc-search
- @o-lang/llm-ollama
Prohibit actions:
- send_data_externally
Intent scope: "document_question_answering_only"
Step 1: Ask @o-lang/semantic-doc-search "{user_question}" from "{document_path}"
Save as context_chunks
Step 2: Ask @o-lang/llm-ollama "Answer using only this context:
{context_chunks.text}
Question: {user_question}"
Save as answer
Return answer.text

Environment variables
| Variable | Default | Description |
|---|---|---|
| OLLAMA_HOST | http://localhost:11434 | Ollama server URL |
| OLLAMA_MODEL | llama3.1 | Default model for llm_chat |
| OLLAMA_MAX_TOKENS | 1024 | Default max tokens for llm_chat |
| OLLAMA_NUM_CTX | 4096 | Default context window |
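The defaults can be overridden per environment, for example to point OLLAMA_HOST at a shared GPU host. A minimal sketch of how the resolver presumably reads these variables; the fallbacks shown are simply the documented defaults, and the actual handling in capability.js may differ:

// Assumed env handling; fallbacks match the defaults in the table above
const OLLAMA_HOST = process.env.OLLAMA_HOST || 'http://localhost:11434';
const OLLAMA_MODEL = process.env.OLLAMA_MODEL || 'llama3.1';
const OLLAMA_MAX_TOKENS = Number(process.env.OLLAMA_MAX_TOKENS || 1024);
const OLLAMA_NUM_CTX = Number(process.env.OLLAMA_NUM_CTX || 4096);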
Available models (MODELS constants)
| Constant | Model | Best for |
|---|---|---|
| MODELS.FAST | llama3.2 | Short answers, classification, v1 demos |
| MODELS.SMART | llama3.1 | RAG, multi-turn chat, document reasoning |
| MODELS.REASONING | deepseek-r1 | Complex synthesis, code, math |
Any model available via ollama list can be passed as model at call time.
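The constants are convenience aliases for the model names above. A sketch of picking a model per task; the import path is an assumption, since only the fact that MODELS lives in capability.js is documented here:

// Hypothetical import path; MODELS is documented in capability.js
const { MODELS } = require('@o-lang/llm-ollama/capability');

const classifierModel = MODELS.FAST;      // llama3.2: short answers, v1 demos
const ragModel        = MODELS.SMART;     // llama3.1: RAG, multi-turn chat
const mathModel       = MODELS.REASONING; // deepseek-r1: synthesis, code, math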
Input schema (v2 llm_chat)
| Field | Type | Default | Description |
|---|---|---|---|
| user_message | string | required | The current user turn |
| system_prompt | string | — | System instruction |
| history | array | [] | Prior {role, content} turns |
| model | string | OLLAMA_MODEL env | Ollama model name |
| temperature | number 0–2 | 0.7 | Sampling temperature |
| max_tokens | integer | 1024 | Max response tokens (num_predict) |
| num_ctx | integer | 4096 | Context window size |
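A hedged example of a complete llm_chat input built from the fields above. The field names come straight from the table; how the object reaches the resolver (via the kernel's interpolated prompt or a direct call) depends on your workflow:

// Example llm_chat input; every field except user_message is optional
const input = {
  user_message: 'Summarise the retention policy in three bullet points.',
  system_prompt: 'Answer only from the provided context.',
  history: [
    { role: 'user', content: 'What does the policy cover?' },
    { role: 'assistant', content: 'Data residency and retention periods.' },
  ],
  model: 'llama3.1',  // any name shown by `ollama list`
  temperature: 0.2,   // 0-2, defaults to 0.7
  max_tokens: 512,    // forwarded as Ollama's num_predict
  num_ctx: 4096,      // context window
};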
Output schema (v2 llm_chat)
| Field | Type | Description |
|---|---|---|
| reply | string | Model response |
| model | string | Model that generated the response |
| token_usage.prompt | integer | Prompt tokens used |
| token_usage.completion | integer | Completion tokens generated |
| token_usage.total | integer | Total tokens |
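For orientation, this is roughly how those fields line up with Ollama's non-streaming /api/chat response, which returns message.content plus the prompt_eval_count and eval_count token counters. The exact mapping inside executeLLM is an assumption:

// Assumed mapping from an Ollama /api/chat response to the output schema above
function toOutput(ollamaResponse) {
  const prompt = ollamaResponse.prompt_eval_count ?? 0;  // prompt tokens used
  const completion = ollamaResponse.eval_count ?? 0;     // completion tokens generated
  return {
    reply: ollamaResponse.message.content,
    model: ollamaResponse.model,
    token_usage: { prompt, completion, total: prompt + completion },
  };
}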
Error codes
| Code | Retries | Meaning |
|---|---|---|
| OLLAMA_NOT_RUNNING | 0 | Cannot connect — run ollama serve |
| MODEL_NOT_FOUND | 0 | Model not pulled — run ollama pull <model> |
| EMPTY_PROMPT | 0 | Prompt is blank |
| UNRESOLVED_PLACEHOLDERS | 0 | Prompt contains {variable} that wasn't interpolated |
| LLM_ERROR | 1 | Ollama API error (5xx) |
| TIMEOUT | 1 | Request exceeded timeout |
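How these errors surface to calling code depends on the kernel; the sketch below assumes the failure object carries the code as error.code, which is an assumption rather than a documented contract:

// Hypothetical error handling; assumes failures expose a `code` matching the table
try {
  // ... invoke the resolver here ...
} catch (error) {
  switch (error.code) {
    case 'OLLAMA_NOT_RUNNING':
      console.error('Start the server first: ollama serve');
      break;
    case 'MODEL_NOT_FOUND':
      console.error('Pull the model first: ollama pull <model>');
      break;
    case 'LLM_ERROR':
    case 'TIMEOUT':
      // Already retried once by the resolver (see Retries column)
      console.error('Ollama request failed after retry:', error.message);
      break;
    default:
      throw error;
  }
}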
Running tests
# Unit tests (no Ollama required — uses mocked fetch)
npm test
# Integration tests (requires: ollama serve && ollama pull llama3.2)
npm run test:local
# O-Lang conformance suite (R-005 → R-012)
npm run conformance

Publishing to the O-Lang registry
- Run conformance:
  npm run conformance → generates badge.txt
- Publish to npm:
  npm publish --access public
- Submit at olang.cloud/registry
- O-Lang team independently re-runs the conformance suite before approving
License
MIT — community resolver, not affiliated with Ollama or Anthropic.
