# opencode-deepseek-thinking-fix
OpenCode plugin that preserves DeepSeek-V4 (and other reasoning models') thinking content across turns, fixing errors like:

```
The `content[].thinking` in the thinking mode must be passed back to the API.
messages.X.content.0.type: Expected 'thinking' or 'redacted_thinking', but found 'tool_use'
```

Supports both upstream protocols; detection is automatic.
| Protocol | Endpoint | What's preserved |
|---|---|---|
| Anthropic-compatible | `/v1/messages` | `content[]` `thinking` / `redacted_thinking` blocks with `signature` |
| OpenAI-compatible | `/v1/chat/completions` | `reasoning_content` (and `reasoning` fallback) |
## Install
One line in your opencode config:
```json
{
  "$schema": "https://opencode.ai/config.json",
  "plugin": ["opencode-deepseek-thinking-fix"]
}
```

OpenCode will auto-install the package from npm on next startup. If you prefer to install it yourself:
```shell
cd ~/.config/opencode
bun add opencode-deepseek-thinking-fix
# or: npm i opencode-deepseek-thinking-fix
# or: pnpm add opencode-deepseek-thinking-fix
```

That's it. Any provider whose id contains `deepseek` will have its fetch wrapped automatically.
## Local development
```json
{
  "plugin": ["file:///absolute/path/to/opencode-deepseek-thinking-fix/dist/index.js"]
}
```

## Verify it's loaded
Start opencode, send a message that triggers thinking, and set `debugThinking: true` on the provider to see log lines:

```json
{
  "provider": {
    "deepseek": {
      "options": { "debugThinking": true }
    }
  }
}
```

You should see entries like `[deepseek-thinking-fix][anthropic] cached N thinking blocks (stream)` during the first turn and `re-injected thinking blocks` on subsequent turns.
## What it actually does
- Hooks `config` to wrap `provider.options.fetch` on every matching provider.
- On request: inspects the JSON body, identifies the protocol, and re-injects cached thinking content onto any assistant history message that is missing it. For Anthropic-shaped requests it also makes sure the `thinking` parameter stays enabled so the server keeps producing blocks.
- On response: parses the body (streamed or non-streamed) and caches the thinking content against a stable fingerprint of the message prefix.
- If anything unexpected happens, the original request is passed through untouched; the plugin never breaks a chat.
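A minimal sketch of that wrap-and-pass-through shape. The helper names `patchRequestBody` and `cacheFromResponse` are hypothetical stand-ins for the plugin's internals, not real exports:

```typescript
type FetchLike = (input: string, init?: { body?: string }) => Promise<Response>;

// Hypothetical internals, reduced to no-op stubs for illustration:
const patchRequestBody = (body: unknown) => body;        // would re-inject cached thinking
const cacheFromResponse = async (_res: Response) => {};  // would cache new thinking content

function wrapFetch(baseFetch: FetchLike): FetchLike {
  return async (input, init) => {
    let patched = init;
    try {
      // On request: parse the JSON body and re-inject cached thinking blocks.
      if (init?.body) {
        const body = JSON.parse(init.body);
        patched = { ...init, body: JSON.stringify(patchRequestBody(body)) };
      }
    } catch {
      patched = init; // fail-safe: anything unexpected, send the original request untouched
    }
    const res = await baseFetch(input, patched);
    // On response: cache thinking content for the next turn (best-effort).
    try { await cacheFromResponse(res.clone()); } catch { /* ignore */ }
    return res;
  };
}
```

The key design point is the outer `try`/`catch`: a parse failure downgrades to a plain pass-through rather than an error surfaced to the chat.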
## When does it trigger?
Only when all of the following hold; otherwise the request is a no-op pass-through:

- Provider id contains one of `deepseek`, `deep-seek`, `ds-v4`, `dsv4` (case-insensitive).
- `init.body` is a JSON string with a `messages` array.
- URL or body shape looks like Anthropic Messages or OpenAI Chat Completions.
- Response is 2xx.
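The id condition amounts to a simple case-insensitive token match; a sketch (`matchesProvider` is an illustrative name, not a real export):

```typescript
// Token list taken from the trigger conditions above; match is case-insensitive.
const PROVIDER_TOKENS = ["deepseek", "deep-seek", "ds-v4", "dsv4"];

function matchesProvider(providerId: string): boolean {
  const id = providerId.toLowerCase();
  return PROVIDER_TOKENS.some((token) => id.includes(token));
}
```

So a provider named `my-deepseek-relay` or `DSV4-proxy` is wrapped, while `openai` is left alone.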
## Protocol detection
- URL contains `/v1/messages` or ends with `/messages` → Anthropic.
- URL contains `/chat/completions` → OpenAI.
- Fallback by body shape:
  - top-level `system` field or `thinking` param → Anthropic
  - `system`/`developer` role inside `messages`, or `reasoning_effort` at the top → OpenAI
- Otherwise: pass through unchanged.
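That detection order can be sketched as follows (`detectProtocol` is a made-up name; the plugin's real internals may differ):

```typescript
type Protocol = "anthropic" | "openai" | null;

function detectProtocol(url: string, body: Record<string, unknown>): Protocol {
  // 1. URL checks come first.
  if (url.includes("/v1/messages") || url.endsWith("/messages")) return "anthropic";
  if (url.includes("/chat/completions")) return "openai";
  // 2. Fallback by body shape.
  if ("system" in body || "thinking" in body) return "anthropic";
  const messages = body.messages as Array<{ role?: string }> | undefined;
  const hasSystemRole = messages?.some((m) => m.role === "system" || m.role === "developer");
  if ("reasoning_effort" in body || hasSystemRole) return "openai";
  // 3. Otherwise: pass through unchanged.
  return null;
}
```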
## Programmatic use
Use the lower-level helper directly if you are embedding opencode or building your own fetch chain:
```typescript
import { wrapFetchForDeepSeekThinking } from "opencode-deepseek-thinking-fix";

const fetchWithFix = wrapFetchForDeepSeekThinking(globalThis.fetch, {
  debug: true,
  ttlMs: 30 * 60 * 1000,
  ensureThinkingEnabled: true,
  defaultBudgetTokens: 8000,
  handleOpenAI: true,
});
```

### Options
| Option | Default | Description |
|---|---|---|
| `ttlMs` | `1800000` (30 min) | TTL of the thinking-content cache |
| `debug` | `false` | Extra `console.log` lines for injection / cache hits |
| `ensureThinkingEnabled` | `true` | Auto-add `thinking: { type: "enabled", budget_tokens: N }` to Anthropic requests |
| `defaultBudgetTokens` | `8000` | Budget tokens used by the above |
| `handleOpenAI` | `true` | Also handle OpenAI-compatible `/chat/completions` bodies |
| `placeholder.mode` | `"fallback"` | `off` / `fallback` / `always`: inject placeholder `reasoning_content` to rescue old conversations (OpenAI path only) |
| `placeholder.text` | `"(thinking omitted)"` | Text used when a placeholder is injected. Avoid pure-whitespace values; some relays will trim them back to empty. |
| `placeholder.field` | `"reasoning_content"` | Which field to fill: `reasoning_content`, `reasoning`, or both |
## Rescuing previously failed conversations
If you already have sessions that were broken by the "must pass back thinking" error, you can't recover the real thinking content; it was never captured. For OpenAI-compatible endpoints DeepSeek only checks that `reasoning_content` exists and is non-empty (there is no signature to verify), so a harmless placeholder is enough to get the history to replay.
This is exactly what `placeholder.mode: "fallback"` does (the default): on a cache miss, every assistant message in the request gets a `reasoning_content` placeholder so the request goes through. Going forward, new turns are captured for real and replace the placeholder.
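A minimal sketch of what that fallback does to an OpenAI-shaped message list (`injectPlaceholders` is an illustrative name, not a real export):

```typescript
interface ChatMessage {
  role: string;
  content: string;
  reasoning_content?: string;
}

// On a cache miss, fill missing/empty reasoning_content on assistant turns only;
// user and system messages are left untouched.
function injectPlaceholders(
  messages: ChatMessage[],
  text = "(thinking omitted)",
): ChatMessage[] {
  return messages.map((m) =>
    m.role === "assistant" && !m.reasoning_content
      ? { ...m, reasoning_content: text }
      : m,
  );
}
```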
Configure it via the opencode provider options:
```json
{
  "provider": {
    "deepseek": {
      "options": {
        "debugThinking": true,
        "thinkingPlaceholder": { "mode": "fallback", "text": "(omitted)" }
      }
    }
  }
}
```

Turn it off if you'd rather see the error than risk sending a fake reasoning field:
```json
{ "provider": { "deepseek": { "options": { "thinkingPlaceholder": { "mode": "off" } } } } }
```

Note: Anthropic-compatible requests (`/v1/messages`) never get placeholders. The signature on thinking blocks is cryptographic and can't be faked; faking it would just produce a different error. For those endpoints the cache-based replay is your only option, and conversations that never went through the plugin cannot be recovered.
## How the fingerprint works
Thinking content is cached under an FNV-1a hash of the message prefix that preceded the assistant turn. The hash excludes previously cached thinking blocks, signatures, and text whitespace, and includes model, role, text content, `tool_use` / `tool_result` payloads, and the protocol tag. That way the Anthropic and OpenAI caches never collide, and a retry with the same history always hits the same slot.
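The scheme can be sketched as follows. The exact fields and normalization the plugin hashes are its own; this only shows the shape:

```typescript
// 32-bit FNV-1a over a serialized message prefix.
function fnv1a(s: string): number {
  let h = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < s.length; i++) {
    h ^= s.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0; // FNV prime
  }
  return h >>> 0;
}

function fingerprint(
  model: string,
  protocol: "anthropic" | "openai",
  prefix: Array<{ role: string; content: string }>,
): string {
  // Whitespace is stripped so reflowed text hits the same slot; the real
  // plugin also skips previously cached thinking blocks and signatures.
  const parts = prefix.map((m) => `${m.role}:${m.content.replace(/\s+/g, "")}`);
  return fnv1a([model, protocol, ...parts].join("\u0000")).toString(16);
}
```

Including the protocol tag in the hashed input is what keeps the Anthropic and OpenAI caches from colliding on otherwise identical histories.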
## FAQ
**Does it work with newapi / OpenRouter / one-api / LiteLLM?** Yes, as long as you name the opencode provider with `deepseek` in its id (or any of the fallback tokens) and the relay forwards either Anthropic `/v1/messages` or OpenAI `/chat/completions` traffic.

**Does it touch non-DeepSeek providers?** No. Providers whose id doesn't match are left completely alone.

**Does it break non-thinking / non-reasoning calls?** No. If the request doesn't have an assistant history that needs patching, nothing changes. If the response contains no thinking content, nothing is cached.

**Where is the cache?** In-memory, per opencode process. TTL is 30 minutes by default.

**Will it leak thinking content to the server?** It only re-sends the thinking blocks that the server itself returned in the previous turn. Nothing new is synthesized.
## Build from source
```shell
bun install
bun run build
```

Produces `dist/` with ESM + `.d.ts` + sourcemaps.
## Links
- npm: https://www.npmjs.com/package/opencode-deepseek-thinking-fix
- OpenCode plugins docs: https://opencode.ai/docs/plugins/
- Related upstream issue: https://github.com/anomalyco/opencode/issues/16748
## License
MIT
