#16751 · @altendky · opened Mar 9, 2026 at 1:01 PM UTC · last updated Mar 21, 2026 at 8:48 PM UTC

fix(session): fix root causes and reconstruction of tool_use/tool_result mismatch (#16749)

appfix

+115−12 files

Score breakdown

Impact

9.0

Clarity

10.0

Urgency

8.0

Ease Of Review

9.0

Guidelines

9.0

Readiness

8.0

Size

6.0

Trust

8.0

Traction

0.0

Summary

This PR provides a three-layered defense-in-depth fix for a widespread and critical "tool_use/tool_result mismatch" error that corrupts user sessions. It addresses root causes in stream processing and adds a reconstruction-time safety net, supported by real-world evidence and comprehensive new tests.

Open in GitHub

Description

Issue for this PR

Closes #16749 Related: #10616, #8377, #2720, #1662, #5750, #2214, #8312, #8010

Type of change

[x] Bug fix
[ ] New feature
[ ] Refactor / code improvement
[ ] Documentation

What does this PR do?

Fixes the root causes and provides a reconstruction-time safety net for the widespread tool_use ids were found without tool_result blocks immediately after error that corrupts sessions and makes them unrecoverable.

The fix is three layers of defense-in-depth, each catching what the previous one misses:

Layer 1 — `processor.ts`: Tool-error race condition (line 211)

The tool-error handler only processed errors for tools in "running" status. Due to the AI SDK's merged-stream event ordering, tool-error can arrive before tool-call, when the tool is still "pending". The error was silently dropped, leaving the tool in "pending" state to be cleaned up later as "Tool execution aborted" with empty input: {}.

Fix: Accept tool-error for both "running" and "pending" status. Uses Date.now() as start time for pending tools (which don't have a time.start field).

Layer 2 — `processor.ts`: Recovery step-finish before retry (line 374)

When a stream error interrupts processing before finish-step is reached, or the finish-step handler itself throws, the step boundary is never written. The retry loop's continue creates a new stream whose events are appended to the same DB message without a step-finish/step-start boundary. Both steps' content merges into one message, and toModelMessages() produces a single assistant block with interleaved tool_use/text that the Anthropic API rejects.

Fix: Before continueing the retry loop, scan parts backward for an unclosed step (step-start without a matching step-finish). If found, write a recovery step-finish with reason: "error" and zero tokens/cost. Wrapped in try/catch so recovery failures don't block the retry.

Layer 3 — `message-v2.ts`: Synthetic step-start injection (line 623)

A reconstruction-time safety net that handles already-corrupted DB data regardless of how step boundaries were lost.

Fix: In toModelMessages(), track whether we've seen a tool part in the current step (sawTool flag). If text or reasoning appears after a tool part without an intervening step-start, inject a synthetic { type: "step-start" } to force the AI SDK to split content into separate assistant+tool blocks.

How layers interact

| Layer | Where it acts | What it prevents | |-------|--------------|-----------------| | Layer 1 (tool-error race) | Stream event handling | Silent error drops that leave tools in wrong state | | Layer 2 (recovery step-finish) | Retry loop, before continue | DB corruption at write time — ensures step boundaries are preserved | | Layer 3 (synthetic step-start) | Message reconstruction | Handles already-corrupted DB data + any future edge cases the above layers miss |

Real-world evidence

Session ses_32fb35486ffeeJAHmplKU1gB2t, message msg_cd05ba534001gICo48Lsy1NHWp:

part_id                        | type        | tool  | status | error
-------------------------------+-------------+-------+--------+------------------------
prt_cd05bb9ac001...            | step-start  |       |        |
prt_cd05bb9ad001...            | text        |       |        |
prt_cd05bb9f0001...            | tool        | write | error  | Tool execution aborted
                                                                 ← 96 SECOND GAP
prt_cd05d3273001...            | text        |       |        |
prt_cd05d35a8001...            | tool        | write | completed |
prt_cd05f3c5d001...            | step-finish |       |        |

The errored tool has input: {} — tool-error was dropped because status was "pending" (Layer 1 root cause)
No step-finish/step-start boundary between the two groups (Layer 2 root cause)
The 96-second gap is the retry delay

How did you verify your code works?

All 20 message-v2 tests pass, 0 failures
New test constructs the exact corrupted DB pattern (two merged steps with [step-start, text, tool(error), text, tool(completed)]) and asserts the structural invariant: no text or reasoning part appears after a tool-call part in the same assistant ModelMessage
Before the fix: Content types in this message: [text, tool-call, text, tool-call]
After the fix: passes (content split into separate blocks)
6 pre-existing compaction test failures unrelated to this change

Files changed

| File | Change | |------|--------| | packages/opencode/src/session/processor.ts | Tool-error race fix (accept "pending") + recovery step-finish before retry | | packages/opencode/src/session/message-v2.ts | Synthetic step-start injection in toModelMessages() | | packages/opencode/test/session/message-v2.test.ts | Test reproducing corrupted DB interleaving pattern |

Checklist

[x] I have tested my changes locally
[x] I have not included unrelated changes in this PR

Linked Issues

#16749 Missing step-finish/step-start parts after retryable stream errors cause tool_use/tool_result mismatch

View issue

Comments

No comments.

Changed Files

packages/opencode/src/session/message-v2.ts

+17−1

@@ -683,17 +683,32 @@ export namespace MessageV2 {

role: "assistant",

parts: [],

}

// Track whether we've seen a tool part in the current step.

// If text/reasoning appears after a tool part without an intervening

// step-start, it means a step boundary was lost (e.g. finish-step

// handler threw during a retryable error). Inject a synthetic

// step-start to force the AI SDK to split content into separate blocks,

// preventing invalid interleaved tool_use/text in one assistant message.

let sawTool = false

for (const part of msg.parts) {

if (part.type === "text" || part.type === "reasoning") {

if (sawTool) {

assistantMessage.parts.push({ type: "step-start" })

sawTool = false

}

if (part.type === "text")

assistantMessage.parts.push({

type: "text",

text: part.text,

...(differentModel ? {} : { providerMetadata: part.metadata }),

})

if (part.type === "step-start")

if (part.type === "step-start") {

packages/opencode/test/session/message-v2.test.ts

+98−0

@@ -788,6 +788,104 @@ describe("session.message-v2.toModelMessage", () => {

])

})

test("does not produce interleaved tool-call and text/reasoning in a single assistant block when step boundaries are missing", () => {

// When the finish-step handler in processor.ts throws during a retryable error,

// step-finish for step 1 and step-start for step 2 are never saved. Both steps'

// content merges into one DB message without boundaries. On replay,

// convertToModelMessages() produces a single assistant block with interleaved

// tool_use/reasoning/text, which the Anthropic API rejects with:

// "tool_use ids were found without tool_result blocks immediately after"

// or "Expected thinking or redacted_thinking, but found tool_use"

// Real-world DB evidence from session ses_32fb35486ffeeJAHmplKU1gB2t:

// step-start → text → tool(write, error, input={}) → [96s gap] → text → tool(write, completed) → step-finish

// The error tool had "Tool execution aborted" and empty input — no step boundary between the two steps.

const userID = "m-user"

const assistantID = "m-assistant"

const input: Mes