#16751 · @altendky · opened Mar 9, 2026 at 1:01 PM UTC · last updated Mar 21, 2026 at 8:48 PM UTC

fix(session): fix root causes and reconstruction of tool_use/tool_result mismatch (#16749)

appfix
69
+11512 files

Score breakdown

Impact

9.0

Clarity

10.0

Urgency

8.0

Ease Of Review

9.0

Guidelines

9.0

Readiness

8.0

Size

6.0

Trust

8.0

Traction

0.0

Summary

This PR provides a three-layered defense-in-depth fix for a widespread and critical "tool_use/tool_result mismatch" error that corrupts user sessions. It addresses root causes in stream processing and adds a reconstruction-time safety net, supported by real-world evidence and comprehensive new tests.

Open in GitHub

Description

Issue for this PR

Closes #16749 Related: #10616, #8377, #2720, #1662, #5750, #2214, #8312, #8010

Type of change

  • [x] Bug fix
  • [ ] New feature
  • [ ] Refactor / code improvement
  • [ ] Documentation

What does this PR do?

Fixes the root causes and provides a reconstruction-time safety net for the widespread tool_use ids were found without tool_result blocks immediately after error that corrupts sessions and makes them unrecoverable.

The fix is three layers of defense-in-depth, each catching what the previous one misses:

Layer 1 — processor.ts: Tool-error race condition (line 211)

The tool-error handler only processed errors for tools in "running" status. Due to the AI SDK's merged-stream event ordering, tool-error can arrive before tool-call, when the tool is still "pending". The error was silently dropped, leaving the tool in "pending" state to be cleaned up later as "Tool execution aborted" with empty input: {}.

Fix: Accept tool-error for both "running" and "pending" status. Uses Date.now() as start time for pending tools (which don't have a time.start field).

Layer 2 — processor.ts: Recovery step-finish before retry (line 374)

When a stream error interrupts processing before finish-step is reached, or the finish-step handler itself throws, the step boundary is never written. The retry loop's continue creates a new stream whose events are appended to the same DB message without a step-finish/step-start boundary. Both steps' content merges into one message, and toModelMessages() produces a single assistant block with interleaved tool_use/text that the Anthropic API rejects.

Fix: Before continueing the retry loop, scan parts backward for an unclosed step (step-start without a matching step-finish). If found, write a recovery step-finish with reason: "error" and zero tokens/cost. Wrapped in try/catch so recovery failures don't block the retry.

Layer 3 — message-v2.ts: Synthetic step-start injection (line 623)

A reconstruction-time safety net that handles already-corrupted DB data regardless of how step boundaries were lost.

Fix: In toModelMessages(), track whether we've seen a tool part in the current step (sawTool flag). If text or reasoning appears after a tool part without an intervening step-start, inject a synthetic { type: "step-start" } to force the AI SDK to split content into separate assistant+tool blocks.

How layers interact

| Layer | Where it acts | What it prevents | |-------|--------------|-----------------| | Layer 1 (tool-error race) | Stream event handling | Silent error drops that leave tools in wrong state | | Layer 2 (recovery step-finish) | Retry loop, before continue | DB corruption at write time — ensures step boundaries are preserved | | Layer 3 (synthetic step-start) | Message reconstruction | Handles already-corrupted DB data + any future edge cases the above layers miss |

Real-world evidence

Session ses_32fb35486ffeeJAHmplKU1gB2t, message msg_cd05ba534001gICo48Lsy1NHWp:

part_id                        | type        | tool  | status | error
-------------------------------+-------------+-------+--------+------------------------
prt_cd05bb9ac001...            | step-start  |       |        |
prt_cd05bb9ad001...            | text        |       |        |
prt_cd05bb9f0001...            | tool        | write | error  | Tool execution aborted
                                                                 ← 96 SECOND GAP
prt_cd05d3273001...            | text        |       |        |
prt_cd05d35a8001...            | tool        | write | completed |
prt_cd05f3c5d001...            | step-finish |       |        |
  • The errored tool has input: {}tool-error was dropped because status was "pending" (Layer 1 root cause)
  • No step-finish/step-start boundary between the two groups (Layer 2 root cause)
  • The 96-second gap is the retry delay

How did you verify your code works?

  • All 20 message-v2 tests pass, 0 failures
  • New test constructs the exact corrupted DB pattern (two merged steps with [step-start, text, tool(error), text, tool(completed)]) and asserts the structural invariant: no text or reasoning part appears after a tool-call part in the same assistant ModelMessage
  • Before the fix: Content types in this message: [text, tool-call, text, tool-call]
  • After the fix: passes (content split into separate blocks)
  • 6 pre-existing compaction test failures unrelated to this change

Files changed

| File | Change | |------|--------| | packages/opencode/src/session/processor.ts | Tool-error race fix (accept "pending") + recovery step-finish before retry | | packages/opencode/src/session/message-v2.ts | Synthetic step-start injection in toModelMessages() | | packages/opencode/test/session/message-v2.test.ts | Test reproducing corrupted DB interleaving pattern |

Checklist

  • [x] I have tested my changes locally
  • [x] I have not included unrelated changes in this PR

Linked Issues

#16749 Missing step-finish/step-start parts after retryable stream errors cause tool_use/tool_result mismatch

View issue

Comments

No comments.

Changed Files

packages/opencode/src/session/message-v2.ts

+171
@@ -683,17 +683,32 @@ export namespace MessageV2 {
role: "assistant",
parts: [],
}
// Track whether we've seen a tool part in the current step.
// If text/reasoning appears after a tool part without an intervening
// step-start, it means a step boundary was lost (e.g. finish-step
// handler threw during a retryable error). Inject a synthetic
// step-start to force the AI SDK to split content into separate blocks,
// preventing invalid interleaved tool_use/text in one assistant message.
let sawTool = false
for (const part of msg.parts) {
if (part.type === "text" || part.type === "reasoning") {
if (sawTool) {
assistantMessage.parts.push({ type: "step-start" })
sawTool = false
}
}
if (part.type === "text")
assistantMessage.parts.push({
type: "text",
text: part.text,
...(differentModel ? {} : { providerMetadata: part.metadata }),
})
if (part.type === "step-start")
if (part.type === "step-start") {

packages/opencode/test/session/message-v2.test.ts

+980
@@ -788,6 +788,104 @@ describe("session.message-v2.toModelMessage", () => {
},
])
})
test("does not produce interleaved tool-call and text/reasoning in a single assistant block when step boundaries are missing", () => {
// When the finish-step handler in processor.ts throws during a retryable error,
// step-finish for step 1 and step-start for step 2 are never saved. Both steps'
// content merges into one DB message without boundaries. On replay,
// convertToModelMessages() produces a single assistant block with interleaved
// tool_use/reasoning/text, which the Anthropic API rejects with:
// "tool_use ids were found without tool_result blocks immediately after"
// or "Expected thinking or redacted_thinking, but found tool_use"
//
// Real-world DB evidence from session ses_32fb35486ffeeJAHmplKU1gB2t:
// step-start → text → tool(write, error, input={}) → [96s gap] → text → tool(write, completed) → step-finish
// The error tool had "Tool execution aborted" and empty input — no step boundary between the two steps.
const userID = "m-user"
const assistantID = "m-assistant"
const input: Mes