#18137 · @BYK · opened Mar 18, 2026 at 9:37 PM UTC · last updated Mar 21, 2026 at 10:24 AM UTC
fix(opencode): reduce memory usage during prompting with lazy boundary scan and context windowing
Summary
This PR implements three targeted optimizations (lazy boundary scanning, context windowing, and prompt-loop caching) to cut peak memory usage during prompting in opencode from 4-8GB down to ~1.2GB. This addresses a critical performance bottleneck identified in issue #18136.
Description
Issue for this PR
Closes #18136
Type of change
- [x] Bug fix
- [ ] New feature
- [x] Refactor / code improvement
- [ ] Documentation
What does this PR do?
Three targeted changes to reduce peak RSS during prompting from ~4-8GB down to ~1.2GB:
1. Lazy compaction boundary scan (filterCompactedLazy)
The prompt loop calls filterCompacted(stream(sessionID)) which streams ALL messages newest→oldest, loading parts for every message. For compacted sessions, most of those parts are discarded once the boundary is found.
New approach: probe the newest 50 message infos (1 DB query, no parts). If a compaction summary is detected, use a two-phase scan — info-only scan to find the boundary, then hydrate parts only for messages after it. If no compaction summary is found, fall back to the original single-pass filterCompacted(stream()) to avoid wasted info-only queries.
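The probe-then-two-phase logic described above can be sketched as follows. This is an illustrative sketch, not opencode's actual code: `MessageInfo`, `hasRecentSummary`, `findBoundary`, and `selectWindow` are hypothetical names, and the real implementation works against DB queries and streams rather than in-memory arrays.

```typescript
// Hypothetical sketch of the lazy compaction-boundary scan.
// Messages are ordered newest→oldest, matching the PR's description.
type MessageInfo = { id: string; isCompactionSummary: boolean }

const PROBE_SIZE = 50 // newest infos fetched by the cheap probe (1 query, no parts)

// Probe: does a compaction summary appear among the newest infos?
function hasRecentSummary(infos: MessageInfo[]): boolean {
  return infos.slice(0, PROBE_SIZE).some((m) => m.isCompactionSummary)
}

// Phase 1: info-only scan to locate the boundary index.
function findBoundary(infos: MessageInfo[]): number {
  const i = infos.findIndex((m) => m.isCompactionSummary)
  return i === -1 ? infos.length : i
}

// Phase 2: keep only messages up to and including the boundary summary;
// only these would have their parts hydrated. If the probe finds no summary,
// the caller falls back to the original single-pass filterCompacted(stream()).
function selectWindow(infos: MessageInfo[]): MessageInfo[] {
  if (!hasRecentSummary(infos)) return infos
  return infos.slice(0, findBoundary(infos) + 1)
}
```

The fallback matters: for uncompacted sessions, an info-only scan of the whole history would be a pure extra cost, so the original path is kept for that case.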
2. Context-window message windowing
toModelMessages was called with ALL messages (e.g., 7,704 for a long session), creating ModelMessage wrapper objects for every one. These flow through 4-5 copy layers (toModelMessages → convertToModelMessages → ProviderTransform.message → convertToLanguageModelPrompt), each creating ~60MB of wrapper objects.
Now the prompt loop estimates which messages from the tail fit in the LLM context window (model.limit.context × 4 chars/token) and only passes those to toModelMessages. For a 7,704-message session where ~200 fit, this cuts the conversion pipeline from ~300MB to ~10MB.
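The tail-window estimate can be sketched like this. The 4 chars/token heuristic and the `model.limit.context` budget come from the PR description; the `Msg` shape and `windowForContext` name are hypothetical simplifications.

```typescript
// Illustrative sketch of context-window message windowing.
type Msg = { text: string }

const CHARS_PER_TOKEN = 4 // rough heuristic from the PR description

// Walk the tail newest→oldest, accumulating an estimated character budget
// (contextLimit tokens × 4 chars/token); return only the messages that fit,
// preserving their original order.
function windowForContext(messages: Msg[], contextLimit: number): Msg[] {
  const budget = contextLimit * CHARS_PER_TOKEN
  let used = 0
  let start = messages.length
  for (let i = messages.length - 1; i >= 0; i--) {
    used += messages[i].text.length
    if (used > budget) break
    start = i
  }
  return messages.slice(start)
}
```

Only the windowed slice is then handed to toModelMessages, so the 4-5 downstream copy layers allocate wrappers for ~200 messages instead of 7,704.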
3. Prompt loop caching
The conversation is loaded once before the loop. On normal tool-call iterations, only the latest 200-message page is fetched and merged into the cache. Full reload only happens after compaction.
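A minimal sketch of the cache-merge step, assuming messages carry a stable id and some notion of revision; `CachedMsg` and `mergeLatestPage` are illustrative names, and the 200-message page size mirrors the PR description.

```typescript
// Hypothetical sketch of merging the latest page into the cached conversation.
type CachedMsg = { id: string; rev: number }

// Entries from the freshly fetched page (the latest ~200 messages) replace
// any stale cached copies; everything else in the cache is kept as-is.
function mergeLatestPage(cache: CachedMsg[], page: CachedMsg[]): CachedMsg[] {
  const byId = new Map<string, CachedMsg>()
  for (const m of cache) byId.set(m.id, m)
  for (const m of page) byId.set(m.id, m) // latest page wins over stale cache
  return Array.from(byId.values())
}
```

Compaction invalidates older entries wholesale, which is why it remains the one case that triggers a full reload rather than a merge.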
How did you verify your code works?
- Monitored RSS via `/proc/<PID>/status` every 30s during active prompting
- Before: peak 4.8GB, idle ~1GB
- After: peak 1.2GB, idle ~580MB
- All session tests pass (118 pass, 4 skip, 0 fail)
- Tested with both compacted and uncompacted sessions
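The RSS figures above come from `/proc/<PID>/status`, whose `VmRSS` (current resident set) and `VmHWM` (peak resident set, the high-water mark) fields are standard Linux kernel fields. A small sketch of how such a sample could be parsed; the parser itself is illustrative, not part of this PR.

```typescript
// Parse VmRSS and VmHWM (reported in kB) out of a /proc/<pid>/status dump
// and convert to whole MB, matching the rss_mb/hwm_mb columns in the log.
function parseStatusMb(status: string): { rssMb: number; hwmMb: number } {
  const grab = (key: string): number => {
    const m = status.match(new RegExp(`^${key}:\\s+(\\d+) kB`, "m"))
    return m ? Math.round(Number(m[1]) / 1024) : 0
  }
  return { rssMb: grab("VmRSS"), hwmMb: grab("VmHWM") }
}
```

Note that VmHWM is monotonic for the life of the process, which is why the hwm_mb column below stays pinned at the peak while rss_mb falls back to idle.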
Screenshots / recordings
Memory monitoring (30s intervals) after fix:
```
time,rss_mb,hwm_mb
20:46:02,942,1236   ← active prompting
20:48:02,1020,1236  ← peak during tool calls
20:50:02,606,1236   ← settled after activity
20:55:32,568,1236   ← stable idle
```
Checklist
- [x] I have tested my changes locally
- [x] I have not included unrelated changes in this PR
Linked Issues
#18136 perf: prompt loop loads entire conversation history into memory on every step
Comments
No comments.
Changed Files
- packages/app/src/components/dialog-connect-provider.tsx +1−1
- packages/app/src/components/dialog-custom-provider.tsx +1−1
- packages/opencode/src/cli/cmd/tui/component/dialog-provider.tsx +1−1
- packages/opencode/src/session/message-v2.ts +102−0
- packages/opencode/src/session/prompt.ts +43−2