fix(agent): size snapcompact maxFrames by the live model window by roboomp · Pull Request #3249 · can1357/oh-my-pi

roboomp · 2026-06-22T10:04:47Z

Repro

Long-running sessions on any sub-1M-token model (Claude Sonnet 4.5 / Haiku 4.x, GPT-5.x, Gemini 2.x) loop the warning

Warning: compaction: snapcompact could not bring the context under the limit — using an LLM summary instead
Auto context-full maintenance cancelled

on every threshold tick. The numbers reproduce deterministically with the bundled catalog:

import * as snapcompact from "@oh-my-pi/snapcompact";
const sonnet = { id: "claude-sonnet-4-5", api: "anthropic-messages", input: ["text","image"] };
const shape = snapcompact.resolveShape(sonnet);
const huge = "foo bar baz ".repeat(Math.ceil(80 * snapcompact.geometry(shape).capacity / 12) + 1000);
const planned = Math.min(snapcompact.frames(huge, { shape }), snapcompact.MAX_FRAMES_DEFAULT);
const projected = 50_000 + planned * snapcompact.FRAME_TOKEN_ESTIMATE;
const budget = 200_000 - Math.max(0.15 * 200_000, 16384);
console.log({ planned, projected, budget, rejected: projected > budget });
// { planned: 80, projected: 451920, budget: 170000, rejected: true }

Cause

AgentSession#compact and AgentSession.#runAutoCompaction invoked snapcompact.compact(preparation, { convertToLlm, model, shape }) with no maxFrames override, so snapcompact fell through to MAX_FRAMES_DEFAULT = 80. The post-render projection in AgentSession.#projectSnapcompactContextTokens charges FRAME_TOKEN_ESTIMATE = 5024 per frame block (the conservative high-res Anthropic ceiling; see packages/agent/src/compaction/compaction.ts:362-368), so a full 80-frame archive projects to 80 × 5024 ≈ 402k frame tokens.

For a Claude Sonnet–class 200k window the usable budget is 200000 − max(15% of the window, 16384) = 170000, so the projection rejects the snapcompact result before it ever lands, downgrades to action = "context-full", and re-emits the warning on every threshold trigger. MAX_FRAMES_DEFAULT = 80 was sized for Opus 4.7's 1M window (packages/snapcompact/src/snapcompact.ts:387-394), but no per-caller cap was wired in to scale it down for smaller windows.
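
The rejection can be restated as plain arithmetic. The sketch below is not the project's code: usableBudget and projectTokens are hypothetical helper names, and the reserve rule (15% of the window, floored at 16384) and the 50,000-token base are taken from the repro. Under those assumptions it shows why an 80-frame cap passes on a 1M window but is rejected on a 200k window.

```typescript
// Constants quoted in this PR description.
const FRAME_TOKEN_ESTIMATE = 5024;
const MAX_FRAMES_DEFAULT = 80;

// Hypothetical helper: the window minus the larger of 15% of the window
// or a 16384-token floor, mirroring the budget expression in the repro.
function usableBudget(contextWindow: number): number {
  const reserve = Math.max((contextWindow * 15) / 100, 16384);
  return contextWindow - reserve;
}

// Hypothetical helper: base context plus the flat per-frame charge.
function projectTokens(baseTokens: number, frames: number): number {
  return baseTokens + frames * FRAME_TOKEN_ESTIMATE;
}

// 50_000 base tokens is the illustrative figure used in the repro.
const projectedTokens = projectTokens(50_000, MAX_FRAMES_DEFAULT); // 451920

for (const contextWindow of [200_000, 1_000_000]) {
  const budget = usableBudget(contextWindow);
  console.log({ contextWindow, budget, projectedTokens, rejected: projectedTokens > budget });
}
// 200k window: budget 170000, rejected: true
// 1M window:   budget 850000, rejected: false
```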

Fix

Added AgentSession.#computeSnapcompactMaxFrames(preparation, settings), which sizes the frame cap from the live budget: (contextWindow − effectiveReserveTokens − computeNonMessageTokens − Σ estimateTokens(recentMessages) − 4000 summary-text reserve) / snapcompact.FRAME_TOKEN_ESTIMATE, clamped to [0, MAX_FRAMES_DEFAULT]. It uses the same per-frame charge as the projection, so a maxFrames picked by this helper makes the projection succeed by construction.
Threaded the helper into the auto-compaction path (#runAutoCompaction, near the prior action === "snapcompact" block) and the manual /compact path (compact(), near the prior snapcompactReady block). snapcompact.compact(preparation, { … , maxFrames }) is now called with the budget-sized cap.
When maxFrames < 1 (kept-recent alone exhausts the window), snapcompact is skipped entirely rather than run just to be rejected. The user-facing notice now reads "snapcompact: kept history alone exceeds the context budget — using an LLM summary instead", replacing the misleading "could not bring the context under the limit" wording, which implied snapcompact had actually produced something. The downstream projection check is kept as a defensive guard against summary-text drift.
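
A minimal sketch of the cap formula described above, assuming the token counts are already known. sketchMaxFrames and the BudgetInputs fields are hypothetical names for illustration; the real helper is a private AgentSession method that derives these inputs from the live session, and its exact code is not reproduced here.

```typescript
// Constants quoted in this PR description.
const FRAME_TOKEN_ESTIMATE = 5024;
const MAX_FRAMES_DEFAULT = 80;
const SUMMARY_TEXT_RESERVE = 4000;

// Hypothetical input bundle; the real helper computes these from the session.
interface BudgetInputs {
  contextWindow: number;
  reserveTokens: number;
  nonMessageTokens: number;
  keptRecentTokens: number;
}

// Frame cap = leftover budget divided by the per-frame charge,
// clamped to [0, MAX_FRAMES_DEFAULT].
function sketchMaxFrames(b: BudgetInputs): number {
  const frameBudget =
    b.contextWindow - b.reserveTokens - b.nonMessageTokens - b.keptRecentTokens - SUMMARY_TEXT_RESERVE;
  const frames = Math.floor(frameBudget / FRAME_TOKEN_ESTIMATE);
  return Math.max(0, Math.min(frames, MAX_FRAMES_DEFAULT));
}

// 200k window with 50k of non-frame context: 116000 / 5024 rounds down to 23.
console.log(sketchMaxFrames({ contextWindow: 200_000, reserveTokens: 30_000, nonMessageTokens: 10_000, keptRecentTokens: 40_000 }));
// 1M window with the same context: the raw quotient (158) is clamped to 80.
console.log(sketchMaxFrames({ contextWindow: 1_000_000, reserveTokens: 150_000, nonMessageTokens: 10_000, keptRecentTokens: 40_000 }));
// Kept-recent larger than the window: the cap is 0, so the caller would skip snapcompact.
console.log(sketchMaxFrames({ contextWindow: 200_000, reserveTokens: 30_000, nonMessageTokens: 10_000, keptRecentTokens: 250_000 }));
```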

Single fixed-cost commit; no caller signatures changed.

Verification

bun --cwd packages/coding-agent check (Biome + tsgo) — passes.
bun --cwd packages/coding-agent test test/compaction.test.ts test/compaction-lifecycle.test.ts test/compaction-hooks.test.ts test/compaction-serialization.test.ts test/agent-session-auto-compaction-queue.test.ts test/agent-session-eager-compaction.test.ts test/snapcompact-inline.test.ts test/agent-session-snapcompact-budget.test.ts — 77 pass, 9 skip, 0 fail across 8 files.
New regression packages/coding-agent/test/agent-session-snapcompact-budget.test.ts (2 cases):
- asserts that for a 200k-window Sonnet session the captured snapcompact.compact() maxFrames is < MAX_FRAMES_DEFAULT, > 0, and maxFrames × FRAME_TOKEN_ESTIMATE < (window − reserve);
- asserts that with one synthetic kept-recent message larger than the model window, snapcompact.compact() is never invoked and the "kept history alone exceeds the context budget" notice fires instead.
  Both fail without this diff (confirmed by stashing src/session/agent-session.ts) and pass with it.

Fixes #3247

Snapcompact's bundled MAX_FRAMES_DEFAULT (80) × FRAME_TOKEN_ESTIMATE (5024) ≈ 402k tokens' worth of frames. AgentSession was calling snapcompact.compact() with no maxFrames override, so the post-render projection inside #runAutoCompaction / compact() always overflowed the budget on any sub-1M-token window (Claude Sonnet 4.5's 200k window leaves 170k usable, and the 80-frame projection alone exceeds that by about 2.4×), looping the 'snapcompact could not bring the context under the limit — using an LLM summary instead' warning on every threshold tick.

AgentSession.#computeSnapcompactMaxFrames now sizes the frame cap from the resolved budget, (window − reserve − non-message − kept-recent − summary-text reserve) / FRAME_TOKEN_ESTIMATE, clamped to MAX_FRAMES_DEFAULT, and threads it into snapcompact.compact() in both the auto-compaction and manual /compact paths. When the kept-recent slice already exceeds the budget, snapcompact is skipped outright instead of running just to be rejected; the projection guard remains as a defensive check.

Fixes #3247

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5cce507582

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-06-22T10:09:37Z

+		// past the projection check below.
+		const SUMMARY_TEXT_RESERVE = 4000;
+		const frameBudget = ctxWindow - reserve - nonFrameTokens - SUMMARY_TEXT_RESERVE;
+		if (frameBudget < snapcompact.FRAME_TOKEN_ESTIMATE) return 0;


Allow zero-frame snapcompact archives when they fit

When the kept-recent slice leaves less than one FRAME_TOKEN_ESTIMATE of headroom but still enough room for a small text archive, this returns 0 before calling snapcompact. snapcompact.compact() can produce a valid text-only archive (frames: []) when the discarded history fits in its text edge regions, so /compact snapcompact and auto snapcompact now unnecessarily fall back to the LLM summarizer in that near-full-context case, which can fail for offline/no-compaction-credential sessions even though local snapcompact would have fit and the existing projection guard would have accepted it.

Useful? React with 👍 / 👎.

chatgpt-codex review on #3249: the helper returned 0 when frameBudget < FRAME_TOKEN_ESTIMATE, causing the caller to skip snapcompact entirely. But snapcompact.planArchive has a 'text.length <= 2 * edgeCap' short-circuit that produces a valid frames:[] archive when the discarded history is small enough, and the projection charges 0 for that. Hard return-0 blocked that opportunity, forcing the LLM summarizer fallback in offline/no-credential sessions where the text-only path would have landed cleanly.

#computeSnapcompactMaxFrames now distinguishes two near-full cases:
- frameBudget < 0 → return 0 (kept-recent already exhausted budget; no text-only summary can fit either) → caller still skips outright.
- 0 ≤ frameBudget < FRAME_TOKEN_ESTIMATE → return 1 → snapcompact runs and picks the frame-less planArchive branch automatically for small discarded histories; the projection guard rejects any actual frame-bearing archive that overflows.

Added regression test pinning maxFrames=1 (not 0) in the near-full window case.

roboomp · 2026-06-22T10:16:54Z

Addressed in 65f945f.

#computeSnapcompactMaxFrames now distinguishes frameBudget < 0 (truly impossible — kept-recent + non-message + summary reserve > budget → return 0 → caller skips) from 0 ≤ frameBudget < FRAME_TOKEN_ESTIMATE (frame won't fit but text-only might → return 1 → snapcompact runs and planArchive's text.length <= 2 * edgeCap short-circuit can still produce a valid frames: [] archive that the projection accepts).
Caller skip path is unchanged, just triggers less often. Projection guard still rejects any actual frame-bearing archive that overflows, so offline sessions can land snapcompact text-only instead of being forced into the LLM summarizer.
The regression test "still invokes snapcompact with maxFrames=1 when the budget can only fit a text-only archive" pins the new contract (kept-recent is tuned so frameBudget lands in [0, FRAME_TOKEN_ESTIMATE), and snapcompact is still invoked with maxFrames = 1).

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 65f945f1b7


chatgpt-codex-connector · 2026-06-22T10:20:19Z

+		const SUMMARY_TEXT_RESERVE = 4000;
+		const frameBudget = ctxWindow - reserve - nonFrameTokens - SUMMARY_TEXT_RESERVE;
+		if (frameBudget < 0) return 0;


Allow text-only snapcompact below the fixed reserve

When kept-recent plus non-message tokens still leaves some real headroom but less than this fixed 4k reserve, frameBudget becomes negative and the callers skip snapcompact entirely. Fresh evidence in this revision is the new SUMMARY_TEXT_RESERVE subtraction before the return 0: a small text-only snapcompact archive can project under the window with far less than 4000 tokens, but this path now forces the LLM summarizer/offline failure instead of attempting the local zero-frame result.

Useful? React with 👍 / 👎.

chatgpt-codex second-pass review on #3249: the previous helper folded the 4k SUMMARY_TEXT_RESERVE into both the maxFrames cap math AND the skip decision (return 0 when frameBudget < 0). That made any residual headroom below 4k fall negative and force the LLM-summarizer fallback, even though a text-only snapcompact archive (the 'text.length <= 2 * edgeCap' short-circuit in planArchive) typically costs only a few hundred tokens of summary lead-in and would have fit cleanly.

The two reserves now serve their own jobs:
- Skip iff 'baseTokens >= totalBudget' (kept-recent + non-message already eats the entire window − reserve envelope). No reserve fudge here; positive residual is always worth attempting.
- Cap reserve (4k) is applied ONLY to the maxFrames calculation so the projection still passes once frames land.

When the frame budget goes negative under that reserve but residual headroom is positive, the helper now returns maxFrames=1 instead of 0 so snapcompact's frame-less planArchive branch can still produce a valid archive.

Updated regression test to pin the new contract directly: kept-recent is tuned for 1500 tokens of headroom (well below the 4k cap reserve); the old helper returned 0 and skipped to the LLM summarizer, while the new helper invokes snapcompact with maxFrames=1.

roboomp · 2026-06-22T10:25:02Z

Addressed in db57efc.

Skip decision now uses raw baseTokens >= totalBudget (no reserve fudge): snapcompact is skipped only when kept-recent + non-message already exhausts the ctxWindow − reserve envelope.
4k SUMMARY_TEXT_RESERVE is applied only to the maxFrames cap calculation, never to the skip — so residual headroom of 1–4k tokens now lands snapcompact (returning maxFrames=1) instead of forcing the LLM summarizer.
Regression test retuned to headroomTokens = 1500 (the exact sub-reserve case the review flags) so the test fails against the previous helper revision (confirmed: prior code falls through to the real LLM call and 404s) and passes against the split-reserve fix.
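
The split between the skip decision and the cap reserve can be sketched as follows. The function name is hypothetical; the body mirrors the diff lines quoted in the next review plus the baseTokens >= totalBudget skip rule described above, and it takes precomputed totals rather than a live session.

```typescript
// Constants quoted in this PR thread.
const FRAME_TOKEN_ESTIMATE = 5024;
const MAX_FRAMES_DEFAULT = 80;
const SUMMARY_TEXT_RESERVE = 4000;

// Hypothetical standalone version of the split-reserve logic.
// totalBudget = contextWindow - reserve; baseTokens = kept-recent + non-message.
function sketchSplitReserveMaxFrames(totalBudget: number, baseTokens: number): number {
  // Skip decision: uses the raw budget only, with no reserve subtracted.
  if (baseTokens >= totalBudget) return 0;
  // Cap decision: the 4k reserve applies here and nowhere else.
  const frameBudget = totalBudget - baseTokens - SUMMARY_TEXT_RESERVE;
  // Not enough room for a full frame: still let snapcompact try a text-only archive.
  if (frameBudget < FRAME_TOKEN_ESTIMATE) return 1;
  return Math.min(Math.floor(frameBudget / FRAME_TOKEN_ESTIMATE), MAX_FRAMES_DEFAULT);
}

console.log(sketchSplitReserveMaxFrames(170_000, 168_500)); // 1500 tokens of headroom -> 1
console.log(sketchSplitReserveMaxFrames(170_000, 170_000)); // no headroom -> 0 (skip)
console.log(sketchSplitReserveMaxFrames(170_000, 50_000));  // 116000 / 5024 -> 23
```

The 1500-token case matches the headroom the retuned regression test uses: the earlier revision would have returned 0 there, while this shape returns 1.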

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: db57efc3d3


chatgpt-codex-connector · 2026-06-22T10:28:06Z

+		const SUMMARY_TEXT_RESERVE = 4000;
+		const frameBudget = totalBudget - baseTokens - SUMMARY_TEXT_RESERVE;
+		if (frameBudget < snapcompact.FRAME_TOKEN_ESTIMATE) return 1;
+		return Math.min(Math.floor(frameBudget / snapcompact.FRAME_TOKEN_ESTIMATE), snapcompact.MAX_FRAMES_DEFAULT);


Reserve actual text-edge cost before capping frames

When the residual budget is large enough to permit frames, this fixed 4k reserve can still choose a maxFrames value that the projection immediately rejects. For example, with ~120k tokens of headroom on the default Anthropic/Sonnet shape, this returns 23 frames, but any archive large enough to use frames also carries both textHead and textTail; those two edge regions are about 27.8k bytes, which estimateTokens counts as roughly 7k tokens before the snapcompact summary/delimiters. The resulting projection is base + 23 * 5024 + >7k, which exceeds the same headroom and falls back to the LLM summary/repeats the warning loop this cap is intended to prevent. Size the cap using the actual summary/text-edge estimate, or decrement until #projectSnapcompactContextTokens would fit.

Useful? React with 👍 / 👎.

…cost

chatgpt-codex third-pass review on #3249: the 4k SUMMARY_TEXT_RESERVE in the cap math undersized the actual textHead+textTail cost a frame-bearing archive carries (the projection separately bills 'countTokens(summary + textHead + textTail)'). At ~120k headroom on Anthropic 11on16-bw, the cap picked maxFrames=23, but '23 * 5024 + 2 * 13916 chars (≈7k tokens) + 2k summary template ≈ 124.5k' still exceeded the same 120k headroom: the cap chose a value the projection then immediately rejected, re-opening the warning loop.

#computeSnapcompactMaxFrames now resolves the live snapcompact shape (same call the auto/manual paths pass to snapcompact.compact) and sizes the cap reserve from 'geometry(shape).capacity':

    textEdgeTokens = ceil(2 * capacity * 1.15 / 4)   // 1.15 absorbs tokenizer drift
    capReserve     = textEdgeTokens + 2000           // + summary template

For the default per-provider winners that resolves to ~10k (Anthropic Sonnet), ~14k (Opus 4.7), ~16k (Gemini 2.x), and ~10k (OpenAI), all larger than the prior fixed 4k. Skip decision stays separate (baseTokens >= totalBudget), so positive sub-reserve headroom still runs snapcompact's text-only path.

Test 1 retuned to baseline kept-recent ≈ 100k tokens with a strengthened assertion verifying the FULL projection invariant (frames + worst-case text edges + summary template + base ≤ budget). Confirmed test fails against the previous 4k-reserve helper by exactly the reviewer's predicted margin (174,271 vs 170,000 budget = 4,271 token overshoot).

roboomp · 2026-06-22T10:38:59Z

Addressed in 2329944.

Cap reserve is now sized from the live snapcompact shape's geometry(shape).capacity — textEdgeTokens = ⌈2 × capacity × 1.15 / 4⌉ plus a 2k summary-template allowance — mirroring what #projectSnapcompactContextTokens will actually charge once frames land (≈10k Anthropic Sonnet / OpenAI, ≈14k Opus 4.7, ≈16k Gemini, all up from the prior 4k).
Resolved via the same snapcompact.resolveShape(model, settings) the auto/manual paths pass to snapcompact.compact(), so the cap and the projection agree on the shape by construction.
Skip decision stays decoupled (baseTokens >= totalBudget), so positive sub-reserve headroom still runs the text-only path.
Regression test 1 retuned: kept-recent ≈ 100k tokens, assertion strengthened to verify the FULL projection (base + frames × FRAME_TOKEN_ESTIMATE + worst-case text edges + summary template ≤ budget). Confirmed failing against the prior 4k helper by exactly the reviewer's predicted margin (174,271 vs 170,000 budget = 4,271-token overshoot).
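
A sketch of the shape-derived reserve, under stated assumptions: the function names are hypothetical, the capacity of 13916 characters is inferred from the "2 * 13916 chars" figure in the commit message for the Anthropic shape, and the handling of sub-frame headroom is assumed to be unchanged from the previous revision.

```typescript
// Constants quoted in this PR thread.
const FRAME_TOKEN_ESTIMATE = 5024;
const MAX_FRAMES_DEFAULT = 80;
const SUMMARY_TEMPLATE_TOKENS = 2000;

// Reserve for both text edges (head + tail) at ~4 chars per token,
// padded by 15% for tokenizer drift, plus the summary template allowance.
function capReserveTokens(capacityChars: number): number {
  const textEdgeTokens = Math.ceil((2 * capacityChars * 1.15) / 4);
  return textEdgeTokens + SUMMARY_TEMPLATE_TOKENS;
}

// Hypothetical standalone cap: same skip rule as before, shape-aware reserve.
function sketchShapeAwareMaxFrames(totalBudget: number, baseTokens: number, capacityChars: number): number {
  if (baseTokens >= totalBudget) return 0;
  const frameBudget = totalBudget - baseTokens - capReserveTokens(capacityChars);
  if (frameBudget < FRAME_TOKEN_ESTIMATE) return 1;
  return Math.min(Math.floor(frameBudget / FRAME_TOKEN_ESTIMATE), MAX_FRAMES_DEFAULT);
}

const capacity = 13_916; // assumed capacity for the default Anthropic shape
const reserve = capReserveTokens(capacity); // 8002 + 2000 = 10002
const frames = sketchShapeAwareMaxFrames(170_000, 50_000, capacity); // 21, down from 23
const worstCase = 50_000 + frames * FRAME_TOKEN_ESTIMATE + reserve; // 165506
console.log({ reserve, frames, worstCase, fits: worstCase <= 170_000 });
```

With the reviewer's ~120k headroom example, the larger reserve drops the cap from 23 to 21 frames, which leaves room for the text edges and summary template inside the 170k budget.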

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 232994496d


chatgpt-codex-connector · 2026-06-22T10:43:38Z

+			}
+		});
+
+		await expect(session.compact(undefined, { mode: "snapcompact" })).rejects.toThrow();


Stub the LLM fallback in the skip test

When this assertion exercises the maxFrames < 1 path, AgentSession.compact() does not stop after skipping snapcompact; it falls through to the normal LLM summarizer. Because the test installs a fake Anthropic runtime key in beforeEach and only spies on snapcompact.compact, this line can drive real provider compaction attempts with test-key before eventually rejecting, making the suite depend on network/auth behavior and potentially slow or flaky in CI. Stub the agent-core compaction path/model resolver here, or assert the skip through a seam that does not enter provider code.

Useful? React with 👍 / 👎.

roboomp mentioned this pull request Jun 22, 2026

[BUG] auto snapcompact long time run fail #3247

Open

github-actions Bot added the "vouched" label (Passed the vouch gate) Jun 22, 2026

roboomp wants to merge 4 commits into main from farm/6396f3c3/fix-snapcompact-frame-budget-overflow
