fix(catalog): restore Umans GLM-5.2 max reasoning #3193
Mapped Umans GLM-5.2's upstream max reasoning level to the internal xhigh effort and preserved the max wire value in dynamic discovery and bundled catalog metadata. Added resolver coverage for the high/max ladder and verified the xhigh request maps back to max. Fixes #3192
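As a rough illustration of the mapping described above, the sketch below turns an upstream level list into internal efforts and remembers the `max` wire value. The `Effort` union, `EFFORT_BY_LEVEL`, `mapLevels`, and `ThinkingSketch` are hypothetical stand-ins, not the catalog's actual definitions, and `defaultLevel` is left out for brevity.

```typescript
// Hypothetical stand-in for the catalog's effort values.
type Effort = "low" | "medium" | "high" | "xhigh";

interface ThinkingSketch {
	mode: "budget";
	efforts: Effort[];
	effortMap?: Partial<Record<Effort, string>>;
}

// Assumed upstream-level table: "max" shares the internal xhigh tier.
const EFFORT_BY_LEVEL = new Map<string, Effort>([
	["low", "low"],
	["medium", "medium"],
	["high", "high"],
	["xhigh", "xhigh"],
	["max", "xhigh"],
]);

function mapLevels(levels: string[]): ThinkingSketch {
	const efforts: Effort[] = [];
	for (const level of levels) {
		const effort = EFFORT_BY_LEVEL.get(level);
		// Unrecognized levels (including "none") are skipped; duplicates collapse.
		if (effort !== undefined && !efforts.includes(effort)) efforts.push(effort);
	}
	const thinking: ThinkingSketch = { mode: "budget", efforts };
	// Keep the wire value only when upstream literally advertised "max".
	if (levels.includes("max") && efforts.includes("xhigh")) {
		thinking.effortMap = { xhigh: "max" };
	}
	return thinking;
}

const discovered = mapLevels(["none", "high", "max"]);
console.log(JSON.stringify(discovered));
```

With a level list that lacks `max`, the sketch leaves `effortMap` unset, so an upstream `xhigh` would go out under its own name.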
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 3a888c0f59
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
const efforts = mapUmansReasoningEfforts(value);
const thinking: ThinkingConfig = { mode: "budget", efforts };
if (umansHasMaxReasoningLevel(value) && efforts.includes(Effort.XHigh)) {
	thinking.effortMap = UMANS_MAX_REASONING_EFFORT_MAP;
Emit the Umans max tier on requests
When Umans discovery reports levels: ["none", "high", "max"], this stores the xhigh -> max map on a mode: "budget" model, but Anthropic request shaping only turns effortMap into output_config.effort for anthropic-adaptive or anthropic-budget-effort modes (streamAnthropic otherwise just sends thinking.budget_tokens). As a result, selecting xhigh for dynamically discovered Umans GLM-5.2 still never sends the provider wire value "max"; the same issue applies to the bundled entry generated from inferDetectedEffortMap.
Removed the xhigh -> max effortMap baked onto the Umans GLM-5.2 spec; mode="budget" routes thinking depth via thinking.budget_tokens and never consults effortMap, so the wire shape is unchanged. The picker still surfaces high and xhigh via getModelDefinedEfforts, and the dynamic discovery still maps the upstream max level to Effort.XHigh. Refs #3192
Fair catch.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 1e01d57d09
medium: Effort.Medium,
high: Effort.High,
xhigh: Effort.XHigh,
max: Effort.XHigh,
Emit Umans max effort on the wire
When Umans reports levels: ["none", "high", "max"], this maps max to the UI's xhigh, but the resulting model still uses thinking.mode: "budget" and the new test asserts there is no effortMap. Fresh evidence in this revision is that xhigh is now surfaced while effortMap is deliberately absent; for anthropic-messages budget mode, mapOptionsForApi only turns xhigh into ANTHROPIC_THINKING.xhigh budget tokens, and buildParams only emits output_config.effort for anthropic-adaptive or anthropic-budget-effort, so selecting the newly visible tier still never sends Umans the advertised wire level "max".
Forward anthropic-budget-effort selections through mapOptionsForApi so buildParams serializes output_config.effort with budget-token thinking. Mark Umans GLM-5.2 as budget-effort and map the UI xhigh tier back to Umans's max wire value. Refs #3192
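The difference between the two modes discussed in this thread can be sketched as follows. This is a simplified illustration, not the repository's `buildParams`: `buildParamsSketch`, `ThinkingSpec`, and the `BUDGET_TOKENS` values are hypothetical, and only the two modes mentioned here are modeled.

```typescript
type Effort = "low" | "medium" | "high" | "xhigh";
type ThinkingMode = "budget" | "anthropic-budget-effort";

interface ThinkingSpec {
	mode: ThinkingMode;
	effortMap?: Partial<Record<Effort, string>>;
}

interface RequestParams {
	thinking: { type: "enabled"; budget_tokens: number };
	output_config?: { effort: string };
}

// Placeholder budgets for illustration; not the real ANTHROPIC_THINKING table.
const BUDGET_TOKENS: Record<Effort, number> = {
	low: 1024,
	medium: 4096,
	high: 16384,
	xhigh: 32768,
};

function buildParamsSketch(spec: ThinkingSpec, effort: Effort): RequestParams {
	// Both modes send a token budget for thinking depth.
	const params: RequestParams = {
		thinking: { type: "enabled", budget_tokens: BUDGET_TOKENS[effort] },
	};
	// Only budget-effort also serializes the effort, translated through
	// effortMap when the provider uses a different wire name.
	if (spec.mode === "anthropic-budget-effort") {
		params.output_config = { effort: spec.effortMap?.[effort] ?? effort };
	}
	return params;
}

const umans: ThinkingSpec = { mode: "anthropic-budget-effort", effortMap: { xhigh: "max" } };
console.log(JSON.stringify(buildParamsSketch(umans, "xhigh")));
console.log(JSON.stringify(buildParamsSketch({ mode: "budget" }, "xhigh")));
```

Under these assumptions, the plain `budget` spec never produces `output_config`, which matches the reviewer's point that an `effortMap` on a `budget` model has no effect on the wire.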
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e277f3a310
### Fixed

- Fixed Anthropic-compatible budget-effort models dropping the selected effort before request serialization, so `output_config.effort` is emitted alongside `thinking.budget_tokens` when model metadata declares `mode: "anthropic-budget-effort"`.
Move AI changelog entry to Unreleased
/workspace/oh-my-pi/AGENTS.md says new changelog entries must go under ## [Unreleased] and released sections are immutable; this adds the new AI-package fix under the already-released 16.1.10 section. In the next release this note can be missed while also rewriting historical release notes, so please move it into an Unreleased ### Fixed section.
@@ -1,4 +1,5 @@
import { describe, expect, it } from "bun:test";
import { Effort } from "@oh-my-pi/pi-catalog";
Import Effort from its catalog submodule
/workspace/oh-my-pi/AGENTS.md requires catalog values to come from @oh-my-pi/pi-catalog/<module>; this root-barrel import evaluates src/index.ts and pulls unrelated catalog modules just to access Effort. In package-local tests that adds avoidable dependencies/cycles before the Umans provider code runs, so use @oh-my-pi/pi-catalog/effort instead.
Updated the Umans provider regression test to import Effort from the catalog effort submodule instead of the root catalog barrel, keeping the package-local test on the narrow dependency path. Refs #3192
I really don't want to do this. Simple policy atm is xhigh == maximum.
Moved the AI package changelog entry for Anthropic budget-effort serialization into the Unreleased section so the already-released 16.1.10 notes stay immutable. Refs #3192
Repro

Feeding Umans discovery metadata for `umans-glm-5.2` with `levels: ["none","high","max"]` through `umansModelManagerOptions()` reproduced the regression: before the fix the model thinking metadata was `{"mode":"budget","efforts":["high"],"defaultLevel":"high"}`, so the picker could only show the surviving `high` tier plus global selectors.

Cause

`packages/catalog/src/provider-models/openai-compat.ts` mapped Umans reasoning levels through `UMANS_REASONING_EFFORT_BY_LEVEL`, which did not recognize the upstream `"max"` level, and Umans GLM-5.2 had no baked `thinking.effortMap` to send `Effort.XHigh` as the provider wire value `"max"`. `packages/catalog/src/model-thinking.ts` also re-baked bundled catalog thinking from model identity without the Umans GLM-5.2 high/max constraint, so regenerated fallback metadata lost the provider-specific wire map.

Fix

- Map `"max"` to `Effort.XHigh` and attach `{ xhigh: "max" }` when dynamic discovery reports a max level.
- Bundled catalog thinking for Umans GLM-5.2 exposes `high`/`xhigh` and uses the `xhigh -> max` wire mapping, keeping regenerated `models.json` consistent.

Verification

Ran:

- `bun --cwd=packages/catalog test test/umans-provider.test.ts`
- `bun --cwd=packages/catalog test test/model-thinking.test.ts test/generated-policies.test.ts`
- `bun --cwd=packages/catalog test test/umans-provider.test.ts test/model-thinking.test.ts test/generated-policies.test.ts && bun --cwd=packages/catalog run check:types`
- a one-off repro command verifying the fixed output is `{"thinking":{"mode":"budget","efforts":["high","xhigh"],"effortMap":{"xhigh":"max"},"defaultLevel":"high"},"xhighWire":"max"}`

Fixes #3192