Python: surface Bedrock cache token counts in usage details #6640
Open
he-yufeng wants to merge 5 commits into
Pull request overview
Surfaces Bedrock Converse prompt-cache token counts in the framework’s canonical usage_details so cached prompts don’t report zero cache usage, aligning Bedrock behavior with other connectors that already populate these fields.
Changes:
- Map Bedrock Converse cacheReadInputTokens → cache_read_input_token_count and cacheWriteInputTokens → cache_creation_input_token_count in BedrockChatClient._parse_usage.
- Add a Bedrock unit test asserting the cache token counts are surfaced.
- Also includes Gemini usage parsing + tests for cached/thinking token counts (note: this expands scope beyond the PR title/description).
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| python/packages/bedrock/agent_framework_bedrock/_chat_client.py | Map Bedrock Converse cache token fields into canonical UsageDetails keys. |
| python/packages/bedrock/tests/test_bedrock_client.py | Add unit test covering Bedrock cache token parsing. |
| python/packages/gemini/agent_framework_gemini/_chat_client.py | Add parsing of Gemini cached/thinking token counts into canonical usage fields. |
| python/packages/gemini/tests/test_gemini_client.py | Extend test helpers and add a test for Gemini cached/reasoning usage fields. |
        details["cache_read_input_token_count"] = cache_read
    if (cache_write := usage.get("cacheWriteInputTokens")) is not None:
        details["cache_creation_input_token_count"] = cache_write
    return details
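The mapping this hunk belongs to can be sketched as a standalone function. This is a minimal sketch under stated assumptions, not the actual `BedrockChatClient._parse_usage`: the function name `parse_bedrock_usage`, the plain `dict` return type, the table-driven loop, and the canonical names used for the three pre-existing counts (`input_token_count`, `output_token_count`, `total_token_count`) are illustrative choices. Only the two cache keys and the `is not None` guard are taken from the PR.

```python
from __future__ import annotations

from typing import Any, Mapping

# Bedrock Converse usage key -> canonical usage-details key.
# The last two pairs are the ones this PR adds. The first three canonical
# names are ASSUMED here for illustration.
_BEDROCK_USAGE_KEYS = {
    "inputTokens": "input_token_count",
    "outputTokens": "output_token_count",
    "totalTokens": "total_token_count",
    "cacheReadInputTokens": "cache_read_input_token_count",
    "cacheWriteInputTokens": "cache_creation_input_token_count",
}


def parse_bedrock_usage(usage: Mapping[str, Any] | None) -> dict[str, int]:
    """Hypothetical stand-in for the connector's usage parser."""
    details: dict[str, int] = {}
    if not usage:
        return details
    for source_key, canonical_key in _BEDROCK_USAGE_KEYS.items():
        # `is not None` (rather than truthiness) keeps an explicit 0,
        # so "reported as zero" stays distinguishable from "not reported".
        if (value := usage.get(source_key)) is not None:
            details[canonical_key] = value
    return details


if __name__ == "__main__":
    sample = {
        "inputTokens": 10,
        "outputTokens": 5,
        "totalTokens": 15,
        "cacheReadInputTokens": 128,
        "cacheWriteInputTokens": 0,
    }
    print(parse_bedrock_usage(sample))
```

A response without the cache fields yields no cache keys at all, while a response that reports `0` keeps the key with value `0`, which is the distinction the guard preserves.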
Comment on lines +1054 to +1057
    if (v := usage.cached_content_token_count) is not None:
        details["cache_read_input_token_count"] = v
    if (v := usage.thoughts_token_count) is not None:
        details["reasoning_output_token_count"] = v
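To see how the Gemini lines above behave in isolation, here is a small sketch that runs the same guards against a stand-in object. The attribute names and canonical keys come from the hunk. Everything else is an assumption for illustration: the function name `parse_gemini_extra_usage`, the use of `types.SimpleNamespace` in place of the SDK's usage metadata object, and the `getattr(..., None)` lookup, which is a defensive variation and not what the hunk itself does.

```python
from __future__ import annotations

from types import SimpleNamespace
from typing import Any

# Usage-metadata attribute -> canonical usage-details key (both from the hunk).
_GEMINI_USAGE_ATTRS = {
    "cached_content_token_count": "cache_read_input_token_count",
    "thoughts_token_count": "reasoning_output_token_count",
}


def parse_gemini_extra_usage(usage: Any) -> dict[str, int]:
    """Hypothetical helper: copy cached/thinking counts when present."""
    details: dict[str, int] = {}
    for attr, canonical_key in _GEMINI_USAGE_ATTRS.items():
        # getattr with a default tolerates objects that lack the attribute
        # entirely (an assumption; the hunk accesses the attributes directly).
        if (value := getattr(usage, attr, None)) is not None:
            details[canonical_key] = value
    return details


if __name__ == "__main__":
    # Stand-in for a response whose prompt hit the cache but used no thinking.
    usage = SimpleNamespace(cached_content_token_count=64, thoughts_token_count=None)
    print(parse_gemini_extra_usage(usage))
```

With the stand-in above, only the cache-read key is emitted, since `thoughts_token_count` is `None` and is skipped by the guard.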
eavanvalkenburg (Member) approved these changes on Jun 22, 2026 and left a comment:
Please fix the comment from copilot
Motivation & Context
The Bedrock chat client only surfaces input, output, and total token counts in usage_details. When prompt caching is active, the Bedrock Converse API also reports cacheReadInputTokens (input tokens served from a cache) and cacheWriteInputTokens (input tokens written to a cache), and _parse_usage drops both. As a result, cache usage silently reads as zero for cached prompts, which throws off cost and token accounting. UsageDetails already defines canonical fields for these (cache_read_input_token_count, cache_creation_input_token_count), and the OpenAI and Anthropic connectors already populate them, so Bedrock was the odd one out.
Description & Review Guide
BedrockChatClient._parse_usage now maps cacheReadInputTokens to cache_read_input_token_count and cacheWriteInputTokens to cache_creation_input_token_count, following the same "is not None" guard pattern as the existing three fields. The Bedrock fields (cacheReadInputTokens/cacheWriteInputTokens) are mapped to the right UsageDetails keys.
Added test_parse_usage_surfaces_cache_tokens and ran the Bedrock test suite locally (8 passed). This mirrors #6638, which did the same for the Gemini connector.
Related Issue
Fixes #6639
Contribution Checklist