Python: surface Bedrock cache token counts in usage details #6640

Open
he-yufeng wants to merge 5 commits into microsoft:main from he-yufeng:fix/bedrock-usage-cache-tokens
Conversation

@he-yufeng

Contributor

Motivation & Context

The Bedrock chat client only surfaces input, output, and total token counts in usage_details. When prompt caching is active, the Bedrock Converse API also reports cacheReadInputTokens (input tokens served from a cache) and cacheWriteInputTokens (input tokens written to a cache), and _parse_usage drops both. So cache usage silently reads as zero for cached prompts, which throws off cost and token accounting.

UsageDetails already defines canonical fields for these (cache_read_input_token_count, cache_creation_input_token_count), and the OpenAI and Anthropic connectors already populate them, so Bedrock was the odd one out.

Description & Review Guide

  • What are the major changes? BedrockChatClient._parse_usage now maps cacheReadInputTokens to cache_read_input_token_count and cacheWriteInputTokens to cache_creation_input_token_count, following the same is not None guard pattern as the existing three fields.
  • What is the impact of these changes? Cache-read and cache-write token counts are now reported for Bedrock, consistent with the OpenAI and Anthropic connectors. Responses without prompt caching are unchanged (the fields stay unset).
  • What do you want reviewers to focus on? That the Converse field names (cacheReadInputTokens / cacheWriteInputTokens) are mapped to the right UsageDetails keys.

Added test_parse_usage_surfaces_cache_tokens and ran the Bedrock test suite locally (8 passed). This mirrors #6638, which did the same for the Gemini connector.
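
The mapping described above can be sketched as a standalone function. This is a minimal illustration, not the actual `BedrockChatClient._parse_usage` implementation: the function name `parse_usage`, the table-driven loop, and the three base keys (`input_token_count`, `output_token_count`, `total_token_count`) are assumptions made for the example. The Converse field names and the two cache keys are the ones named in this PR.

```python
from __future__ import annotations

from typing import Any

# (Converse usage field, canonical usage-details key) pairs.
# The first three target keys are assumed for illustration; the two
# cache mappings are the ones this PR describes.
_FIELD_MAP: tuple[tuple[str, str], ...] = (
    ("inputTokens", "input_token_count"),
    ("outputTokens", "output_token_count"),
    ("totalTokens", "total_token_count"),
    ("cacheReadInputTokens", "cache_read_input_token_count"),
    ("cacheWriteInputTokens", "cache_creation_input_token_count"),
)


def parse_usage(usage: dict[str, Any] | None) -> dict[str, int]:
    """Hypothetical stand-in for the connector's usage parser."""
    details: dict[str, int] = {}
    if not usage:
        return details
    for source_key, target_key in _FIELD_MAP:
        # `is not None` keeps an explicit 0 while skipping absent fields.
        if (value := usage.get(source_key)) is not None:
            details[target_key] = value
    return details


if __name__ == "__main__":
    # Response with prompt caching active: both cache fields are surfaced.
    print(parse_usage({
        "inputTokens": 10,
        "outputTokens": 5,
        "totalTokens": 15,
        "cacheReadInputTokens": 7,
        "cacheWriteInputTokens": 3,
    }))
    # Response without prompt caching: the cache keys stay unset.
    print(parse_usage({"inputTokens": 10, "outputTokens": 5, "totalTokens": 15}))
```

The PR itself uses one `if (... := usage.get(...)) is not None` statement per field; the loop here is only a compact way to show the same guard. The guard matters because a truthiness check would drop a legitimate count of 0.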

Related Issue

Fixes #6639

Contribution Checklist

  • The code builds clean without any errors or warnings
  • All unit tests pass, and I have added new tests where possible
  • The PR follows the Contribution Guidelines
  • This PR is linked to an issue and there is no other open PR for this issue.
  • This is not a breaking change.

Copilot AI review requested due to automatic review settings June 20, 2026 02:59
@moonbox3 moonbox3 added the python Issues related to the Python codebase label Jun 20, 2026

Copilot AI left a comment

Contributor

Pull request overview

Surfaces Bedrock Converse prompt-cache token counts in the framework’s canonical usage_details so cached prompts don’t report zero cache usage, aligning Bedrock behavior with other connectors that already populate these fields.

Changes:

  • Map Bedrock Converse cacheReadInputTokens → cache_read_input_token_count and cacheWriteInputTokens → cache_creation_input_token_count in BedrockChatClient._parse_usage.
  • Add a Bedrock unit test asserting the cache token counts are surfaced.
  • Also includes Gemini usage parsing + tests for cached/thinking token counts (note: this expands scope beyond the PR title/description).

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File | Description
python/packages/bedrock/agent_framework_bedrock/_chat_client.py | Map Bedrock Converse cache token fields into canonical UsageDetails keys.
python/packages/bedrock/tests/test_bedrock_client.py | Add unit test covering Bedrock cache token parsing.
python/packages/gemini/agent_framework_gemini/_chat_client.py | Add parsing of Gemini cached/thinking token counts into canonical usage fields.
python/packages/gemini/tests/test_gemini_client.py | Extend test helpers and add a test for Gemini cached/reasoning usage fields.

    details["cache_read_input_token_count"] = cache_read
if (cache_write := usage.get("cacheWriteInputTokens")) is not None:
    details["cache_creation_input_token_count"] = cache_write
return details
Comment on lines +1054 to +1057
if (v := usage.cached_content_token_count) is not None:
    details["cache_read_input_token_count"] = v
if (v := usage.thoughts_token_count) is not None:
    details["reasoning_output_token_count"] = v

@eavanvalkenburg eavanvalkenburg left a comment

Member

Please fix the comment from Copilot.

@github-actions

Contributor

Python Test Coverage

Python Test Coverage Report

File | Stmts | Miss | Cover | Missing
packages/bedrock/agent_framework_bedrock/_chat_client.py | 449 | 98 | 78% | 304–305, 321–330, 336, 404, 413, 424, 426, 428, 433, 452–453, 477, 490, 502, 505, 513–514, 517–518, 520–521, 526–528, 530, 540–541, 563, 570, 579–580, 582–583, 585–587, 589, 591–592, 598–600, 603–604, 610–613, 619–629, 632, 651, 656, 706–707, 720, 746, 758, 763, 791, 795–796, 799, 817, 841, 853, 857, 871, 879–880, 884, 886–893
packages/gemini/agent_framework_gemini/_chat_client.py | 380 | 3 | 99% | 398, 821, 832
TOTAL | 40636 | 4626 | 88% |

Python Unit Test Overview

Tests | Skipped | Failures | Errors | Time
8098 | 34 | 0 | 0 | 1m 59s

Development

Successfully merging this pull request may close these issues.

Python: Bedrock connector drops cache token counts from usage details
