feat: Cloudflare-native AI tracing (agents/observability + agents/observability/ai) by mattzcarey · Pull Request #1860 · cloudflare/agents

mattzcarey · 2026-07-02T14:14:20Z

Cloudflare-native tracing for AI agents: spans built on the Workers runtime's tracing API (cloudflare:workers), named and attributed to the OpenTelemetry GenAI semantic conventions, flowing to Workers Observability with zero dependencies. No OTel SDK, no exporter, no collector — traces show up in the dash next to your fetch/DO/KV spans.

Instrumenting the AI SDK

Enable traces on the Worker:

// wrangler.jsonc
{
  "observability": { "traces": { "enabled": true } }
}

AI SDK v6 — wrap the namespace once, use it as normal:

import * as ai from "ai";
import { wrapAISDK } from "agents/observability/ai";

const { generateText, streamText } = wrapAISDK(ai);

await streamText({
  model,
  prompt: "book a table for two",
  tools: { searchRestaurants, reserve },
  experimental_telemetry: {
    functionId: "booking-agent", // becomes gen_ai.agent.name
    metadata: { conversationId: "conv-42" }
  }
});

Every call produces one semconv-shaped trace:

invoke_agent booking-agent            gen_ai.operation.name=invoke_agent, tokens, finish reason
├── chat gpt-4o                       per doGenerate/doStream: model, params, usage, time_to_first_chunk
├── execute_tool searchRestaurants    gen_ai.tool.name, gen_ai.tool.call.id, real duration
└── chat gpt-4o

AI SDK v7 — register the telemetry lifecycle adapter instead of wrapping:

import { registerTelemetry } from "ai";
import { createAISDKTelemetry } from "agents/observability/ai";

registerTelemetry(createAISDKTelemetry());

Same spans, driven by the SDK's telemetry callbacks, correlated by cloudflare.agents.call.id / gen_ai.tool.call.id.

Stream spans stay open until the stream is consumed, cancelled, errors, or is returned early — an aborted stream closes as canceled, not as a false success. Streaming tools (async-generator execute) keep their span open until the iterable is drained (bodies run inside the tool span's async context; early termination still runs the generator's own cleanup), so tool durations are real. Untraced invocations take a pristine fast path: the original operation gets the original params — no tool wrapping, no model middleware, no stream patching.
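The streaming-tool span lifetime described above can be sketched as follows. This is a minimal illustration, not the wrapper shipped in this PR: `SketchSpan`, `holdSpanOpen`, and the three outcome labels are hypothetical names. It also leaves out re-entering the tool span's async context on each pull, which the real wrapper does.

```typescript
// Hypothetical minimal span shape; the real AgentSpan type is not reproduced here.
interface SketchSpan {
  end(outcome: "ok" | "canceled" | "error"): void;
}

// Wraps an async iterable so the span ends only when iteration finishes,
// the consumer stops early, or the producer throws.
async function* holdSpanOpen<T>(
  source: AsyncIterable<T>,
  span: SketchSpan
): AsyncGenerator<T> {
  // Assumption: an early consumer exit is reported as "canceled".
  let outcome: "ok" | "canceled" | "error" = "canceled";
  try {
    // for-await forwards return() to the source on early exit,
    // so the source generator's own finally blocks still run.
    for await (const chunk of source) {
      yield chunk;
    }
    outcome = "ok";
  } catch (err) {
    outcome = "error";
    throw err;
  } finally {
    // Runs after the source has been closed, so the duration covers the whole drain.
    span.end(outcome);
  }
}
```

Because the span ends in the `finally` block, a tool that yields several chunks reports the time until its last chunk (or until the consumer walks away) rather than the time to create the generator.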

Think: traced out of the box

Think agents emit this exact trace tree per turn with zero configuration and zero new API surface. The turn's streamText call is the invoke_agent root span — named after the agent class, carrying agent/conversation identity plus turn attributes (cloudflare.agents.turn.request_id, .trigger, .admission, .channel, .continuation, .generation) — with inference and tool calls as its only children. No opt-in flag, no setup, nothing exported; on runtimes without the tracing API the tracer is a no-op.

invoke_agent SupportAgent              turn identity + request params + aggregated usage
├── chat gpt-4o                        per inference step
├── execute_tool lookupOrder           gen_ai.tool.call.id, real duration
└── chat gpt-4o

How it works internally: Think merges its identity and current-turn metadata into experimental_telemetry.metadata at the call site (caller-provided metadata wins, and still flows to the AI SDK's own telemetry when enabled), and the wrapper projects those onto root-span attributes — reserved keys to cloudflare.agents.turn.*, userId to semconv user.id, any other scalar to cloudflare.agents.metadata.{key}. Drain loops also finalize the underlying model stream on early exit (in-stream error, stall abort, user abort) so operation spans close instead of leaking — the SDK tees its base stream, and an abandoned tee branch would otherwise leave the span open forever.
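A rough sketch of the merge and projection rules just described, under stated assumptions: `mergeTelemetryMetadata`, `projectMetadata`, and `TURN_KEYS` are hypothetical names, and the reserved metadata keys are assumed to share the suffixes of the `cloudflare.agents.turn.*` attributes (the PR text does not spell out the metadata key names). Identity keys that the wrapper consumes separately are not modeled.

```typescript
type Scalar = string | number | boolean;

// Assumption: reserved metadata keys mirror the turn attribute suffixes.
const TURN_KEYS = new Set([
  "request_id", "trigger", "admission", "channel", "continuation", "generation",
]);

function isScalar(value: unknown): value is Scalar {
  return (
    typeof value === "string" ||
    typeof value === "number" ||
    typeof value === "boolean"
  );
}

// Caller-provided metadata wins over the agent-injected values.
function mergeTelemetryMetadata(
  agentMetadata: Record<string, unknown>,
  callerMetadata: Record<string, unknown>
): Record<string, unknown> {
  return { ...agentMetadata, ...callerMetadata };
}

// Projects merged metadata onto root-span attributes.
function projectMetadata(
  metadata: Record<string, unknown>
): Record<string, Scalar> {
  const attributes: Record<string, Scalar> = {};
  for (const [key, value] of Object.entries(metadata)) {
    if (!isScalar(value)) continue; // objects and other non-scalars are dropped
    if (TURN_KEYS.has(key)) {
      attributes[`cloudflare.agents.turn.${key}`] = value;
    } else if (key === "userId") {
      attributes["user.id"] = value;
    } else {
      attributes[`cloudflare.agents.metadata.${key}`] = value;
    }
  }
  return attributes;
}
```

For example, `projectMetadata({ trigger: "chat", userId: "u1", plan: "pro", nested: { a: 1 } })` keeps the three scalars under their respective namespaces and silently drops `nested`.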

Schema

Span names follow the semconv formula with a bare-operation fallback past 64 UTF-8 bytes. Query on gen_ai.operation.name, never the span name.
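The naming rule can be sketched in a few lines. `spanName` and `MAX_SPAN_NAME_BYTES` are illustrative names, and treating a name of exactly 64 bytes as still within the limit is an assumption based on the word "past".

```typescript
const MAX_SPAN_NAME_BYTES = 64;

// Builds "{operation} {target}", falling back to the bare operation
// when the full name would exceed 64 UTF-8 bytes.
function spanName(operation: string, target?: string): string {
  if (!target) return operation;
  const full = `${operation} ${target}`;
  // Measure encoded bytes, not string length: non-ASCII names take more bytes.
  const byteLength = new TextEncoder().encode(full).length;
  return byteLength > MAX_SPAN_NAME_BYTES ? operation : full;
}
```

Since a long agent or tool name collapses to the bare operation, filtering by span name would miss those spans, which is why `gen_ai.operation.name` is the stable query key.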

| Span | `gen_ai.operation.name` | Carries |
| --- | --- | --- |
| `invoke_agent {agent}` | `invoke_agent` | agent/conversation identity, request params, aggregated usage (incl. cache + reasoning tokens), finish reasons |
| `chat {model}` | `chat` | per-model-call params, usage, `gen_ai.response.id/model`, `gen_ai.response.time_to_first_chunk` |
| `execute_tool {tool}` | `execute_tool` | `gen_ai.tool.name`, `gen_ai.tool.call.id`, real execution duration |

- Semconv keys (gen_ai.*) wherever a home exists; vendor extensions under cloudflare.agents.*. Never bare keys, never ai.* (that's the Vercel AI SDK's namespace; squatting it would fake compatibility we don't have).
- Failures: otel.status_code: "ERROR" + error.type (the spec-defined status encoding for status-less backends). Cancellations: cloudflare.agents.canceled: true, status untouched; aborts are not errors.
- Scalar-only, content-free: no prompts, messages, tool inputs/outputs, schemas, or raw error messages, ever. Semconv content capture is opt-in and stays permanently off here.
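As an illustration of the failure and cancellation encoding listed above, here is a small sketch. `Outcome` and `outcomeAttributes` are hypothetical names; deriving `error.type` from the error's class name, with the semconv `_OTHER` fallback, is an assumption about how a content-free value could be chosen, not a description of the shipped extractor.

```typescript
type Outcome =
  | { kind: "ok" }
  | { kind: "canceled" }
  | { kind: "error"; error: unknown };

// Maps a finished operation onto span attributes without recording messages.
function outcomeAttributes(outcome: Outcome): Record<string, string | boolean> {
  switch (outcome.kind) {
    case "ok":
      return {};
    case "canceled":
      // Aborts are not errors: status attributes are left untouched.
      return { "cloudflare.agents.canceled": true };
    case "error":
      return {
        "otel.status_code": "ERROR",
        // Only the error class name is kept; the raw message is never recorded.
        "error.type":
          outcome.error instanceof Error ? outcome.error.name : "_OTHER",
      };
  }
}
```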

Public surface (kept deliberately small)

- agents/observability: tracer + types AgentTracer / AgentSpan / TraceAttributes / TraceAttributeValue.
- agents/observability/ai: wrapAISDK, createAISDKTelemetry.

Everything else (span builders, attribute constants, the SpanRuntime seam) is private, so when the runtime gains native OTel support we can converge behind the facade without a breaking change. Names were chosen to avoid @opentelemetry/api collisions; our openSpan is not OTel's startSpan, which creates without activating.

Provenance

Ported from @msmps's feat/ai-tracing branch — commits carry Co-authored-by credit — then folded into the agents package and aligned with the GenAI semantic conventions.

Testing

54 observability tests (tracer, v6 wrapper, v7 adapter): span names + fallback, abort-chunk cancellation, streaming-tool span lifetime, tool call ids, time-to-first-chunk, metadata→attribute passthrough (reserved keys, user.id, scalar passthrough, object dropping, identity consumption)
agents workers project: 78 files / 1539 tests green
think: workers project 42 files / 894 tests green with instrumentation live, plus generated-entry/vite/cli/react projects green
npm run check green across all 114 projects

Co-authored-by: msmps <7691252+msmps@users.noreply.github.com>

- match workspace devDependency versions (sherif)
- oxfmt formatting, remove unused type imports (oxlint)
- bundler moduleResolution with extensionless relative imports
- build with tsdown like sibling packages (cloudflare:workers kept external)
- explicit types field for TS 6 (no automatic @types inclusion)
- start at version 0.0.0 with an initial-release changeset
- update pnpm lockfile

changeset-bot · 2026-07-02T14:14:29Z

🦋 Changeset detected

Latest commit: 2fcf743

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 2 packages

| Name | Type |
| --- | --- |
| agents | Minor |
| @cloudflare/think | Minor |


pkg-pr-new · 2026-07-02T14:25:26Z

agents

npm i https://pkg.pr.new/agents@1860

@cloudflare/ai-chat

npm i https://pkg.pr.new/@cloudflare/ai-chat@1860

@cloudflare/codemode

npm i https://pkg.pr.new/@cloudflare/codemode@1860

create-think

npm i https://pkg.pr.new/create-think@1860

hono-agents

npm i https://pkg.pr.new/hono-agents@1860

@cloudflare/shell

npm i https://pkg.pr.new/@cloudflare/shell@1860

@cloudflare/think

npm i https://pkg.pr.new/@cloudflare/think@1860

@cloudflare/voice

npm i https://pkg.pr.new/@cloudflare/voice@1860

@cloudflare/worker-bundler

npm i https://pkg.pr.new/@cloudflare/worker-bundler@1860

commit: 2fcf743

Move the ai-tracing package into the agents package: the tracer core (createTracer, the cloudflare:workers-bound tracer, span types) is exported from agents/observability and the AI SDK v6/v7 adapters from the new agents/observability/ai entry. The cloudflare:workers 'tracing' export is accessed via the module namespace with a no-op fallback so runtimes that predate it degrade gracefully instead of failing at module-link time (the observability module loads with the main agents entry). The hand-rolled cloudflare:workers type shim is dropped in favor of @cloudflare/workers-types. Tests run in the agents workers pool. Co-authored-by: msmps <7691252+msmps@users.noreply.github.com>
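The graceful-degradation point relies on a general ES module property: a named import of a missing export fails when the module graph is linked, whereas reading a missing property off a namespace import just yields `undefined`. A minimal sketch, assuming a hypothetical `SketchTracing` shape and helper name (the real `tracing` export's API is not shown in this PR text); a plain object stands in for the module namespace so the snippet runs outside a Worker.

```typescript
// Hypothetical tracer shape, for illustration only.
interface SketchTracing {
  withSpan<T>(name: string, fn: () => T): T;
}

// No-op fallback: runs the callback and records nothing.
const noopTracing: SketchTracing = {
  withSpan: (_name, fn) => fn(),
};

// Reads a possibly-missing export from a module namespace object.
function exportOr<T>(
  moduleNamespace: Record<string, unknown>,
  name: string,
  fallback: T
): T {
  const value = moduleNamespace[name];
  return value === undefined ? fallback : (value as T);
}

// In a Worker the namespace would come from:
//   import * as cfWorkers from "cloudflare:workers";
// Here an empty object plays the role of an older runtime without the export.
const olderRuntime: Record<string, unknown> = {};
const tracing = exportOr(olderRuntime, "tracing", noopTracing);
const result = tracing.withSpan("chat", () => 42); // runs untraced on the fallback
```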

…face

Schema (per semconv research; nothing shipped, renames free):
- span names follow the semconv formula with a 64-byte bare-op fallback: 'invoke_agent {agent}', 'chat {model}', 'execute_tool {tool}'; the stable query key is gen_ai.operation.name, never the span name
- vendor keys move to cloudflare.agents.* (ai.* is the Vercel AI SDK's de-facto namespace); ai.tool.call_id becomes semconv gen_ai.tool.call.id
- failures record otel.status_code: ERROR + error.type (the spec-defined status encoding for status-less backends) instead of a bare error boolean; cancellations record cloudflare.agents.canceled and are not errors
- gen_ai.provider.name normalized to the semconv enum; gen_ai.request.stream emitted only when true; gen_ai.response.time_to_first_chunk and response id/model captured on the stream path

Wrapper fixes surfaced by the trace-content audit:
- AI SDK v6 signals aborts as in-band {type:'abort'} chunks and never rejects with AbortError; recognize them so aborted streams close as canceled instead of false successes
- streaming tools (async-generator execute) keep their execute_tool span open until the iterable is consumed instead of finishing at ~0ms
- tool spans carry gen_ai.tool.call.id from the execute options

Public surface hardening (runtime will gain native OTel support later):
- types renamed to avoid @opentelemetry/api collisions: AgentTracer, AgentSpan, TraceAttributes, TraceAttributeValue; startSpan renamed openSpan (OTel's startSpan means create-without-activating, a semantic inversion)
- createTracer, SpanRuntime, SpanWriter, MaybePromise are private: SpanRuntime is the OTel-convergence seam and must stay free to change

Zero new public surface. Think's streamText call routes through the always-on agents/observability/ai wrapper, so every turn emits an 'invoke_agent {agent class}' root span with 'chat {model}' and 'execute_tool {tool}' children in Workers Observability.
- the admittedTurnContext ALS internally carries trigger/admission/channel/continuation/generation; _turnTelemetry() injects agent identity and turn metadata into experimental_telemetry.metadata (caller values win; inert for the AI SDK's own telemetry unless enabled)
- agents adapters (v6 + v7) project telemetry metadata onto root-span attributes: reserved keys -> cloudflare.agents.turn.*, userId -> user.id, other scalars -> cloudflare.agents.metadata.{key}, objects dropped
- drain loops finalize the underlying model stream on early exit (in-stream error break, stall abort, user abort) via a WeakMap finalizer calling consumeStream; the SDK tees its base stream, so an abandoned tee branch would otherwise leave the operation span open forever
- wrapModel skips middleware for gateway-style string model ids (the root span still carries the model)

Verified against the pinned ai@6.0.208 and fixed:
- stream observation now unwraps the SDK's {part} baseStream envelope; previously real spans missed usage, finish reasons, errors, and aborts (only look-alike test fixtures passed); added real-SDK integration tests (actual streamText + MockLanguageModelV3) covering envelope unwrapping, in-band error/abort parts, tool call ids, and time-to-first-chunk
- removed the eager result-getter 'safeguard': steps/totalUsage/finishReason getters call consumeStream(), so touching them started hidden stream consumption at wrap time; added a laziness regression test
- untraced fast path: when an invocation is not traced the wrapper calls the original operation with the original params, with no tool wrapping, no model middleware, and no stream patching (AgentSpan gains readonly isTraced)
- main agents entry no longer initializes tracing: diagnostics-channel events moved to observability/events.ts; the public barrel composes events+tracing
- provider doStream now runs inside the chat span's activation so provider work nests under it; stream patching fails open on unknown result shapes
- extractors read the public result shapes (inputTokenDetails/outputTokenDetails, response.modelId, deprecated flat fields) and string gateway model ids
- think: agents peer floor raised to >=0.18.0; the early-exit stream drain is idempotent (deleted before invocation) and rides ctx.waitUntil
- v7 tool spans keyed by callId:toolCallId (concurrent id reuse); operation wrappers cached for stable identity; tracer attribute writes fail-safe; cloudflare.agents.operation.id renamed to .operation.name (values are names)
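To make the envelope and in-band signaling concrete, here is a hedged sketch of how a stream observer might classify what it sees. The `{part}` envelope and the `abort` / `error` part types come from the notes above; `unwrapPart`, `classifyStream`, and the outcome labels are hypothetical names, and real parts carry more fields than the `type` read here.

```typescript
type StreamOutcome = "ok" | "canceled" | "error";

// Accepts either a bare part or a { part } envelope; unknown shapes yield undefined.
function unwrapPart(chunk: unknown): { type?: unknown } | undefined {
  if (typeof chunk !== "object" || chunk === null) return undefined;
  const inner = "part" in chunk ? (chunk as { part: unknown }).part : chunk;
  return typeof inner === "object" && inner !== null
    ? (inner as { type?: unknown })
    : undefined;
}

// Classifies a finished stream from its chunks. Aborts arrive in-band,
// so a stream that never rejects can still have been canceled.
function classifyStream(chunks: readonly unknown[]): StreamOutcome {
  for (const chunk of chunks) {
    const part = unwrapPart(chunk);
    if (part?.type === "abort") return "canceled";
    if (part?.type === "error") return "error";
  }
  // Unknown or unrecognized chunks are ignored (fail open).
  return "ok";
}
```

Without the unwrapping step, an enveloped `{ part: { type: "abort" } }` chunk has no top-level `type`, so the abort would go unnoticed and the span would close as a success.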

- untraced calls no longer compute the span spec: roots open with only the semconv name (agent name via direct property reads) and empty attributes; the full spec (metadata enumeration, request fields, context allowlists) is computed after the isTraced check and written through an internal writeSpanAttributes seam, so caller getters/proxies are never enumerated on untraced calls
- think drains the model stream only on early exits (break or throw), via a natural-exhaustion flag; consumeStream is not a no-op (it tees baseStream and traverses the buffered branch), so draining every call was per-inference overhead; a thrown exit (stall watchdog) still drains
- the finalizer runs exactly once: the drain promise is created before ctx.waitUntil, so a missing/throwing waitUntil cannot start a second tee consumer
- async-generator tool bodies are re-entered into the tool span's async context via AsyncLocalStorage.snapshot() on every pull, so spans created inside the body parent under execute_tool (verified in workerd)
- extractors: provider response-metadata stream parts populate response id/model on chat spans; v7 reads public usage detail shapes (inputTokenDetails/outputTokenDetails + deprecated flat fields) and prefers the served response.modelId over the requested event.modelId

The round-2 manual iterator.next() loop dropped for-await's automatic return() forwarding: a consumer breaking while the wrapper was suspended at yield closed the span but never ran the tool generator's own finally blocks. The wrapper now tracks exhaustion and, on early termination, forwards iterator.return() inside the tool span's context before finishing the span. Regression test: consumer breaks after the first yield; the tool generator's cleanup runs (and a span opened in that cleanup parents under execute_tool).

mattzcarey and others added 3 commits July 2, 2026 15:00

feat: initial pass of cloudflare-native ai tracing

bd6f59a

Co-authored-by: msmps <7691252+msmps@users.noreply.github.com>

feat: add ai sdk v7 telemetry support to ai-tracing

006ac94

Co-authored-by: msmps <7691252+msmps@users.noreply.github.com>

mattzcarey changed the title ~~feat: add @cloudflare/ai-tracing — Cloudflare-native tracing for the AI SDK~~ feat: Cloudflare-native AI tracing via agents/observability Jul 2, 2026

mattzcarey changed the title ~~feat: Cloudflare-native AI tracing via agents/observability~~ feat: Cloudflare-native AI tracing (agents/observability + agents/observability/ai) Jul 3, 2026

mattzcarey added 4 commits July 3, 2026 14:10

mattzcarey wants to merge 9 commits into main from feat/agent-tracing
