Python: Fix background agent telemetry context error#6764
Conversation
Python Test Coverage Report •
Python Unit Test Overview
|
||||||||||||||||||||||||||||||
There was a problem hiding this comment.
Pull request overview
Fixes a Python OpenTelemetry instrumentation bug where non-streaming agent invocations could raise ValueError: <Token ...> was created in a different Context when AgentTelemetryLayer.run() is called synchronously and the returned awaitable is later awaited in a different task/context (e.g., BackgroundAgentsProvider via asyncio.create_task(agent.run(...))).
Changes:
- Non-streaming agent path: move
INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDS/INNER_ACCUMULATED_USAGEcontextvar.set()calls into the_run()coroutine so.set()and.reset()always occur in the same execution context. - Streaming agent path: keep eager contextvar
.set()where theResponseStreamis created, with added commentary about the context/token constraint. - Add a regression test that calls
run(stream=False)synchronously and awaits the returned coroutine insideasyncio.create_task(...).
Reviewed changes
Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| python/packages/core/agent_framework/observability.py | Adjusts where telemetry contextvars are set to avoid cross-context token resets in non-streaming invocations; adds explanatory comments. |
| python/packages/core/tests/core/test_observability.py | Adds a regression test covering awaiting run()’s returned awaitable in a separate task/context. |
| python/uv.lock | Updates locked dependency constraints/versions (not described in PR motivation/scope). |
There was a problem hiding this comment.
Automated Code Review
Reviewers: 5 | Confidence: 92%
✓ Correctness
The fix correctly moves contextvar
.set()calls into the_run()coroutine body for the non-streaming path, ensuring set and reset always execute in the same context. The streaming path correctly keeps eager setting since it's consumed synchronously in the caller's context. The regression test accurately mimics theBackgroundAgentsProviderpattern. The lock file changes are unrelated dependency version pins. No correctness issues found.
✓ Security Reliability
The fix correctly addresses the cross-context ValueError by moving contextvars.set() calls into the _run() coroutine body for the non-streaming path, ensuring set/reset always happen in the same execution context. The streaming path remains correctly eager since it's consumed synchronously in the same context. The uv.lock changes are dependency version pins unrelated to the core fix. No security or reliability issues found.
✓ Test Coverage
The regression test covers the primary scenario well — verifying that the non-streaming path's contextvar set/reset works correctly when the coroutine is awaited in a different context via
asyncio.create_task. The test assertions are meaningful (checksAgentResponsetype, span count, and non-error status). However, the error path (where the inner agent raises) in a cross-context scenario is not tested, even though thefinallyblock at line 1947-1949 performs the same token reset that caused the original bug.
✓ Failure Modes
The fix correctly moves contextvar .set() calls into the _run() coroutine body for the non-streaming path, ensuring set/reset pairs always execute in the same context. The streaming path appropriately keeps eager .set() since its cleanup runs synchronously in the same context. The regression test properly exercises the asyncio.create_task pattern used by BackgroundAgentsProvider. The uv.lock changes are unrelated dependency version pins. No failure modes introduced.
✓ Design Approach
The non-streaming fix is on the right track, but the streaming branch still relies on the same cross-context assumption that caused the original bug. Because
ResponseStreamcleanup hooks run in the context that consumes the stream, not the context that created it, eagerly setting the telemetryContextVars before returning the stream can still raise the sameToken was created in a different Contexterror when a stream is consumed from another task/context.
Automated review by westey-m's agents
There was a problem hiding this comment.
Automated Code Review
Reviewers: 5 | Confidence: 91%
✓ Correctness
The core fix is correct: moving the contextvars
.set()calls into the_run()coroutine body for the non-streaming path ensures set/reset always happen in the same execution context, fixing the cross-context ValueError whenasyncio.create_taskschedules the returned coroutine. The test properly reproduces the background-agent pattern. The streaming path concern is already covered by an existing unresolved review comment. No new correctness issues found.
✓ Security Reliability
The non-streaming fix is correct: moving the contextvar
.set()calls into the_run()coroutine body ensures set/reset always happen in the same execution context, resolving the cross-context ValueError for background agents. Thetry/finallyblock properly guarantees token cleanup on all paths. The streaming path concern is already covered by an existing unresolved review comment and is not restated here. No new security or reliability issues found in this change.
✓ Test Coverage
The regression test is well-crafted and directly exercises the exact failure scenario (non-streaming
run()called synchronously, then awaited viaasyncio.create_taskin a different context). Assertions are meaningful — checking successful completion, span count, and no error status. The fix itself correctly co-locates the contextvar.set()and.reset()inside the_run()coroutine body for the non-streaming path. One minor test coverage gap: no error-path variant of the cross-context test (agent raises inside_run()while awaited in a different context). The streaming cross-context concern is already flaged by an existing unresolved review comment, so I won't duplicate it.
✓ Failure Modes
The non-streaming path fix is correct: moving the contextvar .set() calls into the _run() coroutine body ensures set and reset always execute in the same context, fixing the cross-context ValueError for BackgroundAgentsProvider. The finally block at lines 1948-1949 properly resets both tokens. The streaming path concern (cleanup hooks running in the consumer's context rather than the token-creating context) is already captured by the existing unresolved review comment. No additional silent failures, lost errors, or operational failure modes are introduced by this change.
✓ Design Approach
I did not find any additional design-approach issues beyond the already-open streaming-context concern in the review thread. The non-streaming change matches the documented background-agent call path, and the new regression test covers the specific cross-context await pattern the PR is fixing.
Suggestions
- Consider adding an error-path variant of the cross-context regression test — e.g. an agent whose
run()returns a coroutine that raises, awaited viaasyncio.create_task. This would verify thefinallyblock (observability.py:1947-1949) properly resets the tokens even when the exception propagates across context boundaries. The mechanism should already work correctly since both.set()andfinally .reset()live inside_run(), but an explicit test would guard against future regressions.
Automated review by westey-m's agents
| { name = "ag-ui-protocol", specifier = ">=0.1.16,<0.2" }, | ||
| { name = "agent-framework-core", editable = "packages/core" }, | ||
| { name = "fastapi", specifier = ">=0.115.0,<0.138.1" }, | ||
| { name = "fastapi", specifier = ">=0.115.0,<0.133.1" }, |
There was a problem hiding this comment.
you're downgrading multiple packages, that can't be right
| # The contextvars are set lazily on the first pull, which only happens after this | ||
| # synchronous setup phase returns the stream, so the tokens are still unset here and | ||
| # there is nothing to reset. |
There was a problem hiding this comment.
nit: the comments here isn't associated with any actual code. I suspect these are copilot notes. Maybe we can remove.
Motivation & Context
Telemetry threw whenever an agent was invoked through the background-agents
provider. Running background agents (e.g. three concurrent tasks via
BackgroundAgentsProvider) failed with:This made background agents unusable while OpenTelemetry instrumentation was
enabled.
Description & Review Guide
Root cause.
AgentTelemetryLayer.run()is a synchronous method that returnsan awaitable. It eagerly called
.set()on two telemetry context variables —INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDSandINNER_ACCUMULATED_USAGE— in thecaller's execution context, while the non-streaming
_run()coroutine resetthose tokens in its
finallyblock.BackgroundAgentsProviderschedules thereturned coroutine with
asyncio.create_task(agent.run(...)), so the coroutineruns in a new copied context. A
contextvars.Tokencan only be reset in theexact context that created it, so the
.reset()raisedValueError: <Token ...> was created in a different Context.Normal
await agent.run()worked because the eager.set()(caller context) andthe
finally.reset()happened to run in the same context; the bug onlysurfaced when the returned awaitable was scheduled onto a different task — a
legitimate and common pattern.
What are the major changes?
.set()calls into the start of the_run()coroutine body, so the set and the existingfinallyreset alwaysexecute in the same context.
.set()at the top of theif stream:block — the streaming branch returns a
ResponseStreamsynchronously and isconsumed in the same task/context, so its cleanup-hook reset runs in the
context that created the tokens. Background agents are non-streaming, so the
streaming path is unaffected.
(
test_agent_run_contextvars_safe_when_awaited_in_different_context) thatinvokes
run(stream=False)synchronously, then awaits the returned coroutineinside
asyncio.create_task(a different context), and asserts it completeswithout the cross-context
ValueError.What is the impact of these changes?
contextvars are now set and reset within its own context copy, which also
improves isolation across concurrent background agents. No public API changes;
normal
await agent.run()behavior is unchanged.What do you want reviewers to focus on?
.set()into_run()for the non-streaming path (while keepingthe streaming path eager) is the right split, and that the streaming cleanup
hooks still reset in the token-creating context.
Related Issue
Fixes #6762
Contribution Checklist
breaking changelabel (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.