Python: Fix background agent telemetry context error by westey-m · Pull Request #6764 · microsoft/agent-framework

westey-m · 2026-06-26T13:17:32Z

Motivation & Context

Telemetry threw whenever an agent was invoked through the background-agents
provider. Running background agents (e.g. three concurrent tasks via
BackgroundAgentsProvider) failed with:

Task failed: <Token var=<ContextVar name='inner_response_telemetry_captured_fields' ...>> was created in a different Context

This made background agents unusable while OpenTelemetry instrumentation was
enabled.

Description & Review Guide

Root cause. AgentTelemetryLayer.run() is a synchronous method that returns
an awaitable. It eagerly called .set() on two telemetry context variables —
INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDS and INNER_ACCUMULATED_USAGE — in the
caller's execution context, while the non-streaming _run() coroutine reset
those tokens in its finally block. BackgroundAgentsProvider schedules the
returned coroutine with asyncio.create_task(agent.run(...)), so the coroutine
runs in a new copied context. A contextvars.Token can only be reset in the
exact context that created it, so the .reset() raised
ValueError: <Token ...> was created in a different Context.

Normal await agent.run() worked because the eager .set() (caller context) and
the finally .reset() happened to run in the same context; the bug only
surfaced when the returned awaitable was scheduled onto a different task — a
legitimate and common pattern.

What are the major changes?
- Non-streaming path: move the two .set() calls into the start of the
  _run() coroutine body, so the set and the existing finally reset always
  execute in the same context.
- Streaming path: keep the eager .set() at the top of the if stream:
  block — the streaming branch returns a ResponseStream synchronously and is
  consumed in the same task/context, so its cleanup-hook reset runs in the
  context that created the tokens. Background agents are non-streaming, so the
  streaming path is unaffected.
- Both branches are documented with comments explaining the context constraint.
- Added a regression test
  (test_agent_run_contextvars_safe_when_awaited_in_different_context) that
  invokes run(stream=False) synchronously, then awaits the returned coroutine
  inside asyncio.create_task (a different context), and asserts it completes
  without the cross-context ValueError.
What is the impact of these changes?
- Background agents run correctly with instrumentation enabled. Each invocation's
  contextvars are now set and reset within its own context copy, which also
  improves isolation across concurrent background agents. No public API changes;
  normal await agent.run() behavior is unchanged.
What do you want reviewers to focus on?
- That moving the .set() into _run() for the non-streaming path (while keeping
  the streaming path eager) is the right split, and that the streaming cleanup
  hooks still reset in the token-creating context.

Related Issue

Fixes #6762

Contribution Checklist

The code builds clean without any errors or warnings
All unit tests pass, and I have added new tests where possible
The PR follows the Contribution Guidelines
This PR is linked to an issue and there is no other open PR for this issue (see Related Issue above).
This is not a breaking change. If it is a breaking change, add the breaking change label (or add "[BREAKING]" to the title prefix, before or after any language prefix) — a workflow keeps the label and title prefix in sync automatically.

github-actions · 2026-06-26T13:22:28Z

Python Test Coverage Report •

File	Stmts	Miss	Cover	Missing
packages/core/agent_framework
observability.py	1009	87	91%	408, 410–411, 414, 417, 420–421, 426–427, 433–434, 440–441, 448, 450–452, 455–457, 462–463, 469–470, 476–477, 484, 661–662, 861, 865–867, 869, 873–874, 878, 916, 918, 929–931, 933–935, 939, 947, 1071–1072, 1307, 1582–1583, 1694, 1832, 1874–1875, 2057, 2340–2343, 2349, 2375–2376, 2404–2415, 2534, 2537, 2549, 2566, 2570–2571, 2574, 2580, 2689, 2906, 2908
TOTAL	42571	5087	88%

Python Unit Test Overview

Tests	Skipped	Failures	Errors	Time
8342	37 💤	0 ❌	0 🔥	2m 12s ⏱️

Copilot

Pull request overview

Fixes a Python OpenTelemetry instrumentation bug where non-streaming agent invocations could raise ValueError: <Token ...> was created in a different Context when AgentTelemetryLayer.run() is called synchronously and the returned awaitable is later awaited in a different task/context (e.g., BackgroundAgentsProvider via asyncio.create_task(agent.run(...))).

Changes:

Non-streaming agent path: move INNER_RESPONSE_TELEMETRY_CAPTURED_FIELDS / INNER_ACCUMULATED_USAGE contextvar .set() calls into the _run() coroutine so .set() and .reset() always occur in the same execution context.
Streaming agent path: keep eager contextvar .set() where the ResponseStream is created, with added commentary about the context/token constraint.
Add a regression test that calls run(stream=False) synchronously and awaits the returned coroutine inside asyncio.create_task(...).

Reviewed changes

Copilot reviewed 2 out of 3 changed files in this pull request and generated no comments.

File	Description
python/packages/core/agent_framework/observability.py	Adjusts where telemetry contextvars are set to avoid cross-context token resets in non-streaming invocations; adds explanatory comments.
python/packages/core/tests/core/test_observability.py	Adds a regression test covering awaiting `run()`’s returned awaitable in a separate task/context.
python/uv.lock	Updates locked dependency constraints/versions (not described in PR motivation/scope).

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 92%

✓ Correctness

The fix correctly moves contextvar .set() calls into the _run() coroutine body for the non-streaming path, ensuring set and reset always execute in the same context. The streaming path correctly keeps eager setting since it's consumed synchronously in the caller's context. The regression test accurately mimics the BackgroundAgentsProvider pattern. The lock file changes are unrelated dependency version pins. No correctness issues found.

✓ Security Reliability

The fix correctly addresses the cross-context ValueError by moving contextvars.set() calls into the _run() coroutine body for the non-streaming path, ensuring set/reset always happen in the same execution context. The streaming path remains correctly eager since it's consumed synchronously in the same context. The uv.lock changes are dependency version pins unrelated to the core fix. No security or reliability issues found.

✓ Test Coverage

The regression test covers the primary scenario well — verifying that the non-streaming path's contextvar set/reset works correctly when the coroutine is awaited in a different context via asyncio.create_task. The test assertions are meaningful (checks AgentResponse type, span count, and non-error status). However, the error path (where the inner agent raises) in a cross-context scenario is not tested, even though the finally block at line 1947-1949 performs the same token reset that caused the original bug.

✓ Failure Modes

The fix correctly moves contextvar .set() calls into the _run() coroutine body for the non-streaming path, ensuring set/reset pairs always execute in the same context. The streaming path appropriately keeps eager .set() since its cleanup runs synchronously in the same context. The regression test properly exercises the asyncio.create_task pattern used by BackgroundAgentsProvider. The uv.lock changes are unrelated dependency version pins. No failure modes introduced.

✓ Design Approach

The non-streaming fix is on the right track, but the streaming branch still relies on the same cross-context assumption that caused the original bug. Because ResponseStream cleanup hooks run in the context that consumes the stream, not the context that created it, eagerly setting the telemetry ContextVars before returning the stream can still raise the same Token was created in a different Context error when a stream is consumed from another task/context.

Automated review by westey-m's agents

github-actions

Automated Code Review

Reviewers: 5 | Confidence: 91%

✓ Correctness

The core fix is correct: moving the contextvars .set() calls into the _run() coroutine body for the non-streaming path ensures set/reset always happen in the same execution context, fixing the cross-context ValueError when asyncio.create_task schedules the returned coroutine. The test properly reproduces the background-agent pattern. The streaming path concern is already covered by an existing unresolved review comment. No new correctness issues found.

✓ Security Reliability

The non-streaming fix is correct: moving the contextvar .set() calls into the _run() coroutine body ensures set/reset always happen in the same execution context, resolving the cross-context ValueError for background agents. The try/finally block properly guarantees token cleanup on all paths. The streaming path concern is already covered by an existing unresolved review comment and is not restated here. No new security or reliability issues found in this change.

✓ Test Coverage

The regression test is well-crafted and directly exercises the exact failure scenario (non-streaming run() called synchronously, then awaited via asyncio.create_task in a different context). Assertions are meaningful — checking successful completion, span count, and no error status. The fix itself correctly co-locates the contextvar .set() and .reset() inside the _run() coroutine body for the non-streaming path. One minor test coverage gap: no error-path variant of the cross-context test (agent raises inside _run() while awaited in a different context). The streaming cross-context concern is already flaged by an existing unresolved review comment, so I won't duplicate it.

✓ Failure Modes

The non-streaming path fix is correct: moving the contextvar .set() calls into the _run() coroutine body ensures set and reset always execute in the same context, fixing the cross-context ValueError for BackgroundAgentsProvider. The finally block at lines 1948-1949 properly resets both tokens. The streaming path concern (cleanup hooks running in the consumer's context rather than the token-creating context) is already captured by the existing unresolved review comment. No additional silent failures, lost errors, or operational failure modes are introduced by this change.

✓ Design Approach

I did not find any additional design-approach issues beyond the already-open streaming-context concern in the review thread. The non-streaming change matches the documented background-agent call path, and the new regression test covers the specific cross-context await pattern the PR is fixing.

Suggestions

Consider adding an error-path variant of the cross-context regression test — e.g. an agent whose run() returns a coroutine that raises, awaited via asyncio.create_task. This would verify the finally block (observability.py:1947-1949) properly resets the tokens even when the exception propagates across context boundaries. The mechanism should already work correctly since both .set() and finally .reset() live inside _run(), but an explicit test would guard against future regressions.

Automated review by westey-m's agents

eavanvalkenburg · 2026-06-26T14:14:51Z

    { name = "ag-ui-protocol", specifier = ">=0.1.16,<0.2" },
    { name = "agent-framework-core", editable = "packages/core" },
-    { name = "fastapi", specifier = ">=0.115.0,<0.138.1" },
+    { name = "fastapi", specifier = ">=0.115.0,<0.133.1" },


you're downgrading multiple packages, that can't be right

TaoChenOSU · 2026-06-26T23:00:22Z

+                # The contextvars are set lazily on the first pull, which only happens after this
+                # synchronous setup phase returns the stream, so the tokens are still unset here and
+                # there is nothing to reset.


nit: the comments here isn't associated with any actual code. I suspect these are copilot notes. Maybe we can remove.

Fix issue when using background agents with telemetry

0acbe23

Copilot AI review requested due to automatic review settings June 26, 2026 13:17

moonbox3 added the python Usage: [Issues, PRs], Target: Python label Jun 26, 2026

Copilot started reviewing on behalf of westey-m June 26, 2026 13:19 View session

Copilot AI reviewed Jun 26, 2026

View reviewed changes

github-actions Bot reviewed Jun 26, 2026

View reviewed changes

Comment thread python/packages/core/agent_framework/observability.py Outdated

westey-m marked this pull request as ready for review June 26, 2026 13:27

github-actions Bot reviewed Jun 26, 2026

View reviewed changes

Address PR comments

6a14aa5

eavanvalkenburg reviewed Jun 26, 2026

View reviewed changes

TaoChenOSU reviewed Jun 26, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Python: Fix background agent telemetry context error#6764

Python: Fix background agent telemetry context error#6764
westey-m wants to merge 2 commits into
microsoft:mainfrom
westey-m:python-backgroundagents-telemetry

westey-m commented Jun 26, 2026

Uh oh!

github-actions Bot commented Jun 26, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

github-actions Bot left a comment

Uh oh!

Uh oh!

github-actions Bot left a comment

Uh oh!

eavanvalkenburg Jun 26, 2026

Uh oh!

TaoChenOSU Jun 26, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

Uh oh!

Conversation

westey-m commented Jun 26, 2026

Motivation & Context

Description & Review Guide

Related Issue

Contribution Checklist

Uh oh!

github-actions Bot commented Jun 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Python Unit Test Overview

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✓ Failure Modes

✓ Design Approach

Uh oh!

Uh oh!

github-actions Bot left a comment

Choose a reason for hiding this comment

Automated Code Review

✓ Correctness

✓ Security Reliability

✓ Test Coverage

✓ Failure Modes

✓ Design Approach

Suggestions

Uh oh!

eavanvalkenburg Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

TaoChenOSU Jun 26, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

github-actions Bot commented Jun 26, 2026 •

edited

Loading