fix(closes OPEN-10883): correct root step timestamp and use semantic step types#651
Open
viniciusdsmello wants to merge 1 commit into
Open
Conversation
ebf200d to
c023520
Compare
… types Review of the OpenAI Agents SDK tracing processor (OPEN-10883). Addresses several issues affecting how traces render in Openlayer: 1. Root step timestamp (the reported bug): the synthetic "Agent Workflow" root step was constructed inside on_trace_end, so steps.Step.__init__ stamped its start_time with time.time() at trace-end. This placed the parent step after its own children on the timeline and produced an incorrect inferenceTimestamp for the whole trace. The root step is now anchored to the start time captured in on_trace_start, clamped to the earliest child start time. This confirms the issue was in the instrumentation rather than the UI. 2. Step types: agent, function and handoff spans were all emitted as generic UserCallStep, so the backend bucketed them as user_call and lost dedicated rendering. They now use the SDK's typed steps via step_factory (AgentStep/ToolStep/HandoffStep) with their typed fields populated, matching the convention used by the langchain/google_adk/claude_agent_sdk integrations. 3. New SDK span types (openai-agents 0.4.2): guardrail, mcp_tools, speech, transcription and speech_group spans previously fell through to the generic handler. They are now parsed and mapped to GuardrailStep, ToolStep and ChatCompletionStep (speech/transcription) respectively; speech_group remains a generic container step. The TracingProcessor interface itself is unchanged. 4. Dangling spans: when a guardrail tripwire fires the SDK aborts in-flight spans without emitting on_span_end, leaving end_time None. on_trace_end now backfills those steps (bounded by their parent's end time) and flags them with metadata.incomplete so they render with a duration instead of no end. Also adds a guardrail scenario to the existing examples/tracing/openai/openai_agents_tracing.ipynb notebook so it exercises the GUARDRAIL step type alongside the existing agent/tool/handoff steps. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
c023520 to
2deef85
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Review of the OpenAI Agents SDK tracing processor (OPEN-10883). Fixes three issues affecting how traces render in Openlayer.
1. Root step timestamp (the reported bug)
The synthetic "Agent Workflow" root step was constructed inside
on_trace_end, sosteps.Step.__init__stamped itsstart_timewithtime.time()at trace-end. This placed the parent step after its own children on the timeline and produced an incorrectinferenceTimestampfor the whole trace — exactly the rendering issue reported. The correct start time was already captured inon_trace_startbut never applied.Fix: anchor the root step to the
on_trace_starttime, clamped to the earliest child's start time (defensive against minor SDK/wall-clock skew). This confirms the issue was in our instrumentation rather than the UI.2. Semantic step types
agent,function, andhandoffspans were all emitted as genericUserCallStep, so the backend bucketed them asuser_calland lost dedicated rendering. They now use the SDK's typed steps viastep_factory—AgentStep,ToolStep,HandoffStep— with their typed fields populated, matching the convention used by the langchain / google_adk / claude_agent_sdk integrations.3. New SDK span types (
openai-agents0.4.2)guardrail,mcp_tools,speech,transcription, andspeech_groupspans previously fell through to the generic handler. They are now parsed and mapped:TraceType)agentAgentStep(agent)function,mcp_toolsToolStep(tool)handoffHandoffStep(handoff)guardrailGuardrailStep(guardrail)generation/response/speech/transcriptionChatCompletionStep(chat_completion)speech_groupUserCallStep(user_call) — voice containerThe
TracingProcessorinterface itself is unchanged in 0.4.2.Testing
span_dataobjects: timeline invariant (parent start ≤ all children), all 8 step-type mappings, and typed-field population all verified.ruff checkclean; import smoke test; conditional-import regression test (tests/test_integration_conditional_imports.py) passes.🤖 Generated with Claude Code