fix: scope HITL waiters per agent tool call #22088

Open
fengjikui wants to merge 1 commit into run-llama:main from fengjikui:codex/hitl-waiter-id
@fengjikui

Fixes #22070.

Summary

Parallel FunctionAgent tool calls can each enter the documented HITL pattern via ctx.wait_for_event(...). When those tool functions do not pass an explicit waiter_id, the workflow runtime derives the same waiter key for every parallel branch, so the later waiter overwrites the earlier one and only one InputRequiredEvent is emitted.

This PR scopes the default waiter id used by agent tool contexts to the current tool call id. Explicit waiter_id values are preserved, so callers that intentionally coordinate multiple waits under a shared id can still do so.
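The collision can be illustrated with a toy model. This is only a sketch: `ToyWaiterRegistry`, its `waiter_<event type>` default key, and the "emit only on first registration" rule are assumptions made for illustration, not the workflow runtime's actual internals. Only the `agent_tool_call:<tool_id>` format comes from this PR.

```python
from typing import Dict, List, Optional


class ToyWaiterRegistry:
    """Hypothetical stand-in for a runtime's waiter table (not llama_index code)."""

    def __init__(self) -> None:
        self.waiters: Dict[str, str] = {}  # waiter key -> branch that owns it
        self.emitted: List[str] = []  # branches whose prompt was emitted

    def wait_for_event(
        self, branch: str, event_type: str, waiter_id: Optional[str] = None
    ) -> None:
        # Assumption: without an explicit id, the key depends only on the
        # event type, so every parallel branch derives the same key.
        key = waiter_id if waiter_id is not None else f"waiter_{event_type}"
        if key not in self.waiters:
            # Assumption: the prompt is emitted only when the key is new.
            self.emitted.append(branch)
        # A later registration under the same key replaces the earlier one.
        self.waiters[key] = branch


def run(scoped: bool) -> ToyWaiterRegistry:
    registry = ToyWaiterRegistry()
    for tool_id in ("call_1", "call_2", "call_3"):
        waiter_id = f"agent_tool_call:{tool_id}" if scoped else None
        registry.wait_for_event(tool_id, "HumanResponseEvent", waiter_id)
    return registry


unscoped = run(scoped=False)
scoped = run(scoped=True)
print(len(unscoped.waiters), unscoped.emitted)
print(len(scoped.waiters), scoped.emitted)
```

In this model the unscoped run ends with a single waiter owned by the last branch and a single emitted prompt, while the scoped run keeps one waiter and one prompt per tool call.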

Changes

  • Add a small delegated tool-call context that forwards all Context behavior but fills a missing waiter_id with agent_tool_call:<tool_id>.
  • Pass that scoped context into context-aware FunctionTool calls from both BaseWorkflowAgent and AgentWorkflow.
  • Add a regression test with two parallel HITL tool calls; both now emit InputRequiredEvents and the workflow completes after both responses.
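The delegation idea in the first bullet might look roughly like the sketch below. The names `ToolCallScopedContext` and `FakeContext` are hypothetical and the `wait_for_event` signature is simplified; the actual class in the diff may differ. The sketch only shows the two behaviors described above: a missing `waiter_id` is filled from the tool call id, and an explicit one is passed through unchanged.

```python
import asyncio
from typing import Any, List, Optional, Tuple


class FakeContext:
    """Hypothetical stand-in for a workflow Context; it just records calls."""

    def __init__(self) -> None:
        self.seen: List[Tuple[str, Optional[str]]] = []
        self.state = {"runs": 0}

    async def wait_for_event(
        self, event_type: str, waiter_id: Optional[str] = None, **kwargs: Any
    ) -> str:
        self.seen.append((event_type, waiter_id))
        return f"{event_type} via {waiter_id}"


class ToolCallScopedContext:
    """Forwards everything to the wrapped context, scoping the default waiter id."""

    def __init__(self, ctx: Any, tool_id: str) -> None:
        self._ctx = ctx
        self._tool_id = tool_id

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails, so it delegates every
        # attribute this wrapper does not define itself.
        return getattr(self._ctx, name)

    async def wait_for_event(
        self, event_type: str, waiter_id: Optional[str] = None, **kwargs: Any
    ) -> Any:
        if waiter_id is None:
            # Fill only a missing id; explicit ids are preserved.
            waiter_id = f"agent_tool_call:{self._tool_id}"
        return await self._ctx.wait_for_event(
            event_type, waiter_id=waiter_id, **kwargs
        )


async def main() -> Tuple[FakeContext, ToolCallScopedContext]:
    base = FakeContext()
    a = ToolCallScopedContext(base, "call_a")
    b = ToolCallScopedContext(base, "call_b")
    await asyncio.gather(
        a.wait_for_event("HumanResponseEvent"),
        b.wait_for_event("HumanResponseEvent"),
        b.wait_for_event("HumanResponseEvent", waiter_id="shared"),
    )
    return base, a


base, wrapper = asyncio.run(main())
print(sorted(str(waiter) for _, waiter in base.seen))
```

A wrapper of this kind leaves the underlying context untouched, so tools that never call `wait_for_event` see no behavioral difference.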

Validation

  • Reproduced the issue on current main with the issue PoC: only 1 of 3 InputRequiredEvents was emitted and the workflow timed out.
  • uv run --python 3.12 --frozen pytest tests/agent/workflow/test_multi_agent_workflow.py::test_parallel_hitl_tool_calls_have_scoped_waiters tests/agent/workflow/test_multi_agent_workflow.py::test_agent_with_hitl tests/agent/workflow/test_multi_agent_workflow.py::test_workflow_pickle_serialize_and_resume -q
  • uv run --python 3.12 --frozen pytest tests/agent/workflow/test_function_call.py::test_call_tool_with_exception tests/agent/workflow/test_function_call.py -q
  • uv run --python 3.12 --frozen ruff check llama_index/core/agent/workflow/base_agent.py llama_index/core/agent/workflow/multi_agent_workflow.py tests/agent/workflow/test_multi_agent_workflow.py
  • uv run --python 3.12 --frozen ruff format llama_index/core/agent/workflow/base_agent.py llama_index/core/agent/workflow/multi_agent_workflow.py tests/agent/workflow/test_multi_agent_workflow.py --check

AI assistance was used to inspect the code path and draft the patch; I manually reviewed the diff and validated it with the checks above.

@fengjikui marked this pull request as ready for review on June 22, 2026, 15:22
@dosubot (bot) added the size:M label (This PR changes 30-99 lines, ignoring generated files.) on Jun 22, 2026
Development

Successfully merging this pull request may close these issues.

[Bug]: FunctionAgent parallel tool fan-out + ctx.wait_for_event(HumanResponseEvent) collide on a shared waiter_id, hanging all but one branch