5 changes: 4 additions & 1 deletion python/packages/core/AGENTS.md
@@ -120,7 +120,10 @@ agent_framework/

- **`AgentLoopMiddleware`** - `AgentMiddleware` that re-runs an agent in a loop by calling `call_next()` repeatedly (the pipeline re-reads `context.messages` each time). One configurable class covers two patterns: a required user `should_continue` predicate (sync or async, the first positional/keyword arg), and a chat-client judge built via the `.with_judge(...)` factory (a second chat client decides whether the original request was answered; loops while it is *not*, using a `JudgeVerdict` structured-output response — internally just an async `should_continue` predicate). The constructor covers the predicate pattern directly; only the judge has a convenience classmethod factory (`.with_judge(judge_client, ...)`) that forwards to `__init__`. Supports both streaming and non-streaming runs. By default a non-streaming run returns an aggregated `AgentResponse` containing every iteration's messages plus the injected `next_message` "nudge" messages (as `user` messages); set `return_final_only=True` to return only the last iteration's response. Streaming runs always yield each iteration's updates and emit the injected nudge messages as `user` updates between iterations (the `return_final_only` flag has no effect on streaming, and the final response reflects the last iteration; `MiddlewareTermination` is handled cleanly). `should_continue` is required; other constructor args are optional: `max_iterations` (safety cap; defaults to `DEFAULT_MAX_ITERATIONS`=10, explicit `None`→unbounded, positive int caps; `.with_judge` uses `DEFAULT_JUDGE_MAX_ITERATIONS`=5 as its default), `next_message` (defaults to a short "continue" nudge), `return_final_only`, and `additional_instructions` (an extra `system` message injected ahead of the input before the agent runs — becomes part of the original messages so it survives `fresh_context` resets and persists via a session). 
The judge is configured only through `.with_judge` (`judge_client`/`instructions`/`criteria`), not the constructor, and its `reasoning` is fed back to the agent as the next iteration's input; the judge forwards the original request messages and the agent's latest response messages verbatim so multi-modal content is preserved. `criteria` (a `list[str]`) is both injected as the agent's `additional_instructions` and rendered into the judge instructions wherever the `{{criteria}}` placeholder (`CRITERIA_PLACEHOLDER`) appears (`DEFAULT_JUDGE_INSTRUCTIONS` ends with it; custom `instructions` may include it, and it is stripped when no criteria are given). The `should_continue`/`next_message` callables are invoked with keyword args (`iteration`, `last_result`, `messages`, `original_messages`, `session`, `agent`, `progress`, `feedback`) and may be sync or async; declare only what you need plus `**kwargs`. `should_continue` may return a plain `bool` or a `(bool, str | None)` tuple whose second item is feedback surfaced to `next_message`/`record_feedback` via the `feedback` kwarg (the judge uses this to relay its `reasoning`). Stop precedence per iteration is `max_iterations` → `should_continue`, evaluated before `record_feedback` so the feedback is available to it.
- **Feedback tracking** - `record_feedback` captures a per-iteration progress entry (called with the loop kwargs; if it returns a truthy string the entry is appended, otherwise the agent's response text is used as the fallback entry). The accumulated log is exposed to every callback via the `progress` keyword (a per-iteration copy of prior entries) and, when `inject_progress=True` (default), injected into the next iteration's input as a `user` message (the full log without a session, only the latest entry with a session to avoid duplicating history). `fresh_context=True` restarts each iteration from the original task plus the progress log; when a session is attached it is snapshotted (`to_dict()`) before the loop and restored (`from_dict` + field copy) between iterations so the local transcript and any service-side conversation id reset too (in-loop working-state is discarded, pre-loop state preserved, continuity carried only by the progress log).
- **`todos_remaining(provider)`** / **`background_tasks_running(provider)`** - Helper factories returning `should_continue` predicates that loop while a `TodoProvider` has open items, or while a `BackgroundAgentsProvider`'s persisted state shows running tasks.
- **`todos_remaining(*, modes=None)`** / **`todos_remaining_message`** - Helper factories for todo-driven loops (the Python counterpart of .NET's `TodoCompletionLoopEvaluator`), designed for `create_harness_agent` but usable with any agent that registers a `TodoProvider` via `context_providers`. They resolve the `TodoProvider`/`AgentModeProvider` from the *running agent* (`agent.context_providers`, via `_resolve_context_provider`) rather than taking the provider as an argument, so they can be wired directly into `loop_should_continue`/`loop_next_message`. `todos_remaining` returns a `should_continue` predicate that loops while any todo is open; pass `modes=[...]` to gate looping to specific operating modes (case-insensitive; honors the `AgentModeProvider`'s `source_id`/`available_modes`), `modes=None` (default) applies in every mode, and an empty sequence raises `ValueError`. `todos_remaining_message` is a `next_message` callable that lists the still-open todo titles and tells the agent to finish them, returning `None` (→ default nudge) when the session/agent/provider is unavailable or nothing is open.
- **`background_tasks_running(provider)`** - Helper factory returning a `should_continue` predicate that loops while a `BackgroundAgentsProvider`'s persisted state shows running tasks (takes the provider explicitly, unlike `todos_remaining`).
- **Approval escape hatch** - `_has_pending_approval_request(result)` checks whether an iteration's response carries a pending tool-approval request (any content with `type == "function_approval_request"`). Both the streaming and non-streaming loops stop and return that response to the caller *before* evaluating `should_continue`/`max_iterations` or injecting `next_message`, so the loop is HITL-safe even when wrapped outermost around a `ToolApprovalMiddleware` (mirrors the C# `LoopAgent`'s `HasPendingApprovalRequests`).
- **Harness integration** - `create_harness_agent` enables the loop when a `loop_should_continue` callable is passed; it prepends `AgentLoopMiddleware(loop_should_continue, max_iterations=loop_max_iterations, next_message=loop_next_message)` ahead of `ToolApprovalMiddleware` so the loop is the outermost middleware (each iteration is a full agent run including tool approval, and the escape hatch hands pending approvals back to the caller). `loop_next_message` and `loop_max_iterations` only take effect together with `loop_should_continue` (with no `loop_should_continue` there is no loop, so they are ignored); `loop_max_iterations` defaults to the loop's default cap (`None` → unbounded).
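
The control flow described in the bullets above can be sketched without the framework. This is a minimal illustration under stated assumptions, not the `AgentLoopMiddleware` implementation: `run_loop`, `maybe_await`, `has_pending_approval`, `fake_agent` and `until_third` are hypothetical names, results are plain dicts rather than `AgentResponse` objects, and the exact point at which the cap is compared is a guess. It only shows the documented ordering: the approval check comes first, then `max_iterations`, then `should_continue`, which may return a `bool` or a `(bool, feedback)` tuple.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional

DEFAULT_MAX_ITERATIONS = 10  # default cap quoted in the notes above


async def maybe_await(value: Any) -> Any:
    """Accept sync and async callbacks alike."""
    if inspect.isawaitable(value):
        return await value
    return value


def has_pending_approval(result: dict) -> bool:
    """Hypothetical stand-in: any content typed as an approval request."""
    return any(c.get("type") == "function_approval_request" for c in result.get("contents", []))


async def run_loop(
    run_agent: Callable[[list], Any],
    should_continue: Callable[..., Any],
    *,
    max_iterations: Optional[int] = DEFAULT_MAX_ITERATIONS,
    next_message: Optional[Callable[..., Any]] = None,
) -> tuple:
    """Return (all iteration results, reason the loop stopped)."""
    messages = ["original task"]
    results = []
    iteration = 0
    while True:
        iteration += 1
        result = await maybe_await(run_agent(messages))
        results.append(result)
        # Approval escape hatch: checked before any stop condition.
        if has_pending_approval(result):
            return results, "approval"
        # Stop precedence: the cap first, then the predicate (assumed comparison).
        if max_iterations is not None and iteration >= max_iterations:
            return results, "max_iterations"
        verdict = await maybe_await(
            should_continue(iteration=iteration, last_result=result, messages=messages)
        )
        # A plain bool or a (bool, feedback) tuple are both accepted.
        keep_going, feedback = verdict if isinstance(verdict, tuple) else (verdict, None)
        if not keep_going:
            return results, "done"
        nudge = None
        if next_message is not None:
            nudge = await maybe_await(next_message(iteration=iteration, feedback=feedback))
        # A None nudge falls back to a short default "continue" message.
        messages = messages + [nudge or "continue"]


def fake_agent(messages: list) -> dict:
    """Hypothetical agent: reports how many messages it was given."""
    return {"text": f"reply to {len(messages)} message(s)", "contents": []}


def until_third(*, iteration: int, **kwargs: Any) -> tuple:
    """Declare only the kwargs you need plus **kwargs, as the notes suggest."""
    return iteration < 3, f"iteration {iteration} incomplete"
```

Calling `asyncio.run(run_loop(fake_agent, until_third))` runs three iterations and stops because the predicate returned `False`. Passing `max_iterations=None` removes the cap, matching the documented "explicit `None` → unbounded" behavior.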

### Workflows (`_workflows/`)

2 changes: 2 additions & 0 deletions python/packages/core/agent_framework/__init__.py
@@ -107,6 +107,7 @@
JudgeVerdict,
background_tasks_running,
todos_remaining,
todos_remaining_message,
)
from ._harness._memory import (
DEFAULT_MEMORY_SOURCE_ID,
@@ -598,6 +599,7 @@
"set_agent_mode",
"step",
"todos_remaining",
"todos_remaining_message",
"tool",
"tool_call_args_match",
"tool_called_check",
31 changes: 31 additions & 0 deletions python/packages/core/agent_framework/_harness/_agent.py
@@ -21,6 +21,7 @@
from .._sessions import ContextProvider, HistoryProvider, InMemoryHistoryProvider
from .._skills import SkillsProvider
from ._background_agents import BackgroundAgentsProvider
from ._loop import DEFAULT_MAX_ITERATIONS, AgentLoopMiddleware
from ._memory import MemoryContextProvider, MemoryStore
from ._mode import AgentModeProvider
from ._todo import TodoProvider
@@ -35,6 +36,7 @@
from .._compaction import CompactionStrategy, TokenizerProtocol
from .._middleware import MiddlewareTypes
from .._tools import ToolTypes
from ._loop import NextMessageCallable, ShouldContinueCallable
from ._tool_approval import ToolApprovalRuleCallback

logger = logging.getLogger(__name__)
@@ -254,6 +256,9 @@ def create_harness_agent(
disable_web_search: bool = False,
disable_tool_auto_approval: bool = False,
auto_approval_rules: Sequence[ToolApprovalRuleCallback] | None = None,
loop_should_continue: ShouldContinueCallable | None = None,
loop_next_message: NextMessageCallable | None = None,
loop_max_iterations: int | None = DEFAULT_MAX_ITERATIONS,
otel_provider_name: str | None = None,
context_providers: Sequence[ContextProvider] | None = None,
middleware: Sequence[MiddlewareTypes] | None = None,
@@ -273,6 +278,7 @@ def create_harness_agent(
- **BackgroundAgentsProvider** — delegate work to background sub-agents
- **Tool approval** — "don't ask again" standing approval rules plus heuristic
auto-approval callbacks
- **Looping** — re-run the agent for as long as a ``should_continue`` predicate returns ``True``
- **OpenTelemetry** — observability via ``AgentTelemetryLayer``

Each feature can be disabled or customized via keyword arguments.
@@ -380,6 +386,19 @@ def create_harness_agent(
content and returns ``True`` to approve it. Rules are evaluated after standing rules
(derived from prior user approvals) but before prompting the user. Only used when
``disable_tool_auto_approval`` is False.
loop_should_continue: Optional predicate that enables the looping middleware. When provided, the
agent is re-run in a loop (via :class:`~agent_framework.AgentLoopMiddleware`, wired as
the outermost middleware so each iteration is a full agent run including tool approval)
for as long as the predicate returns ``True``, up to ``loop_max_iterations``. If an
iteration returns a pending tool-approval request, the loop stops and returns it so the
caller can approve before continuing. When None (default), no loop is added.
loop_next_message: Optional callable controlling the input for the next loop iteration.
Only takes effect when ``loop_should_continue`` is set (otherwise no loop is added and
this is ignored).
loop_max_iterations: Safety cap on the number of loop iterations. ``None`` means unbounded;
a positive integer caps the loop (defaults to the loop middleware's default cap). Only
takes effect when ``loop_should_continue`` is set (otherwise no loop is added and this
is ignored).
otel_provider_name: Custom OpenTelemetry provider/source name for telemetry.
context_providers: Additional context providers to include after the built-in ones.
middleware: Additional middleware to include.
@@ -475,9 +494,21 @@ def create_harness_agent(
# placed first so it sits outermost: it intercepts inbound "always approve" responses and
# outbound approval requests at the caller boundary, and its re-invocation loop re-runs any
# user-supplied middleware. ToolApprovalMiddleware requires an AgentSession at run time.
# When loop_should_continue is supplied, the loop is prepended ahead of tool approval so it sits
# outermost of all: each loop iteration is a full agent run (including tool approval), and the
# loop's approval escape hatch returns any pending approval request to the caller.
assembled_middleware: list[MiddlewareTypes] = []
if not disable_tool_auto_approval:
assembled_middleware.append(ToolApprovalMiddleware(auto_approval_rules=auto_approval_rules))
if loop_should_continue is not None:
assembled_middleware.insert(
0,
AgentLoopMiddleware(
loop_should_continue,
max_iterations=loop_max_iterations,
next_message=loop_next_message,
),
)
if middleware:
assembled_middleware.extend(middleware)
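
The resulting middleware order can be checked in isolation. The sketch below mirrors the assembly logic in this hunk with hypothetical stand-ins (`FakeMiddleware` and `assemble` are not part of the package, and the real middleware classes are replaced by named placeholders), so it illustrates only the list ordering, not runtime behavior.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class FakeMiddleware:
    """Hypothetical placeholder for a real middleware instance."""

    name: str


def assemble(
    *,
    disable_tool_auto_approval: bool = False,
    loop_should_continue: Optional[Callable[..., object]] = None,
    middleware: Optional[Sequence[FakeMiddleware]] = None,
) -> list:
    """Rebuild the ordering: loop outermost, then tool approval, then user middleware."""
    assembled = []
    if not disable_tool_auto_approval:
        assembled.append(FakeMiddleware("tool_approval"))
    if loop_should_continue is not None:
        # insert(0, ...) puts the loop ahead of tool approval, i.e. outermost.
        assembled.insert(0, FakeMiddleware("loop"))
    if middleware:
        assembled.extend(middleware)
    return assembled


order = [
    m.name
    for m in assemble(
        loop_should_continue=lambda **kwargs: False,
        middleware=[FakeMiddleware("user")],
    )
]
print(order)
```

With a predicate supplied, the loop placeholder lands at index 0, ahead of tool approval and any user middleware. With no predicate, the list starts with tool approval as before.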
