fix(running-in-ci): trim CI-poll loops to fit Bash tool's 10-min cap by tend-agent · Pull Request #695 · max-sixty/tend

tend-agent · 2026-06-15T10:06:20Z

Problem

The bundled running-in-ci skill's foreground CI-poll recipe (and the structurally identical `gh run rerun --failed` rollup loop) iterated `for i in $(seq 1 15); do sleep 60; ...; done`, which takes at least 15 minutes of wall-clock time if CI doesn't finish early. The Claude Code Bash tool's maximum configurable timeout is 600000 ms (10 min); past that, the harness auto-backgrounds the call. The skill's own policy says background-completion notifications are not reliably delivered to a CI session, so the gated follow-up (dismiss approval on failure, post failure analysis) can't fire. The bot's sleep workarounds are then blocked by the harness's anti-polling guards, which ends the session before the dismissal runs.

#694 documents six occurrences in a single 24h window on max-sixty/worktrunk (six of ~20 token-bearing tend-review runs — ~30%). All six PRs merged cleanly so the loss-of-dismissal path was not exercised, but the failure mode is structural and would fail open on a future red-CI run.

This is the same root-cause class as #674 (triage's full-suite gate exceeding the 10-min cap) but in a different code path — the gated CI dismissal path on review.

Solution

Trim both loops from `seq 1 15` to `seq 1 9`. With 60 s sleeps per iteration plus the at-most-once 30 s grace re-check, the worst case is about 570 s (9 × 60 s + 30 s), roughly 30 s under the 600000 ms Bash cap, so the loop runs to completion and the gated follow-up fires.

Also added a one-line note that callers must invoke Bash with timeout: 600000 — the default 2-min Bash timeout would kill even the trimmed loop early.
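
The trimmed loop described above can be sketched as follows. This is a minimal illustration, not the skill's actual recipe: `ci_state` is a hypothetical stand-in for whatever the elided loop body checks, the return codes are invented for the example, and placing the single 30 s grace re-check after the loop is an assumption.

```shell
#!/usr/bin/env bash
# Sketch of a bounded foreground CI poll (illustrative, not the bundled recipe).
POLL_ITERATIONS="${POLL_ITERATIONS:-9}"   # trimmed from 15
POLL_INTERVAL="${POLL_INTERVAL:-60}"      # seconds slept per iteration
GRACE_SECS="${GRACE_SECS:-30}"            # single grace re-check (placement assumed)

# Hypothetical helper: prints "pending", "success", or "failure".
# A real recipe would query CI here; this stub always reports "pending".
ci_state() { echo "pending"; }

# Returns 0 if CI passed, 1 if it failed, 2 if still pending at the loop cap.
poll_ci() {
  local state i
  for i in $(seq 1 "$POLL_ITERATIONS"); do
    sleep "$POLL_INTERVAL"
    state="$(ci_state)"
    case "$state" in
      success) return 0 ;;
      failure) return 1 ;;
    esac
  done
  # One grace re-check before giving up.
  sleep "$GRACE_SECS"
  state="$(ci_state)"
  case "$state" in
    success) return 0 ;;
    failure) return 1 ;;
    *) return 2 ;;
  esac
}
```

With the defaults, `poll_ci` sleeps for at most 9 × 60 s + 30 s, so it only finishes in the foreground if the Bash call that runs it is invoked with `timeout: 600000` rather than the 2-min default.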

Updated the inline references that cited the old 15-minute figure ("Polling for it deadlocks until the 15-min cap breaks it" → "loop cap", "CI still running after 15 minutes" → "9 minutes", "up to ~15 minutes" → "up to ~9 minutes").

Testing

Skill text fix — no automated test. Verified by reading the recipe end-to-end: 9 iterations × 60s + one 30s grace + small gh/jq overhead per iteration stays under 600s. The reporter's evidence (six session log traces showing the auto-background sequence) is the reproduction.
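
The arithmetic behind that check can be reproduced with a few lines of shell. This is a sketch: `worst_case_secs` is a hypothetical helper, and the per-iteration overhead values are illustrative, not measured.

```shell
#!/usr/bin/env bash
# Worst-case wall clock of a poll loop, in seconds.
# Args: iterations, sleep interval, grace seconds, per-iteration overhead (default 0).
worst_case_secs() {
  local iterations="$1" interval="$2" grace="$3" overhead="${4:-0}"
  echo $(( iterations * (interval + overhead) + grace ))
}

BASH_CAP_SECS=600   # 600000 ms Bash tool cap

echo "old loop, sleeps only:      $(worst_case_secs 15 60 0)s"     # 900s
echo "trimmed loop plus grace:    $(worst_case_secs 9 60 30)s"     # 570s
echo "trimmed, 3s overhead each:  $(worst_case_secs 9 60 30 3)s"   # 597s
echo "trimmed, 4s overhead each:  $(worst_case_secs 9 60 30 4)s"   # 606s
echo "cap:                        ${BASH_CAP_SECS}s"
```

The 30 s of headroom spread over nine iterations is a little over 3 s each, so the trimmed loop stays under the cap only while the combined gh/jq work in one iteration averages less than that.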

Closes #694 — automated triage

The foreground CI-poll recipe and the rerun-jobs poll both ran `for i in $(seq 1 15); do sleep 60; ...; done`, which takes at least 15 minutes if CI doesn't finish early. The Claude Code Bash tool's max configurable timeout is 600000 ms (10 min), past which the harness auto-backgrounds the call and blocks foreground `sleep` waits. Once that happens the gated follow-up (dismiss approval on failure, post failure analysis) can't fire in-session, because background-completion notifications are not reliably delivered to a CI session, per the same skill's own policy.

Trim both loops to `seq 1 9` (about 9.5 minutes worst case, including the grace re-check) so the recipe fits inside the harness cap, and add a note that callers must invoke Bash with `timeout: 600000` (the default 2-min timeout would kill the loop early).

Closes #694

Co-Authored-By: Claude <noreply@anthropic.com>

tend-agent

Self-review of the trim. Math checks out (9 × 60s + at most one 30s grace ≈ 570s, comfortably under the 600s Bash cap) and the inline 15-min/15 minutes references are all updated. One gap: the new timeout: 600000 caller note was only added next to the CI Monitoring loop, not the structurally-identical gh run rerun --failed rollup loop below — see inline. A reader who jumps straight to the rerun recipe still hits the 2-min default Bash timeout.

Self-review caught that the `timeout: 600000` caller note was only added next to the CI Monitoring loop, not the structurally identical `gh run rerun --failed` rollup loop below. A reader following just the rerun recipe would still hit the default 2-min Bash timeout.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

tend-agent · 2026-06-16T21:54:25Z

Two more occurrences from a different repo, tonight on PRQL/prql. Both are interactive-harness (max-sixty/tend/interactive@0.1.4) reviews of open Dependabot PRs where the bot followed the bundled pending() recipe verbatim with no run_in_background flag (foreground intent, timeout: 1100000–1200000). Both sessions ended around the 12-minute mark with PTY tail 1 shell still running — the harness moved the Bash to its background tasks dir (/tmp/claude-1002/.../tasks/<id>.output) and the Stop hook fired.

| Run | PR | Wall | Final assistant text | Outcome |
| --- | --- | --- | --- | --- |
| 27643779196 | PRQL/prql#6013 (js-yaml 4.1.1 → 4.2.0) | 11m 52s | "CI poll is running in the background. I'll continue when it completes." (then `ScheduleWakeup(3600s)`, then `end_turn`) | No review |
| 27644108261 | PRQL/prql#6014 (Cargo patch group, 8 deps) | 12m 27s | "I'll wait for the CI poll to complete." (no wakeup, plain `end_turn`) | No review |

A third PR in the same Dependabot batch, PRQL/prql#6011 (insta-cmd 0.6.0 → 0.7.0), got a clean empty-body APPROVED review from the same session shape — its CI finished in ~6 min, well under the 10-min cap, so the loop returned and the verdict shipped.

Two notes on prioritization:

- The failure shape on these two is pre-APPROVE, not post-APPROVE. The issue body's documented failure was "if CI flips red after the auto-background, the dismiss-on-failure follow-up will silently not happen", a fail-open hypothetical. What happened on PRQL/prql#6013 / #6014 is more visible: when the bot decides "CI is in flight, wait then approve" (per running-in-ci's "Read Context" / pre-APPROVE peek), the same auto-background cuts the wait short and no review of any kind is posted. The PRs sit open with no bot verdict, which the maintainer overlay treats as not-blocking-but-missing-the-bot's-pre-merge-check.
- The ScheduleWakeup fallback (#6013) didn't recover. This confirms the issue body's "background-completion notifications aren't reliably delivered" point on the interactive harness specifically: even with the wakeup explicitly scheduled, the session terminated and never resumed before the 3600s delay would have fired.

The seq 1 9 trim in this PR addresses both shapes by landing the loop's exit inside one Bash call, so the session can ship its verdict regardless of CI duration up to ~9 min. The pre-existing self-review note about the gh run rerun --failed rollup loop's timeout needing to be set on the caller still holds.
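
Both shapes depend on the poll result arriving inside the same Bash call, so that the session can still branch on it. A minimal sketch of that branching follows. `follow_up_for` and the outcome names are hypothetical, and the echoed strings are placeholders for the actions named in this thread, not real commands from the skill.

```shell
#!/usr/bin/env bash
# Map a poll outcome to the follow-up the session should take (illustrative only).
follow_up_for() {
  case "$1" in
    success) echo "ship verdict" ;;
    failure) echo "dismiss approval and post failure analysis" ;;
    pending) echo "report CI still running after 9 minutes" ;;
    *)       echo "unknown outcome: $1" >&2; return 64 ;;
  esac
}

# Example: the outcome that the original 15-iteration loop could not act on.
follow_up_for failure
```

If the call is auto-backgrounded instead, none of these branches is reached in-session, which matches the "No review" outcomes in the table above.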

Recording these two as additional structural-confirmation evidence in the PRQL/prql review-reviewers gist — cumulative for this shape is now 3 on PRQL/prql alone, plus the 6 in the issue body on max-sixty/worktrunk.

#721 (base workflow-regen worktree on an open PR, not branch-ref existence) merged to main after this release branch was cut; it's a Fixed-scope change to the bundled nightly skill that adopters run, so it belongs in the 0.1.7 notes alongside #695.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

Bumps generator version to 0.1.7, syncs the lockfile, and adds the `## 0.1.7` CHANGELOG section (published verbatim as the GitHub Release notes on tag push). Changes since 0.1.6: pin actions/checkout to v7 with review opting into the fork-PR checkout guard (#725); bump claude-code to 2.1.185 (#719); running-in-ci surfaces blocking scope rules instead of routing around them (#717) and caps CI-poll loops to the Bash 10-min limit (#695); nightly workflow-regen bases its worktree on an open PR rather than branch-ref existence (#721); de-duplicate composite-action step bodies under shared/steps/ with harness-named action paths (#712); correct the codex effort list (#710); review-reviewers and worker-deploy doc fixes (#707, #711).

tend-agent mentioned this pull request Jun 15, 2026

running-in-ci: foreground CI poll's 15-min loop exceeds the Bash tool's 10-min cap, gated dismissal can't fire #694

Closed

tend-agent commented Jun 15, 2026

Comment thread plugins/tend-ci-runner/skills/running-in-ci/SKILL.md

tend-agent mentioned this pull request Jun 16, 2026

review-runs-tracking: 2026-06 #646

Open

max-sixty merged commit 6b03677 into main Jun 20, 2026
7 checks passed

max-sixty deleted the fix/issue-694 branch June 20, 2026 04:38

worktrunk-bot mentioned this pull request Jun 20, 2026

review-runs-tracking: 2026-06 max-sixty/worktrunk#2955

Open

prql-bot mentioned this pull request Jun 23, 2026

review-runs-tracking: 2026-06 PRQL/prql#5972

Open

max-sixty mentioned this pull request Jun 23, 2026

chore: release 0.1.7 #726

Merged

dormouse-bot mentioned this pull request Jun 24, 2026

chore: update tend workflows (0.1.6 → 0.1.7) diffplug/dormouse#172

Open

cargo-affected-bot mentioned this pull request Jun 24, 2026

chore: update tend workflows (0.1.6 → 0.1.7) max-sixty/cargo-affected#59

Open

worktrunk-bot mentioned this pull request Jun 24, 2026

chore: update tend workflows (0.1.6 → 0.1.7) max-sixty/worktrunk#3200

Open

prql-bot mentioned this pull request Jun 24, 2026

chore: update tend workflows (0.1.6 → 0.1.7) PRQL/prql#6033

Open

numbagg-bot mentioned this pull request Jun 24, 2026

chore: update tend workflows (0.1.6 → 0.1.7) numbagg/numbagg#664

Open

max-sixty merged 2 commits into main from fix/issue-694
