Skip to content

fix(server-v3): Prevent evicting in-use sessions mid-request#2286

Open
akeimach wants to merge 6 commits into
mainfrom
alyssamaruyama/stg-2424-prevent-long-running-agent-runs-from-ending-sessions-too
Open

fix(server-v3): Prevent evicting in-use sessions mid-request#2286
akeimach wants to merge 6 commits into
mainfrom
alyssamaruyama/stg-2424-prevent-long-running-agent-runs-from-ending-sessions-too

Conversation

@akeimach

@akeimach akeimach commented Jun 28, 2026

Copy link
Copy Markdown
Contributor

why

A long-running agent runs (60s+) could have their sessions ended early via LRU eviction or TTL cleanup. This surfaces as a "Stagehand session was closed" response on actions that have actually succeed.

The root cause was that the session store only refreshed a session's LRU/TTL at request start, which would cause long requests to reach eviction age while still running.

what changed

To fix, we can mark these long running sessions as "in use" to prevent it from expiring while in flight.

The new inUse is a refcount (not a boolean) because concurrent requests can share one session.

test plan

New tests/unit/sessionPinning.test.ts: eviction-skip under capacity, TTL-skip, concurrent holders + non-negative clamp, explicit-end-still-closes, and release-exactly-once-on-abort


Summary by cubic

Prevent in-use sessions from being evicted or TTL-expired mid-request, fixing intermittent “Stagehand session was closed” errors on long agent runs. Adds session pinning with a refcount and a withSession helper, and fixes capacity downsizing to evict all excess entries safely.

  • Bug Fixes

    • Added an inUse refcount; TTL cleanup, hasSession, and LRU eviction skip pinned sessions (eviction walks past in-use nodes). getOrCreateStagehand pins before lazy init() and unpins on failure; new releaseSession(sessionId) decrements (clamped at zero) and refreshes TTL, ignoring unmatched releases.
    • When lowering maxCapacity, evict before applying the new limit and do it sequentially so all excess entries are removed; pinned sessions are skipped, so the cache can briefly exceed capacity until requests finish. Updated streaming and session start to use withSession(...), which releases exactly once after the handler settles and logs release failures via console.error. Unit tests cover eviction/TTL skips, capacity downsizing, concurrent holders, unmatched releases, explicit end, release-once semantics, and release-failure logging.
  • Migration

    • Custom SessionStore implementations must add releaseSession(sessionId) and follow the acquire/release contract.
    • Use withSession(...) instead of manually holding a V3 instance; do not release on client disconnect.

Written for commit 9edfcdb. Summary will update on new commits.

Review in cubic

@changeset-bot

changeset-bot Bot commented Jun 28, 2026

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 9edfcdb

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@browserbasehq/stagehand-server-v3 Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 6 files

Confidence score: 2/5

  • In packages/server-v3/src/lib/withSession.ts, the session can be released before fn completes, so an in-flight handler may keep using a Stagehand instance that has already become evictable, leading to flaky request failures under load — gate session close on true abort and ensure the handler is canceled/finished before release.
  • In packages/server-v3/src/lib/InMemorySessionStore.ts, the lazy V3 init path awaits stagehand.init() before the session is pinned, which creates a race where the session can be evicted during initialization and return/leak an untracked instance — pin first, then run init (or otherwise hold a non-evictable reference through init) before merging.
  • In packages/server-v3/src/lib/InMemorySessionStore.ts, releaseSession() refreshes TTL even for unmatched/double releases, which can keep idle sessions alive longer than intended and distort eviction behavior over time — only refresh TTL when a corresponding pin was actually released.
Architecture diagram
sequenceDiagram
    participant Client as FastifyRequest
    participant WS as withSession
    participant Store as InMemorySessionStore
    participant Evict as LRU Eviction (internal)
    participant TTL as TTL Cleanup (internal)
    participant V3 as V3Instance

    Note over Client,V3: Request handling with session pinning

    Client->>WS: request (sessionId, ctx, fn)
    WS->>Store: getOrCreateStagehand(sessionId, ctx)
    Note over Store: Increment node.inUse (pin)
    Store-->>WS: stagehand (V3)
    WS->>WS: attach 'close' listener on request.raw
    WS->>V3: fn(stagehand) – long-running agent logic
    Note over V3: Agent can take 60s+

    alt Client aborts mid-handler
        Client-->>WS: abort / socket close
        WS->>Store: releaseSession(sessionId)
        Note over Store: Decrement inUse (clamp ≥0)<br/>Refresh TTL
        Store-->>WS: done
        WS->>WS: released flag set
    else Handler completes normally
        V3-->>WS: result
        WS->>WS: finally block
        WS->>Store: releaseSession(sessionId)
        Note over Store: Decrement inUse (clamp ≥0)<br/>Refresh TTL
        Store-->>WS: done
    end
    WS-->>Client: response

    Note over Client,V3: Background eviction/expiry (respects pinning)

    loop LRU eviction (on capacity exceeded)
        Evict->>Store: find LRU node
        Store->>Store: walk from first, skip inUse > 0
        alt found non‑in‑use node
            Store->>Evict: evict that node
            Evict->>Store: deleteSession
        else all nodes in use
            Evict-->>Store: skip eviction
        end
    end

    loop TTL cleanup (periodic)
        TTL->>Store: for each node
        Store->>Store: check expiry <= now AND inUse === 0
        alt expired and not in use
            Store->>Store: deleteSession
        else in use
            Store-->>TTL: skip
        end
    end

    rect rgb(240, 240, 240)
    Note over Client,V3: Explicit endSession overrides pinning
    Client->>Store: endSession(sessionId)
    Store->>V3: close()
    Store->>Store: remove node (ignores inUse)
    end
Loading

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread packages/server-v3/src/lib/InMemorySessionStore.ts Outdated
Comment thread packages/server-v3/src/lib/withSession.ts Outdated
Comment thread packages/server-v3/src/lib/InMemorySessionStore.ts Outdated
@akeimach akeimach force-pushed the alyssamaruyama/stg-2424-prevent-long-running-agent-runs-from-ending-sessions-too branch from e156c37 to f3cb355 Compare June 29, 2026 19:19

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 7 files

Confidence score: 3/5

  • In packages/server-v3/src/lib/withSession.ts, releaseSession errors are currently swallowed, so pinned sessions can remain marked inUse and gradually leak capacity under load; this could cause harder-to-diagnose performance degradation or session exhaustion after merge — log/emit release failures (and ideally fail or retry release paths) before merging so leaks are detectable.
Architecture diagram
sequenceDiagram
    participant Client as API Client
    participant Handler as Request Handler (stream.ts)
    participant WithSession as withSession()
    participant Store as InMemorySessionStore
    participant V3 as V3 Stagehand
    participant LRU as LRU Eviction
    participant TTL as TTL Cleanup

    Note over Client,TTL: NEW: Session Pinning – prevent eviction/expiry while in-flight

    Client->>Handler: HTTP Request (sse/start)
    Handler->>WithSession: call withSession(sessionId, ctx, handlerFn)
    WithSession->>Store: getOrCreateStagehand(sessionId, ctx)
    alt Session not found / expired
        Store-->>WithSession: throw
        WithSession-->>Handler: error
    else Session exists
        Store->>Store: node.inUse += 1    NEW: pin
        opt Lazy initialization (V3 not yet created)
            Store->>V3: new Stagehand()
            Store->>V3: init()
            alt init fails
                Store->>Store: node.inUse -= 1    NEW: undo pin
                Store->>V3: close()
                Store-->>WithSession: throw
            end
        end
        Store-->>WithSession: V3 instance
    end
    WithSession->>WithSession: call handlerFn(stagehand)

    Note over WithSession,TTL: Handler runs – session is pinned

    TTL->>Store: cleanupExpired() (periodic)
    Store->>Store: check node.expiry <= now AND node.inUse === 0
    alt node.inUse > 0
        Store-->>TTL: skip expiry (pinned)    NEW
    else
        Store-->>TTL: delete expired session
    end

    LRU->>Store: evictLru() (on capacity)
    Store->>Store: walk from first, skip nodes with inUse > 0
    alt no evictable node found
        Store-->>LRU: no-op    NEW: skip eviction
    else
        Store->>Store: deleteSession(evictedNode)
    end

    Handler->>WithSession: handlerFn settles (success or error)
    WithSession->>Store: releaseSession(sessionId)    NEW
    Store->>Store: if node.inUse > 0 then node.inUse -= 1 and refresh expiry
    alt multiple concurrent holders
        Store->>Store: inUse still > 0 – remains pinned
    else last holder releases
        Store->>Store: session now eligible for eviction/expiry
    end
    opt unmatched release (inUse already 0)
        Store->>Store: no-op (clamp at 0, no TTL refresh)    NEW
    end
    WithSession-->>Handler: result (propagates error)
    Handler-->>Client: Response (streamed or JSON)

    Note over Client,TTL: endSession bypasses pin
    alt Explicit endSession
        Client->>Handler: (end session request)
        Handler->>Store: endSession(sessionId)
        Store->>Store: delete node regardless of inUse
        Store->>V3: close()
        Store-->>Handler: done
    end
Loading

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread packages/server-v3/src/lib/withSession.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issues found across 7 files

Confidence score: 5/5

  • Automated review surfaced no issues in the provided summaries.
  • No files require special attention.
Architecture diagram
sequenceDiagram
    participant Client as HTTP Client
    participant API as API Route Handler
    participant withS as withSession Wrapper
    participant Store as InMemorySessionStore
    participant Stagehand as V3 Stagehand Instance
    participant Cleanup as TTL/LRU Cleanup

    Note over Client,Cleanup: Session Lifecycle with Pin/Release

    Client->>API: POST /start or streaming request
    API->>withS: withSession(sessionId, ctx, handler)
    
    withS->>Store: getOrCreateStagehand(sessionId, ctx)
    Store->>Store: Lookup node in LRU map
    
    alt Session not found or expired
        Store-->>withS: throw Error
        withS-->>API: Error propagates
        API-->>Client: Error response
    else Session found
        Store->>Store: node.inUse += 1 (PIN)
        Store->>Stagehand: Lazy init() if null
        alt init() fails
            Stagehand-->>Store: Error
            Store->>Store: node.inUse -= 1 (unpin)
            Note over Store: Node stays in cache for retry
            Store-->>withS: throw Error
            withS-->>API: Error propagates
            API-->>Client: Error response
        else init() succeeds
            Stagehand-->>Store: V3 instance ready
            Store-->>withS: Return V3 instance
            withS->>withS: Run handler(stagehand)
            
            Note over withS: Handler runs (potentially 60s+)<br/>Session remains pinned
            
            alt Handler succeeds
                withS->>Store: releaseSession(sessionId) in finally
                Store->>Store: node.inUse -= 1 (if > 0)
                Store->>Store: Refresh TTL expiry
                Store-->>withS: OK
                withS-->>API: Handler result
                API-->>Client: Success response
            else Handler throws
                withS->>Store: releaseSession(sessionId) in finally
                Store->>Store: node.inUse -= 1
                Store->>Store: Refresh TTL expiry
                Store-->>withS: OK (or error logged)
                withS-->>API: Error propagates
                API-->>Client: Error response
            end
            
            opt Release failure
                Note over withS,Store: console.error() records leak<br/>without clobbering handler result
            end
        end
    end

    Note over Store: Concurrent requests share session<br/>inUse is a refcount (not boolean)

    par Concurrent request
        Client->>API: Another request for same session
        API->>withS: withSession(same sessionId, ctx, handler)
        withS->>Store: getOrCreateStagehand(sessionId, ctx)
        Store->>Store: node.inUse += 1 (now 2)
        Store-->>withS: Return existing V3 instance
        Note over withS,Store: Both requests run concurrently<br/>Session stays pinned until both release
    end

    Note over Store: TTL and LRU checks skip pinned sessions

    loop Periodic / On-insert cleanup
        Cleanup->>Store: cleanupExpired()
        Store->>Store: Iterate items
        alt node.inUse > 0
            Store->>Store: Skip - session is pinned
        else node.inUse === 0 && expired
            Store->>Store: Delete session
        end
    end

    loop On insert exceeding capacity
        Cleanup->>Store: evictLru()
        Store->>Store: Walk from least-recently-used
        alt node.inUse > 0
            Store->>Store: Skip, move to next
        else node.inUse === 0
            Store->>Store: Evict this session
        end
        Note over Store: If all sessions pinned, skip eviction
    end

    Note over Client,Cleanup: Explicit endSession bypasses pinning

    Client->>API: endSession(sessionId)
    API->>Store: endSession(sessionId)
    Store->>Store: Close browser, delete node
    Note over Store: endSession works even when inUse > 0<br/>(forceful cleanup, not regular eviction)
    Store-->>API: OK
    API-->>Client: Success

    Note over Store: Unmatched release is a no-op
    Store->>Store: releaseSession() with inUse === 0
    Store->>Store: Return immediately, no TTL refresh
Loading

Re-trigger cubic

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 1 file (changes from recent commits).

Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic

Comment thread packages/server-v3/src/lib/InMemorySessionStore.ts Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 issue found across 2 files (changes from recent commits).

Tip: Review your code locally with the cubic CLI to iterate faster.

Re-trigger cubic

Comment thread packages/server-v3/tests/unit/sessionPinning.test.ts Outdated
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant