Skip to content

Add one-time versioning override#10763

Open
Shivs11 wants to merge 4 commits into
mainfrom
ss/one-time-move-versioning-override
Open

Add one-time versioning override#10763
Shivs11 wants to merge 4 commits into
mainfrom
ss/one-time-move-versioning-override

Conversation

@Shivs11

@Shivs11 Shivs11 commented Jun 18, 2026

Copy link
Copy Markdown
Member

Summary

  • add one-time versioning override routing state
  • clear one-time override after a successful WFT on the target deployment version
  • carry pending one-time override through CAN, child, retry, and cron paths using existing override inheritance shape
  • update server to the API-Go branch containing the new proto type

Tests

  • go test -tags test_dep ./common/worker_versioning -count=1
  • go test -tags test_dep ./service/history/api/updateworkflowoptions -count=1
  • go test -tags test_dep ./service/history/workflow -run TestMutableStateSuite -count=1
  • go test -tags test_dep ./service/history -run TestTransferQueueActiveTaskExecutorSuite -count=1

Functional tests are intentionally deferred for this draft.


Note

Medium Risk
Changes workflow versioning routing, override inheritance on child/CAN, and mutable-state lifecycle in history—important execution paths with broad test coverage but non-trivial edge cases (races with in-flight WFTs).

Overview
Adds one-time versioning override so operators can route the next workflow task to a target deployment version without permanently pinning the run.

A pending one-time override is treated like a temporary pin for effective deployment/behavior, search attributes, task-queue membership validation, and version reactivation signals. UpdateWorkflowExecutionOptions field masks now accept v0.32 pinned, auto_upgrade, and one_time (including nested target version fields). The override clears when a workflow task completes on the target version; completion on a different version leaves it pending.

Child workflows inherit pending one-time (and pinned) overrides when the target version owns the child task queue; Continue-as-new and reset/reapply paths are covered in tests. go.temporal.io/api is bumped for the new proto oneof.

Reviewed by Cursor Bugbot for commit 6d94c7b. Bugbot is set up for automated code reviews on this repo. Configure here.

@Shivs11 Shivs11 force-pushed the ss/one-time-move-versioning-override branch from 5b4a1a6 to 2380204 Compare June 18, 2026 11:51
var inheritedPinnedOverride *workflowpb.VersioningOverride
if o := mutableState.GetExecutionInfo().GetVersioningInfo().GetVersioningOverride(); worker_versioning.OverrideIsPinned(o) {
inheritedPinnedOverride = o
// Pinned and one-time overrides are inherited if Task Queue of new run is compatible with the override version.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the reason why the child workflow path is receiving this special treatment but none of CAN/retry/cron paths are is because of the following idea:

  • if an operator has a "Started" WFT and were to run a VersioningOverride (One Time Move) operation, and the said WFT returns a command to CAN/retry/cron, we reject those commands since we have a buffered history event thanks to that pending one time move command. Thus, for these three primitives, we shall always only have the case where the task finishes and the pending one time move is cleared, after which we handle commands (i have added versioning tests that are validating this)

  • on the other hand, the child path is a bit different in the sense that we don't reject this start child command even if we do have buffered events.

@Shivs11 Shivs11 force-pushed the ss/one-time-move-versioning-override branch from 2380204 to 389c921 Compare June 18, 2026 12:11
mergeInto.VersioningOverride = mergeFrom.GetVersioningOverride()
}

if _, ok := updateFields["versioningOverride.pinned"]; ok {

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is something which i just over thought while working on this, or something we missed while we wrote update workflow options?

question: why don't we have these mask options for all the paths for api's that were added in the v0.32 phase? cc - @carlydf

have added them here, but could be totally off hence the q

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still working on the rest of my review, but here is one comment I had pending from when I started the review last week!

I think we should only allow writes to the full versioningOverride (no paths that go beyond versioningOverride).

Before there was a oneof, the relevant fields were behavior and deployment (deprecated v0.30 versioning with Deployment Series). As you can see above, I required if any longer paths were specified, all should be specified. We could follow that pattern and accept longer paths but only if all long paths are sent, or we could reject anything longer than versioningOverride. I think the second option is simpler to reason about, especially with the oneof complexity.

Thoughts?

@Shivs11 Shivs11 marked this pull request as ready for review June 18, 2026 13:32
@Shivs11 Shivs11 requested review from a team as code owners June 18, 2026 13:33
Comment thread service/history/workflow/mutable_state_impl.go Outdated

@cursor cursor Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Fix All in Cursor

Reviewed by Cursor Bugbot for commit 6d94c7b. Configure here.

// new deployment
// 2. VersioningOverride.Deployment: this is returned when user has set a PINNED override
// at wf start time, or later via UpdateWorkflowExecutionOptions.
// 2. VersioningOverride target: this is returned when user has set a PINNED override or

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One-time override skips deployment name

Low Severity

When applying a one-time versioning override, updateVersioningOverride updates WorkerDeploymentName only inside the OverrideIsPinned branch. One-time overrides are not pinned in that sense, so executionInfo.WorkerDeploymentName can stay on the pre-override deployment until a later workflow task completion refreshes it.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 6d94c7b. Configure here.

@rkannan82 rkannan82 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will review tests tomorrow.

versioningInfo.DeploymentVersion = worker_versioning.ExternalWorkerDeploymentVersionFromDeployment(wftDeployment)
}
if clearOneTimeOverrideAfterCompletedWFT(versioningInfo, wftDeployment) {
versioningInfo.VersioningOverride = nil

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest adding a log here indicating the move has been fulfilled. (Since this is not a common operation, won't spam)

}

func GetOverrideTargetDeploymentVersion(override *workflowpb.VersioningOverride) *deploymentpb.WorkerDeploymentVersion {
if OverrideIsPinned(override) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we use switch case here, and return nil for auto upgrade case? This tells for auto upgrade the actual version is chosen at task dispatch time.

if override.GetAutoUpgrade() {
return enumspb.VERSIONING_BEHAVIOR_AUTO_UPGRADE
} else if override.GetPinned() != nil {
} else if override.GetPinned() != nil || override.GetOneTime() != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest a comment here stating ... this option is treated similar to pinned, except the override is automatically removed after the move has been fulfilled.

versioningInfo.Version = worker_versioning.WorkerDeploymentVersionToStringV31(worker_versioning.DeploymentVersionFromDeployment(wftDeployment))
versioningInfo.DeploymentVersion = worker_versioning.ExternalWorkerDeploymentVersionFromDeployment(wftDeployment)
}
if clearOneTimeOverrideAfterCompletedWFT(versioningInfo, wftDeployment) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add a comment: Must be cleared before computing effective deployment so that the worker's configured versioning behavior kicks in.

return nil
}

func clearOneTimeOverrideAfterCompletedWFT(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Suggest renaming this to: wftCompletedOnVersion
    It is answering -- did the wf task complete on the target version

  2. I would also suggest passing the target version as input and make the caller invoke it only if the override is set. Then the caller code reads naturally. Currently, the clearing semantics is split between caller and this function.
    if (one_time_override_is_set && wftCompletedOnVersion(override_version)) {
    clear the override
    }

} else if override := versioningInfo.GetVersioningOverride(); override != nil {
if override.GetAutoUpgrade() || override.GetPinned() != nil { // v0.32 override behavior
if override.GetAutoUpgrade() || override.GetPinned() != nil || override.GetOneTime() != nil { // v0.32 override behavior
if override.GetAutoUpgrade() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we replace this with call to ExtractVersioningBehaviorFromOverride?

return transition.GetDeployment()
} else if override := versioningInfo.GetVersioningOverride(); override != nil &&
(override.GetBehavior() == enumspb.VERSIONING_BEHAVIOR_PINNED || //nolint:staticcheck // SA1019: worker versioning v0.31 and v0.30
(worker_versioning.GetOverrideOneTimeTargetVersion(override) != nil ||

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just use GetOverrideTargetDeploymentVersion introduced in worker_versioning.go to handle both pinned and one time?

//nolint:staticcheck // SA1019: worker versioning v0.31
if vs := override.GetPinnedVersion(); vs != "" {
v, _ := worker_versioning.WorkerDeploymentVersionFromStringV31(vs)
if override := versioningInfo.GetVersioningOverride(); override != nil {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this cleanup safe?

// If the continue-as-new command says to use InitialVersioningBehavior AutoUpgrade, the new run will start as
// AutoUpgrade in the first task and then assume the SDK-sent behavior on first workflow task completion.
var inheritedPinnedVersion *deploymentpb.WorkerDeploymentVersion
if previousExecutionState.GetEffectiveVersioningBehavior() == enumspb.VERSIONING_BEHAVIOR_PINNED &&

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are not handling the new enum here --- but this seems right because a one time override is already cleared before this function is called. Is that right?

In that case, let's add an else if for onetime case and log warning inside this branch that it should never happen?

@rkannan82 rkannan82 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR description is not telling how the different changes fit together and work. The auto generated bugbot captures some of it; I would suggest taking that and moving it to the summary section and revising it.

For example, the key thing that this PR does is:

  • When this one time override is set, it routes the next WFT to a target deployment version, then clears itself and lets the workflow's own versioning behavior take over.
  • Child workflows inherit this one time override ...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants