fix: clean up deleted jobs from orchestrator during scenario deletion #2884

Open
PrathamRanka wants to merge 1 commit into Avaiga:develop from PrathamRanka:bug/#999-scenario-deletion-job-cleanup

Conversation

@PrathamRanka

What type of PR is this? (Check all that apply)

  • 🐛 Bug Fix

Description

Deleting a scenario while jobs are still running or blocked could leave stale job references in the orchestrator. When a remaining job later completed, the orchestrator attempted to unblock dependent jobs that had already been deleted along with the scenario.

Because the deleted jobs' input data nodes no longer existed, dependency resolution could retrieve None from the data manager and subsequently raise:

AttributeError: 'NoneType' object has no attribute 'is_ready_for_reading'
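
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the Taipy implementation: `DataNode`, `DataManager`, and `is_blocked` are hypothetical stand-ins that only assume what the description states, namely that a lookup for a deleted data node yields `None` and that the caller reads `is_ready_for_reading` without a guard.

```python
from typing import Dict, List, Optional


class DataNode:
    """Toy stand-in for a data node; not the real Taipy class."""

    def __init__(self, ready: bool = True) -> None:
        self.is_ready_for_reading = ready


class DataManager:
    """Toy repository (hypothetical): returns None for IDs that were deleted."""

    def __init__(self) -> None:
        self._nodes: Dict[str, DataNode] = {}

    def set(self, node_id: str, node: DataNode) -> None:
        self._nodes[node_id] = node

    def get(self, node_id: str) -> Optional[DataNode]:
        return self._nodes.get(node_id)

    def delete(self, node_id: str) -> None:
        self._nodes.pop(node_id, None)


def is_blocked(input_ids: List[str], manager: DataManager) -> bool:
    # Unguarded pattern: assumes every lookup returns a data node.
    return any(not manager.get(i).is_ready_for_reading for i in input_ids)


manager = DataManager()
manager.set("dn_1", DataNode(ready=True))
manager.delete("dn_1")  # scenario deletion removes the data node

try:
    # A stale job still lists "dn_1" as an input.
    is_blocked(["dn_1"], manager)
except AttributeError as exc:
    error_message = str(exc)
```

Guarding the lookup would hide the symptom, but the stale job would still sit in the scheduler, which is why the PR removes the job reference itself.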

This PR ensures that deleted jobs are removed from the orchestrator's internal scheduling queues, preventing stale job references from being processed after scenario deletion.

Related Tickets & Documents

How to reproduce the issue

  1. Create two scenarios sharing the same workflow.
  2. Submit both scenarios.
  3. Delete one scenario while jobs are still running or blocked.
  4. Allow the remaining jobs to complete.
  5. Observe the orchestrator attempting to unblock deleted jobs and crashing with:
     AttributeError: 'NoneType' object has no attribute 'is_ready_for_reading'

The added regression test reproduces this workflow and verifies the fix.

Changes

_orchestrator.py

  • Added _remove_jobs(), which removes deleted job IDs from the orchestrator's scheduling queues.
  • The method cleans up the blocked_jobs and jobs_to_run entries associated with deleted jobs.
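
As a rough illustration of this kind of cleanup, the sketch below filters both structures by job ID. It is not the code from this PR: `ToyOrchestrator`, `Job`, and the public `remove_jobs` name are hypothetical, and the choice of a list for `blocked_jobs`, a `deque` for `jobs_to_run`, and a single lock is an assumption made to keep the example self-contained.

```python
import threading
from collections import deque
from typing import Deque, Iterable, List


class Job:
    """Toy job carrying only an ID; not the real Taipy Job."""

    def __init__(self, job_id: str) -> None:
        self.id = job_id


class ToyOrchestrator:
    """Hypothetical orchestrator holding two scheduling structures."""

    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.blocked_jobs: List[Job] = []
        self.jobs_to_run: Deque[Job] = deque()

    def remove_jobs(self, job_ids: Iterable[str]) -> None:
        deleted = set(job_ids)
        if not deleted:
            return
        # Hold the lock so another thread never sees a half-filtered queue.
        with self.lock:
            self.blocked_jobs = [j for j in self.blocked_jobs if j.id not in deleted]
            self.jobs_to_run = deque(j for j in self.jobs_to_run if j.id not in deleted)


orchestrator = ToyOrchestrator()
orchestrator.blocked_jobs = [Job("job_a"), Job("job_b")]
orchestrator.jobs_to_run.extend([Job("job_b"), Job("job_c")])

orchestrator.remove_jobs(["job_b"])
remaining_blocked = [j.id for j in orchestrator.blocked_jobs]
remaining_to_run = [j.id for j in orchestrator.jobs_to_run]
```

Filtering by ID rather than by object identity means the cleanup still works if the caller only has the IDs of the deleted jobs, and unknown IDs are ignored.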

_job_manager.py

  • Updated _delete() to notify the orchestrator when jobs are deleted.
  • Added _delete_many() support to perform the same cleanup for bulk deletions.
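
One way to wire the notification is to route single deletions through the bulk path so that both share the same cleanup. The sketch below assumes exactly that; `ToyJobManager`, `RecordingOrchestrator`, and their method names are hypothetical and do not reflect the actual signatures in `_job_manager.py`.

```python
from typing import Dict, Iterable, Set


class RecordingOrchestrator:
    """Hypothetical stub that records which job IDs it was asked to drop."""

    def __init__(self) -> None:
        self.removed: Set[str] = set()

    def remove_jobs(self, job_ids: Iterable[str]) -> None:
        self.removed.update(job_ids)


class ToyJobManager:
    """Toy job store that notifies the orchestrator on every deletion."""

    def __init__(self, orchestrator: RecordingOrchestrator) -> None:
        self._orchestrator = orchestrator
        self._jobs: Dict[str, dict] = {}

    def create(self, job_id: str) -> None:
        self._jobs[job_id] = {"id": job_id}

    def delete(self, job_id: str) -> None:
        # Single deletion reuses the bulk path, so cleanup logic lives in one place.
        self.delete_many([job_id])

    def delete_many(self, job_ids: Iterable[str]) -> None:
        ids = list(job_ids)
        # Assumed ordering: drop scheduler references before removing stored jobs,
        # so the scheduler never points at a job that is already gone.
        self._orchestrator.remove_jobs(ids)
        for job_id in ids:
            self._jobs.pop(job_id, None)


orchestrator = RecordingOrchestrator()
manager = ToyJobManager(orchestrator)
for name in ("job_1", "job_2", "job_3"):
    manager.create(name)

manager.delete("job_1")
manager.delete_many(["job_2", "job_3"])
```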

test_scenario_manager.py

  • Added test_hard_delete_scenario_while_jobs_are_blocked.
  • Verifies that scenario deletion during execution does not leave stale orchestrator references.
  • Confirms no crash occurs when remaining jobs complete.
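
The shape of such a regression test can be sketched against a toy model. Everything below is hypothetical and unrelated to the real test fixtures: the toy orchestrator tracks readiness in a dict, so a blocked job pointing at a deleted data node raises `KeyError` here, standing in for the `AttributeError` seen in Taipy.

```python
from typing import Dict, List


class ToyOrchestrator:
    """Hypothetical scheduler used only to illustrate the test scenario."""

    def __init__(self) -> None:
        # Data node ID -> readiness flag; a deleted node is simply absent.
        self.data_ready: Dict[str, bool] = {}
        # Blocked job ID -> IDs of the input data nodes it waits on.
        self.blocked_jobs: Dict[str, List[str]] = {}
        self.jobs_to_run: List[str] = []

    def remove_jobs(self, job_ids: List[str]) -> None:
        for job_id in job_ids:
            self.blocked_jobs.pop(job_id, None)
        self.jobs_to_run = [j for j in self.jobs_to_run if j not in job_ids]

    def on_job_completed(self, output_id: str) -> None:
        self.data_ready[output_id] = True
        for job_id, inputs in list(self.blocked_jobs.items()):
            # Fails with KeyError if a stale job refers to a deleted data node.
            if all(self.data_ready[i] for i in inputs):
                del self.blocked_jobs[job_id]
                self.jobs_to_run.append(job_id)


def make_orchestrator() -> ToyOrchestrator:
    orch = ToyOrchestrator()
    orch.data_ready = {"a_out": False, "b_out": False}
    orch.blocked_jobs = {"a_job2": ["a_out"], "b_job2": ["b_out"]}
    return orch


def test_delete_scenario_while_jobs_are_blocked() -> None:
    orch = make_orchestrator()

    # Delete scenario B: its data node and its blocked job both go away.
    del orch.data_ready["b_out"]
    orch.remove_jobs(["b_job2"])

    # The remaining scenario finishes its first job without crashing.
    orch.on_job_completed("a_out")
    assert orch.blocked_jobs == {}
    assert orch.jobs_to_run == ["a_job2"]


test_delete_scenario_while_jobs_are_blocked()
```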

Backporting

This change should be backported to:

  • 3.0
  • 3.1
  • 4.0
  • develop

Checklist

  • ✅ This solution meets the acceptance criteria of the related issue.
  • 📝 The related issue checklist is completed.
  • 🧪 This PR includes unit tests for the developed code.
  • 🔄 End-to-End tests have been added or updated.

Reason: The issue is fully covered by a regression/unit test exercising the orchestrator and scenario deletion workflow.

  • 📚 The documentation has been updated, or a dedicated issue has been created.

Reason: Internal bug fix with no user-facing API or behavior changes requiring documentation updates.

  • 📌 The release notes have been updated.

Reason: Small internal bug fix; release note inclusion can be decided by maintainers.

Development

Successfully merging this pull request may close these issues.

[🐛 BUG] Deleting a scenario with a running submission blocks all other submissions