[dagster-graphql] Tolerate deleted partitions in historical automation evaluations#33939
[dagster-graphql] Tolerate deleted partitions in historical automation evaluations#33939Vamsi-klu wants to merge 1 commit into
Conversation
Greptile SummaryThis PR adds two small defensive helpers to the GraphQL schema layer that catch
Confidence Score: 5/5The change is narrowly scoped to the error-handling path for stale historical subsets; all non-stale call-sites are unaffected and the helpers introduce no new state. Both helpers correctly catch the exception raised by get_partition_keys_in_range when a dynamic partition no longer exists (DagsterInvalidInvocationError), confirmed by reading partitions_definition.py line 109. All six call-sites in the resolver are updated consistently, and the regression tests exercise both the size-access and membership-check paths end-to-end. No files require special attention.
|
| Filename | Overview |
|---|---|
| python_modules/dagster-graphql/dagster_graphql/schema/asset_condition_evaluations.py | Adds two fallback helpers for stale historical partition subsets and replaces all direct .size / contains call-sites; logic is correct for the deleted-partition case. |
| python_modules/dagster-graphql/dagster_graphql_tests/graphql/test_asset_condition_evaluations.py | Adds two regression tests covering the deleted-partition error path via mock; helper _get_key_range_condition_evaluation correctly builds a KeyRangesPartitionsSubset-backed evaluation record. |
Reviews (2): Last reviewed commit: "[dagster-graphql] Tolerate stale partiti..." | Re-trigger Greptile
| def _get_historical_subset_size(subset: SerializableEntitySubset) -> int: | ||
| try: | ||
| return subset.size | ||
| except DagsterInvalidInvocationError: | ||
| return 0 |
There was a problem hiding this comment.
Narrow exception scope may miss the loading-context error
KeyRangesPartitionsSubset.__len__ and __contains__ are both decorated with @require_full_partition_loading_context, which calls check.invariant(...) and raises DagsterInvariantViolationError — not DagsterInvalidInvocationError — when no partition loading context is active. If a GraphQL code path ever reaches these helpers without a context (e.g. in a test that bypasses the normal request setup), the DagsterInvariantViolationError would propagate unhandled and re-expose the same 500 crash. In normal production paths the context is always set, so this is low risk today, but worth documenting or catching the broader DagsterError base class if hardening is desired.
| assert not results.errors | ||
| evaluation = results.data["assetConditionEvaluationForPartition"] | ||
| assert evaluation["evaluationNodes"][0]["status"] == "FALSE" |
There was a problem hiding this comment.
"FALSE" status assertion may surprise future readers
When __contains__ is patched to throw for true_subset, the code falls through to check candidate_subset. Because candidate_subset is HistoricalAllPartitionsSubsetSentinel() — which is not a SerializableEntitySubset — the not isinstance(...) branch fires and yields FALSE. This is correct given the current logic, but it means a deleted partition shows up as FALSE (evaluated-and-rejected) rather than SKIPPED (not evaluated). A brief comment in the test explaining why FALSE (not SKIPPED) is expected would help future maintainers understand the assertion isn't a mistake.
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
acbbf02 to
5fed730
Compare
Summary & Motivation
When a partition is manually removed (e.g. a dynamic partition is deleted), opening the Automation tab of a downstream asset raises a GraphQL error and the tab fails to load (#33910).
Historical automation-condition evaluation records store partition subsets that were captured at evaluation time. After a partition is deleted, materializing those stored subsets —
subset.sizeandpartition_key in subset.subset_value— raisesDagsterInvalidInvocationErrorbecause the partition no longer exists in the partitions definition. The GraphQL resolvers inasset_condition_evaluations.pycall those properties directly while buildingGrapheneAssetConditionEvaluation/ per-partition nodes, so the error propagates out of the query and breaks the whole tab.Fix
Wrap the two access patterns in small helpers that tolerate stale historical subsets:
_get_historical_subset_size(subset)returnssubset.size, falling back to0if the subset can no longer be materialized._historical_subset_contains_partition(subset, key)returnssubset.is_partitioned and key in subset.subset_value, falling back toFalse.A deleted partition is therefore treated as "no longer true / not a candidate" instead of crashing the query. This is a no-op for all non-stale subsets — behaviour only changes on the error path that previously 500'd.
Test Plan
Added two regression tests in
dagster_graphql_tests/graphql/test_asset_condition_evaluations.pythat build aKeyRangesPartitionsSubsetand patch its__len__/__contains__to raiseDagsterInvalidInvocationError(the deleted-partition shape), then assert the records query and the per-partition query both resolve withnumRequested/numTrue == 0instead of erroring.Verified locally: both tests pass on the sqlite GraphQL context variants (
4 passed, 4 skipped— postgres/code-server variants skip on local Docker/gRPC constraints).ruff(pinned0.15.15) check + format clean.closes: #33910
Was generative AI tooling used to author this PR?
Yes — Claude Code (Opus 4.8). Generated-by: Claude Code (Opus 4.8)