[webserver] Accept user-provided tags on the report_asset_materialization and report_asset_observation REST endpoints #33919
Draft
raghav-reglobe wants to merge 1 commit into
… + report_asset_observation

The Python SDK supports arbitrary tags on runless asset events (AssetMaterialization/AssetObservation tags=...), but the REST endpoints silently drop any field outside their allowlists, so external (non-Python) writers reporting events over REST cannot attach data-version provenance tags like dagster/input_data_version/<upstream>.

This adds an optional 'tags' param to both endpoints (json body, or json-encoded query param mirroring 'metadata' handling). Validation is unchanged: tags flow into the existing event construction, where validate_asset_event_tags already exempts system asset event tags and strict-validates the rest, surfacing errors via the existing 400 path. The dedicated data_version param takes precedence over a conflicting dagster/data_version tag.

Also fixes a copy-paste typo in the observation handler's construction-error message (said AssetMaterialization).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
8e7cf3a to d1d691e
Summary & Motivation
`DagsterInstance.report_runless_asset_event` supports arbitrary event tags via `AssetMaterialization(tags=...)` / `AssetObservation(tags=...)`, but the `/report_asset_materialization/` and `/report_asset_observation/` REST endpoints accept only their allowlisted params and silently drop everything else. External writers in non-Python languages (JVM/Go services reporting events for external assets over REST) therefore cannot attach data-version provenance tags such as `dagster/input_data_version/<upstream>` / `dagster/code_version`, which are exactly what makes externally-materialized assets participate in data-version/staleness machinery.

This adds an optional `tags` parameter to both endpoints:

- Tags are accepted in the json body, or as a json-encoded query param mirroring `metadata` handling (including the 400 on parse failure), plus a 400 when tags is not a json object.
- Validation is unchanged: tags flow into the existing event construction, where `validate_asset_event_tags` already exempts system asset-event tags and strict-validates the rest; failures surface through the endpoints' existing 400 path.
- The dedicated `data_version` param takes precedence over a conflicting `dagster/data_version` tag (user tags merge first, param applies after).
- `ReportAssetMatParam`/`ReportAssetObsParam` gain `tags`; the materialization API-consistency test's `KNOWN_DIFF` documents that Pipes does not take `tags` (same as `partition`/`description`).

Asset checks are intentionally left out: `ReportAssetCheckEvalParam` has a different shape (severity/passed) and `AssetCheckEvaluation` has no equivalent tags concept.

Context: we currently run the materialization half as a small runtime patch in production (JVM relays report Iceberg-commit materializations for ~16K external assets with `input_data_version` tags); upstreaming removes the need to carry it.

Test Plan
Extended `dagster_webserver_tests/webserver/test_asset_events.py`:

- … (`dagster/input_data_version/...` + custom key), tags via json-encoded query param, `data_version` param precedence over a conflicting tag, 400 on non-json query param, 400 on non-object body tags
- … `data_version` precedence, 400 on non-object tags
- … (`sample_payload` + per-key validation; materialization `KNOWN_DIFF`)

All 10 tests in the file pass locally.
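The parsing and precedence rules these tests exercise can be sketched roughly as follows. This is a minimal illustration, not the handler code from this PR: `parse_tags`, `resolve_tags`, and `BadRequest` are hypothetical names, and the sketch assumes that a string value is the json-encoded query-param form while a dict is the already-decoded json body.

```python
import json
from typing import Any, Dict, Optional

# Tag key named in the PR description.
DATA_VERSION_TAG = "dagster/data_version"


class BadRequest(ValueError):
    """Hypothetical stand-in for the endpoint's 400 response."""


def parse_tags(raw: Any) -> Dict[str, str]:
    """Accept a dict (json body) or a json-encoded string (query param)."""
    if raw is None:
        return {}
    if isinstance(raw, str):
        try:
            raw = json.loads(raw)
        except json.JSONDecodeError as exc:
            # Mirrors the "400 on parse failure" case.
            raise BadRequest("tags is not valid json") from exc
    if not isinstance(raw, dict):
        # Mirrors the "400 when tags is not a json object" case.
        raise BadRequest("tags must be a json object")
    return dict(raw)


def resolve_tags(raw_tags: Any, data_version: Optional[str] = None) -> Dict[str, str]:
    """User tags merge first; the dedicated data_version param applies after."""
    tags = parse_tags(raw_tags)
    if data_version is not None:
        tags[DATA_VERSION_TAG] = data_version
    return tags
```

Because the `data_version` param is written after the user tags are merged, a conflicting `dagster/data_version` tag is overwritten rather than rejected.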
Changelog
The `/report_asset_materialization/` and `/report_asset_observation/` REST endpoints now accept an optional `tags` parameter (json object), allowing runless asset events reported over REST to carry event tags such as data-version provenance, matching the existing Python SDK capability.

🤖 Generated with Claude Code
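As an illustration of how an external writer might use the new parameter, the sketch below builds (but does not send) a request in both supported forms: `tags` in the json body, and `tags` as a json-encoded query param. It uses only the Python standard library. The helper name `build_materialization_request`, the host and port, and the URL layout with the asset key as a path component are assumptions made for this example, and the `tags` parameter is only honored with this change applied.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, Optional


def build_materialization_request(
    base_url: str,
    asset_key: str,
    tags: Dict[str, str],
    data_version: Optional[str] = None,
    use_query: bool = False,
) -> urllib.request.Request:
    """Build a POST request carrying event tags (hypothetical helper)."""
    # Assumed URL layout: asset key passed as a path component.
    url = (
        f"{base_url.rstrip('/')}/report_asset_materialization/"
        f"{urllib.parse.quote(asset_key)}"
    )
    if use_query:
        # Query-param form: tags must be json-encoded, like metadata.
        query = {"tags": json.dumps(tags)}
        if data_version is not None:
            query["data_version"] = data_version
        return urllib.request.Request(
            f"{url}?{urllib.parse.urlencode(query)}", method="POST"
        )
    # Body form: tags is a plain json object.
    payload: Dict[str, object] = {"tags": tags}
    if data_version is not None:
        payload["data_version"] = data_version
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Example: provenance tag for a hypothetical upstream asset named "orders".
req = build_materialization_request(
    "http://localhost:3000",
    "orders_summary",
    {"dagster/input_data_version/orders": "v42"},
    data_version="v7",
)
```

Passing the returned object to `urllib.request.urlopen` would send it; non-Python writers would construct the equivalent json body or query string in their own HTTP client.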