Skip to content

fix(transaction): don't let user properties override computed summary metrics#2726

Open
viirya wants to merge 2 commits into
apache:mainfrom
viirya:fix/snapshot-summary-user-prop-override
Open

fix(transaction): don't let user properties override computed summary metrics#2726
viirya wants to merge 2 commits into
apache:mainfrom
viirya:fix/snapshot-summary-user-prop-override

Conversation

@viirya

@viirya viirya commented Jun 27, 2026

Copy link
Copy Markdown
Member

Which issue does this PR close?

What changes are included in this PR?

Building a snapshot summary let a user-supplied snapshot property overwrite a computed metric key, and a non-integer value then panicked total computation. Both are fixed to match iceberg-java.

1. Computed metrics now win over user properties (transaction/snapshot.rs). Previously:

let mut additional_properties = summary_collector.build();
additional_properties.extend(self.snapshot_properties.clone());   // user overwrites computed

Now the user properties are applied first and the computed metrics overwrite them:

let mut additional_properties = self.snapshot_properties.clone();
additional_properties.extend(summary_collector.build());          // computed overwrites user

This matches SnapshotSummary.Builder.build() in iceberg-java, which does builder.putAll(properties) (custom props) first and then metrics.addTo(builder), so computed metrics win on a key collision.

2. update_totals tolerates an unparseable delta instead of panicking (spec/snapshot_summary.rs). The added-*/removed-* parses used .expect("must be parsable as it was just serialized"). A non-integer value (which can also legitimately come from a previous snapshot's summary) panicked the commit. It now skips that total, matching iceberg-java's SnapshotProducer.updateTotal, which wraps the Long.parseLong calls in try { … } catch (NumberFormatException e) { // ignore and do not add total }.

Both Java behaviors verified against apache/iceberg main (core/src/main/java/org/apache/iceberg/SnapshotProducer.java, SnapshotSummary.java).

Are these changes tested?

Two new regression tests:

  • transaction::append::test_snapshot_properties_cannot_override_computed_metrics — a fast-append whose snapshot_properties set added-data-files=9999 and added-records=<non-integer> commits without panicking, and the resulting summary reports the real computed counts (1), not the user values.
  • spec::snapshot_summary::test_update_totals_tolerates_unparseable_added_value — an unparseable added-* value skips that one total while sibling totals still compute, with no panic.

Both fail if the respective fix is reverted (verified). Full iceberg lib suite (1373 tests) passes, clippy + rustfmt clean.

viirya added 2 commits June 27, 2026 10:56
… metrics

Building a snapshot summary merged the user's `snapshot_properties` on top of
the computed metrics, so a caller could overwrite a metric key such as
`added-data-files` with an arbitrary value, corrupting the summary. A
non-integer value then panicked `update_totals`, which parsed the added/removed
deltas with `.expect()`.

This matches iceberg-java's behavior on both points (verified against
`SnapshotProducer.java` and `SnapshotSummary.java` on `main`):

- `SnapshotSummary.Builder.build()` adds custom properties first, then
  `metrics.addTo(builder)`, so computed metrics win on a key collision. We now
  apply `snapshot_properties` first and let the computed metrics overwrite them.
- `SnapshotProducer.updateTotal` wraps parsing in try/catch and skips the total
  on a `NumberFormatException`. `update_totals` now tolerates an unparseable
  added/removed value by skipping that total instead of panicking.

Closes apache#2725

@dannycjones dannycjones left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

one nit, a duplicated comment, for this change

otherwise LGTM

I think we should consider merging to get this in the 0.10.0 release?

Comment on lines -509 to +533
let added = summary
.additional_properties
.get(added_property)
.map_or(0, |value| {
value
.parse::<u64>()
.expect("must be parsable as it was just serialized")
});
let removed = summary
.additional_properties
.get(removed_property)
.map_or(0, |value| {
value
.parse::<u64>()
.expect("must be parsable as it was just serialized")
});
// Parse the added/removed deltas, tolerating an unparsable value by skipping
// the total entirely rather than panicking. Computed metrics always overwrite
// user-supplied summary properties (see `SnapshotProducer::summary`), so a bad
// value should only ever come from a previous snapshot's summary; matching
// iceberg-java's `updateTotal`, we ignore it instead of failing the commit.
let parse_delta = |property: &str| -> Option<u64> {
match summary.additional_properties.get(property) {
None => Some(0),
Some(value) => match value.parse::<u64>() {
Ok(v) => Some(v),
Err(parse_err) => {
tracing::warn!(
"Property '{property}' could not be parsed when computing '{total_property}': {parse_err}. \
Skipping total computation.",
);
None
}
},
}
};

let (Some(added), Some(removed)) = (parse_delta(added_property), parse_delta(removed_property))
else {
return;
};

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My bad, I introduced this bug. I wasn't aware that users can overwrite the computed deltas.

The original intent was that at this point in the code, we should be the ones who wrote these values, hence the expect rather than parsing and error handling.

I think we're still open to bad values being passed in so avoiding expect sounds reasonable, although I wonder if we should be just avoiding the possibility of passing values matching totals/added/removed all together.

Related: one of the things I wanted to do is move total updates into the same part of the code as delta calculations - then we wouldn't need deserialize values we just wrote in the first place. Not for this PR though!

Comment on lines +407 to +418
// User-supplied snapshot properties are applied first, then the computed
// metrics overwrite any colliding keys. This matches iceberg-java
// (`SnapshotProducer.summary`), where computed `added-*`/`total-*` values
// are written after user properties so a user cannot shadow them with a
// bad (or merely wrong) value that would corrupt the snapshot summary.
// User-supplied snapshot properties are applied first, then the computed
// metrics overwrite any colliding keys. This matches iceberg-java
// (`SnapshotProducer.summary`), where computed `added-*`/`total-*` values
// are written after user properties so a user cannot shadow them with a
// bad (or merely wrong) value that would corrupt the snapshot summary.
let mut additional_properties = self.snapshot_properties.clone();
additional_properties.extend(summary_collector.build());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment here is duplicated, maybe a bad copy/paste or rebase.

Comment on lines +412 to +418
// User-supplied snapshot properties are applied first, then the computed
// metrics overwrite any colliding keys. This matches iceberg-java
// (`SnapshotProducer.summary`), where computed `added-*`/`total-*` values
// are written after user properties so a user cannot shadow them with a
// bad (or merely wrong) value that would corrupt the snapshot summary.
let mut additional_properties = self.snapshot_properties.clone();
additional_properties.extend(summary_collector.build());

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is still possible for a user to supply bad values, where the new snapshots value would be 0 as it won't be populated. Should we try and avoid this, for instance by dropping those values?

As a fix, I think this change is fine, but I am wondering if we need to follow-up here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

User snapshot properties can override computed summary metrics and panic total computation

2 participants