Fix histogram bin redistribution dropping counts at the range boundary#7139
Open
Osamaali313 wants to merge 1 commit into
Open
Fix histogram bin redistribution dropping counts at the range boundary#7139Osamaali313 wants to merge 1 commit into
Osamaali313 wants to merge 1 commit into
Conversation
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
buildNormalizedHistograms redistributes input bins into binCount equal-width result bins. The last result bin's right edge was computed as resultLeft + dx == left + (binCount-1)*dx + dx, which accumulates floating-point error and can land just below range.right (e.g. for range [0, 1] with binCount 6 it is 0.9999999999999999). An input bin whose right edge sits exactly on range.right then satisfies 'bin.x + bin.dx > resultRight' in the inner loop and the loop breaks without consuming it, so its y counts are silently dropped from the normalized histogram. Clamp the last result bin's right edge to range.right (its true upper bound by construction). Add regression tests asserting count conservation when a bin's right edge sits on range.right.
a9f7e6e to
bc7926a
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
buildNormalizedHistograms(intensorboard/webapp/widgets/histogram/histogram_util.ts) redistributes input bins intobinCountequal-width result bins. InrebuildBins, the last result bin's right edge was computed as:Because
dx = (right - left) / binCountis generally not exactly representable, this sum accumulates floating-point error and can land just belowrange.right. Forrange = [0, 1]withbinCount = 6, the lastresultRightis0.9999999999999999.An input bin whose right edge sits exactly on
range.right(for example a zero-width bin at the maximum value, or the final normal bin's edge) then satisfies the inner-loop guardbin.x + bin.dx > resultRight(1 > 0.9999999999999999), so the loopbreaks without consuming that bin — and itsycounts are silently dropped from the normalized histogram shown to the user.Minimal reproduction (through the public API):
Fix
The last result bin must extend exactly to
range.right, which is by construction the true upper bound of all input data. Clamp it:This is a general fix (not a special-case hack) and leaves all interior bins unchanged. After it, the reproduction above conserves all 6 counts, and normal inputs are unaffected.
Tests
Added a
count conservation at the range boundaryblock tohistogram_util_test.tsasserting that total counts are preserved when a bin's right edge sits onrange.right(both a zero-width boundary bin and a final normal bin). These fail on the current code and pass with the fix.