Skip to content

Cap refresh start when hypertable has tiered data#9811

Open
kpan2034 wants to merge 4 commits into
timescale:mainfrom
kpan2034:cap-invals-at-chunk-min
Open

Cap refresh start when hypertable has tiered data#9811
kpan2034 wants to merge 4 commits into
timescale:mainfrom
kpan2034:cap-invals-at-chunk-min

Conversation

@kpan2034

@kpan2034 kpan2034 commented May 13, 2026

Copy link
Copy Markdown
Member

Currently during a refresh, we process the ranges before the earliest chunk in the hypertable as well. This can lead to potential missing data when tiered data is present but tiered reads are disabled during the refresh.

By capping the refresh range at the start of the earliest chunk, any invalidations before the earliest chunk are no longer processed and removed. Thus, in a subsequent refresh, if tiered data reads are enabled, that data would be materialized in the CAgg.

We only need to do this when the hypertable has tiered data. Any new tiered data will either have been processed before tiering (since it exists in some chunk) or will be inserted after this refresh (which will generate invalidations).

We also cap the refresh start when generating batches for an incremental refresh.

@kpan2034 kpan2034 self-assigned this May 13, 2026
@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from 0ef8281 to 3e5c07d Compare May 27, 2026 14:28
@vineethapai vineethapai added this to the v2.28.0 milestone Jun 1, 2026
@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from 3e5c07d to 221041a Compare June 5, 2026 20:33
@codecov

codecov Bot commented Jun 5, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 87.17949% with 5 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
tsl/src/continuous_aggs/invalidation.c 81.25% 1 Missing and 2 partials ⚠️
tsl/src/continuous_aggs/refresh.c 77.77% 0 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from 221041a to fbd5fcd Compare June 5, 2026 21:13
@kpan2034 kpan2034 marked this pull request as ready for review June 5, 2026 21:13
@github-actions

github-actions Bot commented Jun 5, 2026

Copy link
Copy Markdown

@melihmutlu, @natalya-aksman: please review this pull request.

Powered by pull-review

@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from fbd5fcd to 1ed30b3 Compare June 5, 2026 21:20
@kpan2034 kpan2034 changed the title Cap refresh range start at dimension minimum Cap refresh start when hypertable has tiered data Jun 5, 2026
@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from 1ed30b3 to 3ddfbf5 Compare June 5, 2026 21:23
@kpan2034 kpan2034 requested review from pnthao and removed request for natalya-aksman June 5, 2026 21:24
Comment thread tsl/src/continuous_aggs/invalidation.c Outdated
@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch 2 times, most recently from f4a57f4 to e9c4ce4 Compare June 10, 2026 20:53
kpan2034 added 3 commits June 10, 2026 14:59
Currently during a refresh, we process the ranges before the earliest
chunk in the hypertable as well. This can lead to potential missing data
when tiered data is present but tiered reads are disabled during the
refresh.

By capping the refresh range at the start of the earliest chunk, any
invalidations before the earliest chunk are no longer processed and
removed. Thus, in a subsequent refresh, if tiered data reads are
enabled, that data would be materialized in the CAgg.

We only need to do this when the hypertable has tiered data. Any new
tiered data will either have been processed before tiering (since it
exists in some chunk) or will be inserted after this refresh (which will
generate invalidations).
Capping is done incorrectly when the OSM chunk is the earliest chunk.
@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from e9c4ce4 to ddca3ff Compare June 10, 2026 20:59
Incremental batch generation would generate batches considering tiered
data ranges, even if tiered reads are disabled. This can lead to
invalid batches since we cap the refresh start to the earliest chunk in
the hypertable during the refresh.
@kpan2034 kpan2034 force-pushed the cap-invals-at-chunk-min branch from 6d40793 to ff7856d Compare June 10, 2026 23:43
Comment on lines +228 to +232
----------------------------------------------------------------------
-- Test 5: When the OSM chunk's range is updated to precede the
-- earliest real chunk, the wrong dimension slice is picked up
-- and the refresh is not capped correctly.
----------------------------------------------------------------------

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the comment needs an update as it can cap correctly now.

Comment thread src/dimension_slice.c
Comment on lines +1189 to +1196
*slice = dimension_slice_from_slot(ti->slot);
MemoryContextSwitchTo(old);
Chunk *chunk = ts_chunk_get_by_id((*slice)->fd.chunk_id, true);

if (IS_OSM_CHUNK(chunk))
{
return SCAN_CONTINUE;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

*slice here assigned whether it's an osm chunk or not. Imagine a case where there is no non-osm chunk and all data is tiered, we would end up this *slice already assigned. The caller assumes that it's non-osm as long as it's not NULL which does not hold in this case.

We should either check again in the caller, or nullify *slice if it's osm in the later IS_OSM_CHUNK check here.

Comment on lines +1337 to +1346
int64 earliest_start = invalidation_get_earliest_chunk_start(cagg->data.raw_hypertable_id);
if (earliest_start != INVAL_NEG_INFINITY)
{
Invalidation boundary = { .lowest_modified_value = earliest_start,
.greatest_modified_value = earliest_start };
invalidation_expand_to_bucket_boundaries(&boundary,
cagg->partition_type,
cagg->bucket_function);
earliest_start = boundary.greatest_modified_value;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand why we want to ignore anything before the value initially return by invalidation_get_earliest_chunk_start. But why do we move earliest_start to its bucket end which is a further value?

I feel like skipping the whole bucket as if it's not invalidated may trigger rewrite to use the cagg when the specific bucket is actually stale in the cagg.

@philkra philkra modified the milestones: v2.28.0, v2.28.1, 2.28.2 Jun 15, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants