HBASE-30238 HBase bulkload replication causes duplicate HFile loading and compaction storm when source RPC times out by mini666 · Pull Request #8380 · apache/hbase

mini666 · 2026-06-19T08:05:51Z

JIRA: https://issues.apache.org/jira/browse/HBASE-30238

This PR adds:

Configurable bandwidth throttling for bulkloaded HFile copy during replication.
In-progress bulkload deduplication for concurrent RPC retries of the same bulkload event.

Testing:

JAVA_HOME=/Users/juke.mini666/.asdf/installs/java/temurin-17.0.6+10 mvn test -pl hbase-server -am -Dtest=TestHFileReplicatorBandwidth,TestReplicationSinkBulkLoadDedup -DfailIfNoTests=false -Dhadoop-three.version=3.4.3 -Dwarbucks.skip=true -Dcheckstyle.skip=true -Dspotbugs.skip=true -Drat.skip=true

…eplication Introduce hbase.replication.bulkload.copy.bandwidth.mb to rate-limit HFile copy from source HDFS in HFileReplicator.Copier. The limit is enforced through a shared RateLimiter across copy threads, and ReplicationSink updates the rate through configuration reload without requiring RegionServer restart.

Track in-progress bulkload events by replication cluster, encoded region, and bulkload sequence number so concurrent RPC retries skip duplicate execution. The key is removed after the current attempt completes or fails, preserving at-least-once retry semantics while avoiding concurrent duplicate loads.

wchevreuil · 2026-06-19T14:48:13Z

    bulkLoadHFileMap.put(tableName, newFamilyHFilePathsList);
  }

+  private static String buildBulkLoadKey(String replicationClusterId, BulkLoadDescriptor bld) {


nit: no need to be static.

wchevreuil

Some questions:

Do we know if the source always retries the same sink, in the event of rpc timeout? Otherwise, we may still have same bulkload entry submitted more than once.
Isn't the throttling going to increase likelihood for such RPC timeouts? If so, can we mention that on code comments?

Coordinate replicated bulkload WAL events across target region servers using ZooKeeper in-progress and completed markers. Add master-side cleanup for completed markers and cover cross-RS retry replay with MiniCluster.

mini666 · 2026-06-21T07:48:22Z

Thanks for the review.

I added a follow-up commit to handle the cross-target-RS retry case. The sink now coordinates replicated bulkload WAL events through ZooKeeper using in-progress and completed markers, so a retry sent to a different target region server can observe that the same bulkload event has already completed and skip reloading the HFiles.

I also added a master-side chore to clean expired completed markers, while keeping them if a matching in-progress marker still exists.

For coverage, I added unit tests for the ZooKeeper event tracker and cleanup chore, a sink-level completed-marker skip test, and a MiniCluster test that replays the exact same bulkload WAL batch first through one target RS sink and then through another target RS sink.

I verified the change with the focused bulkload replication tests, including the new cross-target-RS retry MiniCluster test. The run passed with 13 tests, 0 failures, 0 errors, and 0 skipped.

Keep using BulkLoadDescriptor cluster ids when invoking HFileReplicator so replicated bulkload RPCs do not treat the target cluster as already handled. The ZooKeeper event tracking identity remains unchanged.

wchevreuil

Sorry for the wait. I like the tracker idea, but there are some caveats. Please see my comments inline.

wchevreuil · 2026-06-30T10:44:41Z

+    }
+  }
+
+  public void choreForTesting() {


There's no need for this extra method. Tests in the same package can simply call chore() directly. And the super class already defines same method, only difference being it's synchronized there.

wchevreuil · 2026-06-30T10:47:03Z

+/**
+ * Cleans completed replicated bulk load event markers after they are old enough.
+ */


Please add explanation about how the new configuration properties influence this chore behaviour.

wchevreuil · 2026-06-30T11:20:28Z

+@InterfaceAudience.Private
+public class ReplicationBulkLoadEventTracker {


Having a tracker for this kind of events is a good idea, but we should not make it tight to ZK. There's been an effort to gradually move HBase features out of ZK. Replication queue storage, for example, already has a ZKless alternative (see HBASE-27109).

This tracker, should follow similar approach. We should have it as interface/abstract class and provide both ZK based and table based implementations. However, I don't think this should be implemented in this same PR. My suggestion for you then is to move this tracking logic out of this PR, let's leave this one simply handling the issue on individual RS level. Then once this is merged, you can file a new jira/PR to implement the tracking logic at a global level, covering both ZK and table based registries.

juke-mini666 added 2 commits June 19, 2026 17:00

wchevreuil reviewed Jun 19, 2026

View reviewed changes

juke-mini666 added 2 commits June 21, 2026 09:23

HBASE-30238 Address bulkload replication review comments

9e7665d

HBASE-30238 Add distributed bulkload replication event tracking

1b72829

Coordinate replicated bulkload WAL events across target region servers using ZooKeeper in-progress and completed markers. Add master-side cleanup for completed markers and cover cross-RS retry replay with MiniCluster.

HBASE-30238 Preserve bulkload descriptor cluster ids for replication

28458de

Keep using BulkLoadDescriptor cluster ids when invoking HFileReplicator so replicated bulkload RPCs do not treat the target cluster as already handled. The ZooKeeper event tracking identity remains unchanged.

wchevreuil reviewed Jun 30, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

HBASE-30238 HBase bulkload replication causes duplicate HFile loading and compaction storm when source RPC times out#8380

HBASE-30238 HBase bulkload replication causes duplicate HFile loading and compaction storm when source RPC times out#8380
mini666 wants to merge 5 commits into
apache:masterfrom
mini666:HBASE-30238

mini666 commented Jun 19, 2026

Uh oh!

wchevreuil Jun 19, 2026 •

edited

Loading

Uh oh!

wchevreuil left a comment

Uh oh!

mini666 commented Jun 21, 2026

Uh oh!

wchevreuil left a comment

Uh oh!

wchevreuil Jun 30, 2026

Uh oh!

wchevreuil Jun 30, 2026

Uh oh!

wchevreuil Jun 30, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

		@InterfaceAudience.Private
		public class ReplicationBulkLoadEventTracker {

Uh oh!

Conversation

mini666 commented Jun 19, 2026

Uh oh!

wchevreuil Jun 19, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

wchevreuil left a comment

Choose a reason for hiding this comment

Uh oh!

mini666 commented Jun 21, 2026

Uh oh!

wchevreuil left a comment

Choose a reason for hiding this comment

Uh oh!

wchevreuil Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

wchevreuil Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

wchevreuil Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

wchevreuil Jun 19, 2026 •

edited

Loading