Skip to content

test: fix LLMQ commitment log capture race#7405

Draft
thepastaclaw wants to merge 1 commit into
dashpay:developfrom
thepastaclaw:fix/llmq-commitment-log-capture-crash
Draft

test: fix LLMQ commitment log capture race#7405
thepastaclaw wants to merge 1 commit into
dashpay:developfrom
thepastaclaw:fix/llmq-commitment-log-capture-crash

Conversation

@thepastaclaw

@thepastaclaw thepastaclaw commented Jul 3, 2026

Copy link
Copy Markdown

Fix LLMQ Commitment Log Capture Race

Issue being fixed or feature implemented

Fixes a flaky unit-test crash seen in linux64_multiprocess-build / Build source
while running llmq_commitment_tests/commitment_check_undersized_bitset_debug_log_test.

The failure was a segmentation fault, not a Boost assertion failure. The test
captured log lines into a local vector through LogInstance().PushBackCallback
and then scanned that vector while the callback was still registered. In
RegTestingSetup, background scheduler/LLMQ threads can log concurrently, so a
background callback could append to the vector while the test thread was
iterating it.

What was done?

Held the log-capture guard in a std::optional and reset it immediately after
CheckLLMQCommitment returns, before reading log_lines.

DeleteCallback takes the same logger mutex used for callback dispatch, so once
the reset returns the test knows no logging thread is still mutating the vector.
Exception paths still get the existing RAII cleanup through the guard destructor.

How Has This Been Tested?

Tested locally on macOS arm64:

git diff --check
make -j8 -C src test/test_dash
TARGET=llmq_commitment_tests/commitment_check_undersized_bitset_debug_log_test
./test/test_dash \
  --run_test="$TARGET" \
  --catch_system_errors=no \
  -l test_suite \
  -- DEBUG_LOG_OUT
code-review \
  dashpay/dash \
  upstream/develop \
  fix/llmq-commitment-log-capture-crash \
  "Fix flaky LLMQ commitment unit-test segfault"

The pre-PR review gate returned ship.

Breaking Changes

None. This is test-only.

Checklist

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone

commitment_check_undersized_bitset_debug_log_test scanned its log_lines
vector while its PushBackCallback was still registered. Logger callbacks
run under the logger mutex from whichever thread logs, and RegTestingSetup
keeps background threads (scheduler, LLMQ) alive that log concurrently, so
a push_back could reallocate the vector mid-scan and invalidate the main
thread's iterators, segfaulting the linux64_multiprocess unit-test job.

Hold the RAII capture guard in a std::optional and reset() it after
CheckLLMQCommitment returns but before reading log_lines. DeleteCallback
takes the same logger mutex as callback dispatch, so once reset() returns
no thread can still be mutating the vector. Exception paths keep the
existing guard-destructor cleanup.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@thepastaclaw

Copy link
Copy Markdown
Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown
✅ Action performed

Review finished.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai

coderabbitai Bot commented Jul 3, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: a8bbbc8e-325f-4cd0-bca9-bc51076b8d4c

📥 Commits

Reviewing files that changed from the base of the PR and between 4af395d and 2732bfd.

📒 Files selected for processing (1)
  • src/test/llmq_commitment_tests.cpp

Walkthrough

This change modifies a single test file to fix log capture timing. The LogCaptureGuard in commitment_check_undersized_bitset_debug_log_test is now wrapped in std::optional and explicitly reset before the captured log_lines are searched, ensuring the log callback is removed prior to reading captured output rather than relying on scope-exit destruction.

Estimated code review effort: 1 (Trivial) | ~5 minutes

Possibly related PRs

  • dashpay/dash#7366: Introduces the undersized-validMembers debug-log test that this PR's LogCaptureGuard reset fix directly supports.

Suggested reviewers: PastaPastaPasta, UdjinM6

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly summarizes the main change: fixing an LLMQ commitment log-capture race in tests.
Description check ✅ Passed The description directly matches the test-only fix described in the changeset and objectives.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@github-actions

github-actions Bot commented Jul 3, 2026

Copy link
Copy Markdown

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

@thepastaclaw

Copy link
Copy Markdown
Author

CI scope check for the red linux64_ubsan-test / Test source job: this PR is proper as-is.

The failed job is https://github.com/dashpay/dash/actions/runs/28672149411/job/85040686108 at head 2732bfde8c80656b1f95df212e6becdf402b89f4. The root cause is feature_asset_locks.py failing with the existing intermittent Platform-quorum issue tracked in #7310:

feature_asset_locks.py:350 in test_asset_unlocks
  asset_unlock_tx = self.create_assetunlock(101, COIN, pubkey)
feature_asset_locks.py:110 in create_assetunlock
  quorumHash = mninfo[0].get_node(self).quorum("selectquorum", llmq_type_test, request_id)["quorumHash"]
test_framework.authproxy.JSONRPCException: no quorums active (-1)

Scope check: this PR changes only src/test/llmq_commitment_tests.cpp; it does not touch feature_asset_locks.py, asset-lock logic, the functional test framework, or Platform-quorum mining/selection code. Other source/test jobs passed, and CodeRabbit found no actionable comments. I updated #7310 with this run as another occurrence and did not push, rebase, create an empty commit, or rerun CI.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant