fix: refactor forest-cli snapshot command #7252

Open
hanabi1224 wants to merge 19 commits into main from hm/refactor-cli-snapshot-cmd

@hanabi1224 hanabi1224 commented Jun 29, 2026

Contributor

Summary of changes

On top of #7249 and #7254

Changes introduced in this pull request:

  • move file manipulation logic from cli to server
  • add CI test to ensure export command is kill-resilient
  • update snapshot checksum file extension from forest.car.sha256sum to forest.car.zst.sha256sum

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation. All new code adheres to the team's documentation standards,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

Outside contributions

  • I have read and agree to the CONTRIBUTING document.
  • I have read and agree to the AI Policy document. I understand that failure to comply with the guidelines will lead to rejection of the pull request.

Summary by CodeRabbit

  • New Features
    • Snapshot and snapshot-diff exports now run asynchronously with more reliable progress and completion/cancellation messaging.
    • Export-diff now reports explicit done/cancelled outcomes for clearer CLI results.
  • Bug Fixes
    • Prevent concurrent snapshot export jobs and overlapping diff ranges from starting.
    • Improved cancellation handling and export-status waiting so exports terminate and transition state cleanly.
    • Successful exports now produce checksum sidecar files; completion no longer depends on arbitrary progress thresholds.
  • Tests
    • Updated snapshot export/diff checks to validate async state, cancellation, and concurrency behavior.

coderabbitai Bot commented Jun 29, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: bdca94d9-dd36-4d41-8058-5582fd7ed9d7

📥 Commits

Reviewing files that changed from the base of the PR and between 4bb38c2 and 48429ef.

📒 Files selected for processing (1)
  • src/rpc/methods/chain.rs
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • filecoin-project/lotus (manual)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/rpc/methods/chain.rs

Walkthrough

Snapshot export now runs in detached RPC tasks with temp-file persistence and checksum sidecars, the CLI uses updated status and cancellation handling, and calibnet scripts verify asynchronous export state, cancellation, and concurrent-request rejection.
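
The temp-file persistence described in the walkthrough follows a common write-then-rename pattern. Below is a minimal std-only sketch of that pattern, not the code in this PR: the function name `write_atomically` and the `.tmp` suffix handling are assumptions for illustration, and the PR itself appears to rely on a temp-path `persist` call rather than `fs::rename`.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::{Path, PathBuf};

/// Hypothetical helper: writes `data` to `<output_path>.tmp`, flushes it to
/// disk, and only then renames it to `output_path`. A reader never sees a
/// half-written file at the final path.
pub fn write_atomically(output_path: &Path, data: &[u8]) -> io::Result<()> {
    // Derive the temporary path by appending ".tmp" to the full path.
    let mut tmp = output_path.as_os_str().to_owned();
    tmp.push(".tmp");
    let tmp = PathBuf::from(tmp);

    {
        let mut file = fs::File::create(&tmp)?;
        file.write_all(data)?;
        // Make sure the bytes reach the disk before the rename.
        file.sync_all()?;
    }

    // On the same filesystem the rename replaces the destination in one step.
    fs::rename(&tmp, output_path)
}
```

If the process dies before the rename, only the `.tmp` file is left behind, which is why a kill-resilient export can treat a missing final file as "not finished".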

Changes

Detached Export Execution and Concurrency

| Layer | File(s) | Summary |
|---|---|---|
| Path helpers and export result shape | src/db/car/forest.rs, src/rpc/types/mod.rs, src/lib.rs | Adds temp and checksum path helpers, changes ApiExportResult::Done to a unit variant, and re-exports FutureExt in the crate prelude. |
| Detached chain export RPC handling | src/rpc/methods/chain.rs, src/ipld/util.rs | Updates ForestChainExport::handle and ForestChainExportDiff::handle to run in detached tasks with temp paths, checksum-sidecar writing, anyhow::ensure! depth validation, and ApiExportResult cancellation handling. |
| Diff export store threading | src/tool/subcommands/archive_cmd.rs | Changes the diff export branch in do_export to pass store.shallow_clone() directly into stream_chain. |
| Snapshot CLI export flow updates | src/cli/subcommands/snapshot_cmd.rs | Makes snapshot export paths absolute, removes client-side persistence and checksum writing, switches byte tracking to the temp export path helper, changes export-status --wait termination, and uses a cancellation token in export-diff progress handling. |
| Calibnet export concurrency checks | scripts/tests/calibnet_export_check.sh, scripts/tests/calibnet_export_diff_check.sh | Updates the calibnet export scripts to run exports in the background, poll export status, verify cancellation state, reject overlapping export requests, and confirm the original export still completes after the CLI process is killed. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • ChainSafe/forest#7249: Introduces ChainExportGuard-based locking and cancellation in the RPC chain export flow that this PR builds directly upon.
  • ChainSafe/forest#6128: Adds export-status and export-cancel CLI/RPC plumbing and changes export-result/status handling that this PR extends.
  • ChainSafe/forest#6074: Touches snapshot export-diff end-to-end across snapshot_cmd.rs, chain.rs, and rpc/types/mod.rs — the same files this PR modifies.

Suggested labels

RPC

Suggested reviewers

  • akaladarshi
  • LesnyRumcajs
  • sudo-shashank
🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change as a refactor of the forest-cli snapshot command. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch hm/refactor-cli-snapshot-cmd
✨ Simplify code
  • Create PR with simplified code
  • Commit simplified code in branch hm/refactor-cli-snapshot-cmd

Comment @coderabbitai help to get the list of available commands.

@hanabi1224 hanabi1224 added the Snapshot Run snapshot tests label Jun 29, 2026
@hanabi1224 hanabi1224 force-pushed the hm/refactor-cli-snapshot-cmd branch 5 times, most recently from 31d4dfb to ff8dac3 Compare June 29, 2026 14:54
@hanabi1224 hanabi1224 force-pushed the hm/refactor-cli-snapshot-cmd branch from e35819a to bc627d6 Compare June 29, 2026 15:54
@hanabi1224 hanabi1224 force-pushed the hm/refactor-cli-snapshot-cmd branch from d064942 to bc7eb69 Compare June 30, 2026 03:38
@hanabi1224 hanabi1224 marked this pull request as ready for review June 30, 2026 04:01
@hanabi1224 hanabi1224 requested a review from a team as a code owner June 30, 2026 04:01
@hanabi1224 hanabi1224 removed the request for review from a team June 30, 2026 04:01

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/rpc/methods/chain.rs (1)

280-280: 🔒 Security & Privacy | 🟠 Major | ⚡ Quick win

Do not let a read-permission RPC write arbitrary daemon paths.

Both handlers accept output_path from RPC params and create/persist files on the server while still using Permission::Read. A read token can overwrite or corrupt files writable by the daemon. Raise the permission level or constrain exports to a configured safe directory with path/symlink validation.

Also applies to: 297-308, 343-348, 398-398, 491-491, 526-542

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` at line 280, The RPC handlers in chain.rs are
allowing file writes while still declaring Permission::Read, so a read-only
token can persist or overwrite arbitrary daemon files via output_path. Update
the affected handler permissions (for example in the relevant PERMISSION
constants and export/save flows) to a higher privilege level, or restrict
exports through a validated safe directory. Add path and symlink checks in the
write/persist logic used by the affected chain RPC methods so only approved
destinations can be written.
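
One way to read the "validated safe directory" option is sketched below, using only the standard library. The function name `resolve_export_path` and the idea of a single configured `export_dir` are assumptions for illustration; nothing here is taken from the PR. Note that a check like this still leaves a window between validation and the actual write, so it complements a stricter permission level rather than replacing it.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper: accepts `requested` only if its parent directory,
/// after resolving symlinks and `..`, lies inside `export_dir`.
pub fn resolve_export_path(export_dir: &Path, requested: &Path) -> io::Result<PathBuf> {
    let file_name = requested.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "output path has no file name")
    })?;
    let parent = match requested.parent() {
        Some(p) if !p.as_os_str().is_empty() => p,
        _ => Path::new("."),
    };

    // canonicalize() resolves symlinks and `..`, so the prefix check below
    // compares real locations rather than the spelling of the request.
    let root = export_dir.canonicalize()?;
    let parent = parent.canonicalize()?;
    if !parent.starts_with(&root) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            format!("{} is outside the export directory", requested.display()),
        ));
    }

    // Refuse to write through an existing symlink at the final component.
    let candidate = parent.join(file_name);
    if let Ok(meta) = fs::symlink_metadata(&candidate) {
        if meta.file_type().is_symlink() {
            return Err(io::Error::new(
                io::ErrorKind::PermissionDenied,
                format!("{} is a symlink", candidate.display()),
            ));
        }
    }
    Ok(candidate)
}
```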
🧹 Nitpick comments (5)
src/db/car/forest.rs (1)

511-521: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Add rustdoc for the exported path helpers.

These helpers are public API; add short doc comments describing the derived temp/checksum paths. As per coding guidelines, **/*.rs: Document public functions and structs with doc comments.

Suggested docs
+/// Returns the temporary snapshot path used while exporting `output_path`.
 pub fn tmp_exporting_forest_car_path(output_path: &Path) -> PathBuf {
     let mut p = output_path.to_owned();
     p.add_extension("tmp");
     p
 }

+/// Returns the SHA-256 checksum sidecar path for `output_path`.
 pub fn forest_car_sha256sum_path(output_path: &Path) -> PathBuf {
     let mut p = output_path.to_owned();
     p.add_extension("sha256sum");
     p
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/db/car/forest.rs` around lines 511 - 521, The public helper functions
tmp_exporting_forest_car_path and forest_car_sha256sum_path in forest.rs need
rustdoc comments. Add short doc comments above each function describing that
they derive the temporary export path and the checksum sidecar path from the
given output_path, keeping the docs concise and aligned with the existing public
API style.

Source: Coding guidelines

src/rpc/methods/chain.rs (2)

312-408: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Log detached task failures before returning them.

If the client disconnects, the JoinHandle is dropped and any later Err from the spawned export task is discarded. Wrap the task body and log failures inside the task so kill-resilient exports remain observable.

Also applies to: 503-552

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 312 - 408, The spawned export task in
the chain export method can fail after the client disconnects, and those errors
are currently only returned through the JoinHandle so they may be dropped.
Update the tokio::spawn body in the export flow (and the analogous export path
mentioned in the review) to wrap the task work in error logging, so any Err from
crate::chain::export / crate::chain::export_v2 or related setup is logged inside
the task before being returned. Keep the existing handle.await?? path, but
ensure the task itself emits a tracing::error with enough context when it fails.
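
The idea of logging inside the detached task can be sketched with a plain thread standing in for `tokio::spawn` and `eprintln!` standing in for `tracing::error!`. The wrapper name `spawn_logged` is hypothetical. The point is only that the error is reported before it is returned, so it stays visible even if nobody joins the handle.

```rust
use std::fmt::Display;
use std::thread::{self, JoinHandle};

/// Hypothetical wrapper: runs `work` on a detached thread and reports any
/// error from inside the thread, then still returns it through the handle.
pub fn spawn_logged<T, E, F>(name: &'static str, work: F) -> JoinHandle<Result<T, E>>
where
    F: FnOnce() -> Result<T, E> + Send + 'static,
    T: Send + 'static,
    E: Display + Send + 'static,
{
    thread::spawn(move || {
        let result = work();
        if let Err(e) = &result {
            // Stand-in for tracing::error!; emitted even if the handle is dropped.
            eprintln!("{name} failed: {e}");
        }
        result
    })
}
```

A caller that is still connected can join the handle and propagate the error as before; a caller that went away simply drops the handle, and the failure still reaches the log.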

298-308: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick win

Attach path context to filesystem failures.

The new server-side export path has several bare ? filesystem operations, so users can get unactionable I/O errors after long exports. Add .with_context(...) including the affected path. As per coding guidelines, **/*.rs: Use anyhow::Result<T> for most operations and add context with .context() when errors occur.

Also applies to: 343-348, 398-398, 526-542

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/rpc/methods/chain.rs` around lines 298 - 308, The server-side export flow
in chain.rs has several filesystem operations that still use bare `?`, so attach
path-aware context to each failure using `.with_context(...)` or `.context(...)`
with the affected path. Update the checksum write logic and the other
export-related filesystem calls in the same export path (including the code
around the snapshot/export helpers) so errors surface which file or directory
failed, while keeping the existing anyhow::Result<T> style in functions like the
export routines.

Source: Coding guidelines
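
A rough illustration of what path-aware errors buy, written against `std::io` only so it stands alone. In the actual codebase the equivalent would presumably be anyhow's `.with_context(...)`, as the comment suggests; the helper names `with_path` and `write_with_context` are made up for this sketch.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: rewraps an I/O error so the message names the
/// action and the path that failed, keeping the original error kind.
fn with_path(err: io::Error, action: &str, path: &Path) -> io::Error {
    io::Error::new(
        err.kind(),
        format!("failed to {action} {}: {err}", path.display()),
    )
}

/// Writes `contents` to `path`, attaching the path to any failure.
pub fn write_with_context(path: &Path, contents: &[u8]) -> io::Result<()> {
    fs::write(path, contents).map_err(|e| with_path(e, "write", path))
}
```

After a long export, an error that names the checksum or snapshot path is far easier to act on than a bare "No such file or directory".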

scripts/tests/calibnet_export_check.sh (1)

85-90: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Reap the killed CLI process before waiting on server completion.

After sending SIGKILL, wait for the background forest-cli and ignore its expected non-zero status; this keeps the kill-resilience test from leaving an unreaped child while export-status --wait verifies the detached server job.

Proposed fix
 # Killing the CLI should not cancel the export
 echo "killing cli command"
-kill -KILL $EXPORT_CMD_PID
+kill -KILL "$EXPORT_CMD_PID"
+wait "$EXPORT_CMD_PID" || true
 # Wait on the same export job
 echo "waiting on export-status"
 $FOREST_CLI_PATH snapshot export-status --wait
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/tests/calibnet_export_check.sh` around lines 85 - 90, The
kill-resilience test leaves the background forest-cli process unreaped after
SIGKILL; in scripts/tests/calibnet_export_check.sh, update the export-check flow
around EXPORT_CMD_PID so the killed CLI is waited on and its expected non-zero
exit is ignored before running snapshot export-status --wait. Keep the fix local
to the shell test logic by reaping the background job after kill -KILL and
before checking the detached server-side export completion.
scripts/tests/calibnet_export_diff_check.sh (1)

75-78: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Reap the killed CLI process before waiting on server completion.

After sending SIGKILL, wait for the background forest-cli and ignore its expected non-zero status; this keeps the kill-resilience test from leaving an unreaped child while export-status --wait verifies the detached server job.

Proposed fix
 # Killing the CLI should not cancel the export
-kill -KILL $EXPORT_CMD_PID
+kill -KILL "$EXPORT_CMD_PID"
+wait "$EXPORT_CMD_PID" || true
 # Wait on the same export job
 $FOREST_CLI_PATH snapshot export-status --wait
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@scripts/tests/calibnet_export_diff_check.sh` around lines 75 - 78, The
kill-resilience test is leaving the background forest-cli child unreaped after
SIGKILL. In scripts/tests/calibnet_export_diff_check.sh, after kill -KILL on
EXPORT_CMD_PID, explicitly wait on that PID and ignore its expected failure
before calling snapshot export-status --wait, so the test reaps the killed CLI
while still verifying the detached export job via FOREST_CLI_PATH.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/cli/subcommands/snapshot_cmd.rs`:
- Around line 179-180: The `ForestChainExport` flow still runs checksum/persist
logic even when `dry_run` is enabled, which breaks the no-output semantics now
that persistence moved server-side. Update the export handling in `snapshot_cmd`
so the `ApiExportResult::Done` path only proceeds with checksum/persist behavior
when `dry_run` is false, and make sure the `VoidAsyncWriter` branch is
effectively short-circuited before any server-side persistence work. Ensure the
`ApiExportResult` completion path preserves the simplified CLI behavior for dry
runs.
- Around line 327-333: In snapshot_cmd.rs, the progress task cleanup in the
export flow should run even when the RPC call fails. Update the logic around the
ForestChainExportDiff request in the export path so you store the await result
first, then call cancellation_token.cancel(), finish the progress bar, and await
the handle before applying the error with ?. Use the export_result and handle
cleanup sequence to ensure the spawned progress task is always stopped before
bubbling up RPC errors.

In `@src/rpc/methods/chain.rs`:
- Around line 345-399: The dry-run path in the export logic still executes
persistence work after using VoidAsyncWriter, which causes failures and side
effects. Update the chain export success handling in the export match/select
flow so that when dry_run is true it skips both save_checksum and
tmp_path.persist, and only performs those operations in the non-dry-run branch;
use the existing dry_run flag and the chain_export/result handling to gate the
finalization logic.
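
The cleanup ordering requested for the export-diff progress task (keep the result, stop the progress task, wait for it, then propagate the error) can be sketched synchronously. This is a std-only approximation: a thread and an `AtomicBool` stand in for the spawned async task and the cancellation token, and `run_with_progress` is a hypothetical name.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `op` while a background "progress" thread is
/// active, and always stops and joins that thread before returning.
pub fn run_with_progress<T, E>(op: impl FnOnce() -> Result<T, E>) -> Result<T, E> {
    let stop = Arc::new(AtomicBool::new(false));
    let progress = {
        let stop = Arc::clone(&stop);
        thread::spawn(move || {
            // Stand-in for polling export status and updating a progress bar.
            while !stop.load(Ordering::SeqCst) {
                thread::sleep(Duration::from_millis(10));
            }
        })
    };

    // Keep the result instead of applying `?` here, so cleanup always runs.
    let result = op();

    stop.store(true, Ordering::SeqCst);
    let _ = progress.join();

    // Only now is the error (if any) handed back to the caller.
    result
}
```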

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: 83bd5ff2-56da-44e5-aa72-11989377ac9f

📥 Commits

Reviewing files that changed from the base of the PR and between bc4339c and 95be6ed.

⛔ Files ignored due to path filters (3)
  • src/rpc/snapshots/forest__rpc__tests__rpc__v0.snap is excluded by !**/*.snap
  • src/rpc/snapshots/forest__rpc__tests__rpc__v1.snap is excluded by !**/*.snap
  • src/rpc/snapshots/forest__rpc__tests__rpc__v2.snap is excluded by !**/*.snap
📒 Files selected for processing (8)
  • scripts/tests/calibnet_export_check.sh
  • scripts/tests/calibnet_export_diff_check.sh
  • src/cli/subcommands/snapshot_cmd.rs
  • src/db/car/forest.rs
  • src/lib.rs
  • src/rpc/methods/chain.rs
  • src/rpc/types/mod.rs
  • src/tool/subcommands/archive_cmd.rs
🔗 Linked repositories identified

CodeRabbit considers these linked repositories for cross-repo context during reviews:

  • filecoin-project/lotus (manual)

Comment thread src/cli/subcommands/snapshot_cmd.rs
Comment thread src/cli/subcommands/snapshot_cmd.rs
Comment thread src/rpc/methods/chain.rs
codecov Bot commented Jun 30, 2026


Codecov Report

❌ Patch coverage is 0% with 161 lines in your changes missing coverage. Please review.
✅ Project coverage is 64.07%. Comparing base (bfd4b96) to head (48429ef).
⚠️ Report is 1 commit behind head on main.
✅ All tests successful. No failed tests found.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/rpc/methods/chain.rs | 0.00% | 124 Missing ⚠️ |
| src/cli/subcommands/snapshot_cmd.rs | 0.00% | 23 Missing ⚠️ |
| src/db/car/forest.rs | 0.00% | 10 Missing ⚠️ |
| src/ipld/util.rs | 0.00% | 3 Missing ⚠️ |
| src/tool/subcommands/archive_cmd.rs | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files

| Files with missing lines | Coverage Δ |
|---|---|
| src/rpc/types/mod.rs | 93.87% <ø> (ø) |
| src/tool/subcommands/archive_cmd.rs | 29.66% <0.00%> (ø) |
| src/ipld/util.rs | 53.68% <0.00%> (-0.48%) ⬇️ |
| src/db/car/forest.rs | 81.79% <0.00%> (-2.29%) ⬇️ |
| src/cli/subcommands/snapshot_cmd.rs | 0.00% <0.00%> (ø) |
| src/rpc/methods/chain.rs | 58.54% <0.00%> (-1.50%) ⬇️ |

... and 13 files with indirect coverage changes


Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc4339c...48429ef. Read the comment docs.


@LesnyRumcajs

Member

update snapshot checksum file extension from forest.car.sha256sum to forest.car.zst.sha256sum

That's a breaking change, no?

Comment thread src/db/car/forest.rs
.into_temp_path())
}

pub fn tmp_exporting_forest_car_path(output_path: &Path) -> PathBuf {

Member

nit: let's add a simple test here. The code is kind of obvious but obvious things managed to bite us in the past. Plus, we want coverage to be high.
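
A minimal sketch of the kind of test being asked for. The helper bodies here are stand-ins that append the suffix to the whole path with `OsString::push`, rather than the `add_extension` call shown in the review above, so that the block compiles without depending on a recent toolchain. The file names are invented for illustration, and edge cases such as a path with no file name may behave differently from the real helpers.

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Stand-in for the PR's helpers: appends `.{ext}` to the full path.
fn append_extension(path: &Path, ext: &str) -> PathBuf {
    let mut s: OsString = path.as_os_str().to_owned();
    s.push(".");
    s.push(ext);
    PathBuf::from(s)
}

/// Temporary path used while exporting to `output_path`.
pub fn tmp_exporting_forest_car_path(output_path: &Path) -> PathBuf {
    append_extension(output_path, "tmp")
}

/// SHA-256 checksum sidecar path for `output_path`.
pub fn forest_car_sha256sum_path(output_path: &Path) -> PathBuf {
    append_extension(output_path, "sha256sum")
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn derived_paths_keep_the_full_file_name() {
        let out = Path::new("/data/snapshot.forest.car.zst");
        assert_eq!(
            tmp_exporting_forest_car_path(out),
            PathBuf::from("/data/snapshot.forest.car.zst.tmp")
        );
        assert_eq!(
            forest_car_sha256sum_path(out),
            PathBuf::from("/data/snapshot.forest.car.zst.sha256sum")
        );
    }
}
```

The second assertion also pins down the `forest.car.zst.sha256sum` naming listed in the PR summary, which is the part flagged as potentially breaking.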

Comment thread src/ipld/util.rs
Comment on lines +90 to +92
if !self.cancellation_token.is_cancelled() {
    self.cancellation_token.cancel();
}

Member

Do we need to check if it's cancelled? Might be a TOCTOU issue
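
To make the concern concrete: a check-then-cancel pair is two separate steps, so another thread can cancel in between, whereas an idempotent `cancel()` needs no guard at all. The sketch below uses a std-only stand-in (`CancelFlag`, a hypothetical type) rather than the real token. If `cancellation_token` is a `tokio_util::sync::CancellationToken`, which the snippet does not show, repeated `cancel()` calls are expected to be harmless, but that is worth confirming against the tokio-util docs.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical stand-in for a cancellation token, backed by one atomic flag.
#[derive(Clone, Default)]
pub struct CancelFlag(Arc<AtomicBool>);

impl CancelFlag {
    pub fn new() -> Self {
        Self::default()
    }

    /// Idempotent cancel: a single atomic swap, with no separate check.
    /// Returns true only for the call that actually flipped the flag.
    pub fn cancel(&self) -> bool {
        !self.0.swap(true, Ordering::SeqCst)
    }

    pub fn is_cancelled(&self) -> bool {
        self.0.load(Ordering::SeqCst)
    }
}
```

With a token shaped like this, the guarded form `if !is_cancelled() { cancel() }` gains nothing over calling `cancel()` directly.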

Labels

Snapshot Run snapshot tests

2 participants