Fusion Snapshots

Fusion Snapshots enables checkpoint/restore for Nextflow pipeline processes running on cloud Spot/preemptible instances. When a cloud provider reclaims an instance, Fusion Snapshots checkpoints the running process and restores it on a new instance, so the process resumes from the point of interruption.

Key benefits of Fusion Snapshots include:

  • Cost savings: Use Spot instances without losing work when an instance is reclaimed.
  • Time efficiency: Resume from interruption point instead of restarting tasks.
  • Resource optimization: Avoid recomputing completed work.
  • Automatic operation: Your pipelines require no code changes.

Cloud provider support

Fusion Snapshots is available for the following cloud providers:

Incremental snapshots

Incremental snapshots capture only the memory pages that changed between checkpoints, which reduces snapshot time and data transfer. Fusion Snapshots automatically performs incremental snapshots on x86_64 instances.

Key features of incremental snapshots include:

  • Pre-dumps: Capture only the memory pages changed since the last checkpoint.
  • Full dumps: Capture the complete process state periodically.
  • Automatic: Enabled by default; no configuration needed.
  • Efficient: Reduces checkpoint time and data transfer.
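
The difference between a full dump and a pre-dump can be illustrated with a toy model. The sketch below is not how Fusion or CRIU is implemented; it treats memory as a dictionary of page number to bytes, and the class name `SnapshotChain`, its methods, and `PAGE_SIZE` are hypothetical names chosen for illustration. It assumes pages are only modified or added, never freed.

```python
PAGE_SIZE = 4096  # hypothetical page size, for illustration only


class SnapshotChain:
    """Toy model: one full dump followed by incremental pre-dumps."""

    def __init__(self):
        self.layers = []      # each layer maps page number -> page bytes
        self._last_seen = {}  # memory contents at the previous checkpoint

    def full_dump(self, memory):
        """Capture every page and start a new chain. Returns pages written."""
        self.layers = [dict(memory)]
        self._last_seen = dict(memory)
        return len(memory)

    def pre_dump(self, memory):
        """Capture only pages changed since the last checkpoint."""
        changed = {
            page: data
            for page, data in memory.items()
            if self._last_seen.get(page) != data
        }
        self.layers.append(changed)
        self._last_seen = dict(memory)
        return len(changed)

    def restore(self):
        """Rebuild memory by applying the layers from oldest to newest."""
        restored = {}
        for layer in self.layers:
            restored.update(layer)
        return restored


# Eight zero-filled pages, then two of them are modified.
memory = {page: bytes(PAGE_SIZE) for page in range(8)}
chain = SnapshotChain()
full_pages = chain.full_dump(memory)

memory[3] = b"\x01" * PAGE_SIZE
memory[5] = b"\x02" * PAGE_SIZE
changed_pages = chain.pre_dump(memory)

print(full_pages, changed_pages)  # 8 2
assert chain.restore() == memory
```

In this model the pre-dump writes two pages instead of eight, which is the source of the reduced checkpoint time and data transfer described above.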

How Fusion Snapshots works

Fusion Snapshots uses CRIU (Checkpoint/Restore in Userspace) to capture the complete state of a running process, including:

  • Process memory
  • Open files and file descriptors
  • Process tree and relationships
  • Execution state

When the system detects a Spot instance interruption:

  1. The system freezes the process and creates a snapshot of its state.
  2. Snapshot data is kept in sync with remote object storage via Fusion.
  3. On a new instance, the process state is downloaded and restored.
  4. The process continues execution from the exact point at which it was interrupted.
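
The four steps above can be sketched as a small simulation. This is an application-level analogy only: CRIU captures process state transparently, without cooperation from the program, whereas this toy task serializes its own fields. The names `CountingTask`, `checkpoint`, `restore`, and the dictionary standing in for remote object storage are all hypothetical.

```python
import json


class CountingTask:
    """Hypothetical long-running task whose state is a few integers."""

    def __init__(self, total_steps, step=0, running_sum=0):
        self.total_steps = total_steps
        self.step = step
        self.running_sum = running_sum

    def run(self, interrupt_at=None):
        """Advance until done or until a simulated reclaim notice.

        Returns True when the task finished, False when interrupted.
        """
        while self.step < self.total_steps:
            if interrupt_at is not None and self.step == interrupt_at:
                return False
            self.running_sum += self.step
            self.step += 1
        return True


def checkpoint(task, object_store, key):
    """Steps 1-2: freeze the task state and sync it to (simulated) storage."""
    object_store[key] = json.dumps(vars(task))


def restore(object_store, key):
    """Step 3: download the state and rebuild the task on a new instance."""
    return CountingTask(**json.loads(object_store[key]))


object_store = {}  # stands in for remote object storage

task = CountingTask(total_steps=10)
finished = task.run(interrupt_at=4)  # simulated Spot interruption
checkpoint(task, object_store, "task-1")
del task  # the original instance is gone

resumed = restore(object_store, "task-1")
resumed.run()  # step 4: continue from the interruption point
print(finished, resumed.running_sum)  # False 45
```

The resumed task produces the same result as an uninterrupted run (the sum of 0 through 9) without repeating the first four steps.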

Get started

To get started with your cloud provider, see: