Fusion Snapshots
Fusion Snapshots enable checkpoint/restore functionality for Nextflow pipeline processes running on cloud Spot/preemptible instances. When a cloud provider reclaims an instance, Fusion Snapshots create a checkpoint of the running process and restore it on a new instance, allowing the process to resume exactly where it left off.
Key benefits of Fusion Snapshots include:
- Cost savings: Use Spot instances without risk of lost work.
- Time efficiency: Resume from interruption point instead of restarting tasks.
- Resource optimization: Avoid recomputing completed work.
- Automatic operation: Your pipelines require no code changes.
Cloud provider support
Fusion Snapshots are available for the following cloud providers:
- AWS Batch with Spot instances: 120-second guaranteed reclamation window.
- Google Batch with preemptible instances: Up to 30-second reclamation window.
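As a rough illustration of why the window length matters, the sketch below estimates whether a process's state could be written out within each provider's reclamation window. The state size and throughput are assumptions chosen for the example, not measurements, and the function names are hypothetical.

```python
# Reclamation windows in seconds, taken from the list above.
WINDOWS = {
    "AWS Batch (Spot)": 120,
    "Google Batch (preemptible)": 30,
}


def snapshot_seconds(state_gib: float, throughput_gib_per_s: float) -> float:
    """Estimate the time needed to write out a snapshot of the given size."""
    return state_gib / throughput_gib_per_s


def fits_window(state_gib: float, throughput_gib_per_s: float, window_s: float) -> bool:
    """Return True if the estimated snapshot time fits inside the window."""
    return snapshot_seconds(state_gib, throughput_gib_per_s) <= window_s


# Assumed values for illustration only.
state_gib = 40.0
throughput_gib_per_s = 1.0

for provider, window_s in WINDOWS.items():
    ok = fits_window(state_gib, throughput_gib_per_s, window_s)
    print(f"{provider}: {'fits' if ok else 'does not fit'} in {window_s}s")
```

Under these assumed numbers the same task fits the longer window but not the shorter one, which is why the amount of data left to transfer at reclamation time matters most on providers with short windows.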
Incremental snapshots
Incremental snapshots optimize performance by capturing only changed memory pages between checkpoints. This reduces snapshot time and data transfer. Fusion Snapshots automatically perform incremental snapshots on x86_64 instances.
Key features of incremental snapshots include:
- Pre-dumps: Capture only the memory pages that changed since the last checkpoint.
- Full dumps: Capture the complete process state periodically.
- Automatic: Enabled by default, no configuration needed.
- Efficient: Reduces checkpoint time and data transfer.
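The difference between a full dump and a pre-dump can be shown with a toy model. The sketch below represents memory as a dictionary of pages and is only an illustration of the idea. It is not how CRIU or Fusion track changed pages, all names are hypothetical, and it ignores details such as pages being freed.

```python
from typing import Dict

Pages = Dict[int, bytes]  # page number -> page contents


def full_dump(memory: Pages) -> Pages:
    """Copy every page: the complete process state."""
    return dict(memory)


def pre_dump(memory: Pages, last_seen: Pages) -> Pages:
    """Copy only pages that are new or changed since the last checkpoint."""
    return {n: data for n, data in memory.items() if last_seen.get(n) != data}


def apply_dumps(base: Pages, *deltas: Pages) -> Pages:
    """Rebuild the latest state from a full dump plus later pre-dumps."""
    state = dict(base)
    for delta in deltas:
        state.update(delta)
    return state


# Toy process with four memory pages.
memory = {0: b"code", 1: b"heap-v1", 2: b"stack-v1", 3: b"data"}
base = full_dump(memory)

# The process keeps running and modifies two pages.
memory[1] = b"heap-v2"
memory[2] = b"stack-v2"
delta = pre_dump(memory, base)

restored = apply_dumps(base, delta)
print(f"full dump: {len(base)} pages, pre-dump: {len(delta)} pages")
```

In this toy run the pre-dump carries two pages instead of four, which is the source of the reduced checkpoint time and data transfer.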
How Fusion Snapshots work
Fusion Snapshots use CRIU (Checkpoint Restore in Userspace) to capture the complete state of a running process, including:
- Process memory
- Open files and file descriptors
- Process tree and relationships
- Execution state
When the system detects a Spot instance interruption:
- The system freezes the process and creates a snapshot of its state.
- Snapshot data is kept in sync with remote object storage via Fusion.
- On a new instance, the process state is downloaded and restored.
- The process continues execution from the exact point it was interrupted.
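The freeze, upload, download, and resume sequence above can be sketched with a small simulation. This is a conceptual model under stated assumptions: `TaskState` is a hypothetical stand-in for the process state that CRIU captures, a dictionary stands in for remote object storage, and JSON serialization stands in for the snapshot format.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TaskState:
    """Hypothetical stand-in for the captured process state."""
    next_step: int = 0
    total: int = 0


def run(state: TaskState, steps: int, interrupt_at: Optional[int] = None) -> bool:
    """Advance the task; return False if interrupted before finishing."""
    while state.next_step < steps:
        if state.next_step == interrupt_at:
            return False  # the instance is being reclaimed
        state.total += state.next_step
        state.next_step += 1
    return True


object_storage = {}  # stand-in for the remote bucket

# First instance: the task is interrupted partway through.
state = TaskState()
finished = run(state, steps=10, interrupt_at=6)

# Freeze the state and upload the snapshot.
object_storage["snapshot"] = json.dumps(asdict(state))

# New instance: download the snapshot, restore, and continue.
restored = TaskState(**json.loads(object_storage["snapshot"]))
finished_after_restore = run(restored, steps=10)

print(restored.total == sum(range(10)))
```

The restored task produces the same result as an uninterrupted run because it continues from step 6 rather than starting again from step 0.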
Get started
To get started with your cloud provider, see: