Automatically checkpoints, migrates, and resumes live GPU jobs across instances to improve throughput, reliability, and performance.
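
To illustrate the checkpoint, migrate, and resume cycle in general terms, here is a minimal CPU-only sketch. It is not this tool's implementation: the function names (`save_checkpoint`, `load_checkpoint`, `run_job`), the JSON state format, and the `stop_after` parameter that simulates an interruption are all hypothetical. It assumes the checkpoint file lives on storage that the next instance can also read.

```python
import json
import os
import tempfile


def save_checkpoint(path, state):
    """Write state atomically so an interrupted save never leaves a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
        f.flush()
        os.fsync(f.fileno())
    # os.replace is an atomic rename when source and target share a filesystem.
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if not os.path.exists(path):
        return {"step": 0, "total": 0}
    with open(path) as f:
        return json.load(f)


def run_job(path, num_steps, stop_after=None):
    """Run from the last checkpoint; stop_after simulates a preemption (hypothetical)."""
    state = load_checkpoint(path)
    executed = 0
    while state["step"] < num_steps:
        if stop_after is not None and executed >= stop_after:
            return state  # simulated interruption; the last checkpoint is on disk
        state["total"] += state["step"]  # stand-in for one unit of real work
        state["step"] += 1
        executed += 1
        save_checkpoint(path, state)
    return state
```

Calling `run_job(path, 10, stop_after=4)` and then `run_job(path, 10)` from a second process or machine that sees the same file finishes the job from step 4 rather than from step 0. A real GPU job would also need to capture device memory and driver state, which this sketch does not attempt.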