GPU Interposing and Performance

11% higher inference throughput and 10x faster cold starts, with minimal overhead!
May 29, 2025