Large-Scale Distributed ML Checkpointing System Interview Questions | Design | InterviewGPT