I wrote a minimal PyTorch FSDP to understand how it works (~240 LOC)
github.com/0xNaN | 1 point | xnan | 5 months ago