A Recipe for Training Large Models using 2nd Order Methods (Distributed Shampoo) | wandb.ai | 2 points | tim_sw | 3 years ago