Speeding Up Transformer Training and Inference by Increasing Model Size (bair.berkeley.edu) | 3 points | bjourne | 6 years ago