Speeding Up Transformer Training and Inference by Increasing Model Size (bair.berkeley.edu)
2 points | jonbaer | 6 years ago