Outperforming larger language models with less training data and smaller models (blog.research.google) | 320 points | atg_abhishek | 3 years ago