NanoDO: A minimal Transformer decoder-only language model implementation (github.com/google-deepmind)
12 points | yeldarb | 2 years ago