7x speed improvement for LLaMA in less than 10 lines of code (github.com/tinygrad)
2 points by hack_ml 3 years ago