Better and Faster Large Language Models via Multi-Token Prediction (arxiv.org) | 302 points | jasondavies | 2 years ago