Medusa: Framework for Accelerating LLM Generation with Multiple Decoding Heads (github.com/FasterDecoding)
5 points | PaulHoule | 2 years ago