HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Compiling LLMs into a MegaKernel: A path to low-latency inference
zhihaojia.medium.com
76 comments
a year ago
matt_d
314 points