DuoAttention: Slashes memory and latency for LLMs without sacrificing performance

Heykuki News

2 points

2 years ago

No comments
