Show HN: TurboPrefill – Multi-GPU prefill acceleration for llama.cpp

Heykuki News

2 points

19 days ago

TurboPrefill is an attempt to make layer-split multi-GPU configurations spend less time waiting and more time computing during prefill.
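
The post does not say how TurboPrefill achieves this, so the following is not its method. It is only a rough, idealized sketch of why a layer-split setup waits during prefill: if the layers are spread over several GPUs and the prompt moves through them as a single batch, only one GPU computes at a time. Splitting the prompt into chunks that flow through the GPUs like a pipeline is one generic way to raise utilization. The function name `pipeline_utilization` and its parameters are hypothetical, and the model assumes equal-cost stages, equal-size chunks, and zero transfer cost.

```python
def pipeline_utilization(num_gpus: int, num_chunks: int) -> float:
    """Fraction of total GPU-time spent computing in an idealized pipeline.

    Assumptions (not taken from the post): every GPU holds an equally
    expensive slice of layers, every prompt chunk costs the same, and
    moving activations between GPUs is free.
    """
    if num_gpus < 1 or num_chunks < 1:
        raise ValueError("num_gpus and num_chunks must be >= 1")
    # The last chunk enters the first GPU at step num_chunks and needs
    # num_gpus - 1 further steps to pass through the remaining GPUs.
    total_steps = num_chunks + num_gpus - 1
    # Each GPU is busy for num_chunks of those steps.
    return num_chunks / total_steps


if __name__ == "__main__":
    for chunks in (1, 4, 16):
        share = pipeline_utilization(num_gpus=4, num_chunks=chunks)
        print(f"4 GPUs, {chunks:>2} chunk(s): {share:.0%} busy")
```

Under these assumptions, a single unchunked prompt on 4 GPUs keeps each GPU busy only a quarter of the time, and the busy share rises as the number of chunks grows. Real gains depend on transfer costs and kernel efficiency at smaller chunk sizes, which this sketch ignores.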

