High-Speed Large Language Model Serving on PCs with Consumer-Grade GPUs (github.com/SJTU-IPADS)
417 points by dataminer 3 years ago