Sinkhorn: Make LLMs even smaller through quantisation while maintaining accuracy (github.com/huawei-csl)
4 points by ilitirit 9 months ago