Quantized Llama models with increased speed and a reduced memory footprint (ai.meta.com)
508 points | egnehots | 2 years ago