Inference at the edge: Efficient transformer model inference on-device (github.com/ggerganov)
3 points by lioeters 3 years ago