FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning (hazyresearch.stanford.edu)
1 point | todsacerdoti | 3 years ago