Show HN: Nvidia's CUDA libraries are generic and not optimized for LLM inference (github.com/Venkat2811)
1 point | venkat_2811 | 5 months ago