CUTLASS, a CUDA C++ template library for matrix multiply on GPUs (devblogs.nvidia.com)
7 points | kerrmudgeon | 9 years ago