Optimized half-precision GEMM assembly kernels on AMD Fiji for deep learning (github.com/hyln9) | 2 points | hyln9 | 9 years ago