Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Categorical Foundations for CuTe Layouts
research.colfax-intl.com
6 comments
9 months ago
charles_irl
39 points
2.
NVFP4 Blockscaled GEMM on NVIDIA RTX Pro Blackwell GPUs (SM12x)
research.colfax-intl.com
discuss
16 hours ago
matt_d
4 points
3.
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design
research.colfax-intl.com
discuss
3 days ago
skidrow
2 points
4.
CUTLASS Tutorial: Efficient GEMM Kernel Designs with Pipelining
research.colfax-intl.com
discuss
3 days ago
skidrow
2 points
5.
Dynamic Persistent Tile Scheduling w/ Cluster Launch Control (CLC) on Blackwell
research.colfax-intl.com
discuss
a month ago
matt_d
2 points
6.
FlashAttention-4
research.colfax-intl.com
discuss
4 months ago
maralom
2 points
7.
CUTLASS Tutorial: Sub-Byte GEMM on NVIDIA Blackwell GPUs
research.colfax-intl.com
discuss
a year ago
jxmorris12
2 points
8.
GEMM with Thread Block Clusters on NVIDIA Blackwell GPUs
research.colfax-intl.com
discuss
a year ago
ashvardanian
2 points
9.
CUTLASS Tutorial: Writing GEMM Kernels Using Tensor Memory for Blackwell GPUs
research.colfax-intl.com
discuss
a year ago
ashvardanian
2 points
10.
DeepSeek-R1 and FP8 Mixed-Precision Training
research.colfax-intl.com
discuss
a year ago
skidrow
2 points
11.
DeepSeek-R1 and FP8 Mixed-Precision Training
research.colfax-intl.com
discuss
a year ago
skidrow
2 points
12.
Categorical Foundations for CuTe Layouts
research.colfax-intl.com
discuss
9 months ago
matt_d
1 point
13.
DeepSeek-R1 and FP8 Mixed-Precision Training
research.colfax-intl.com
discuss
a year ago
skidrow
1 point
14.
CUTLASS Tutorial: Fast Matrix-Multiplication with WGMMA on NVIDIA Hopper GPUs
research.colfax-intl.com
discuss
2 years ago
sebg
1 point