HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
skidrow
Born on July 02, 2024
•
388 Karma
Submitted
1.
Occupancy Math on the AMD MI355X: A From-First-Principles Guide
indianspeedster.github.io
discuss
11 hours ago
skidrow
2 points
2.
Computer Vision – Lecture 1.1 (Introduction: Organization) [video]
youtube.com
discuss
2 days ago
skidrow
2 points
3.
FlashAttention-4: Algorithm and Kernel Pipelining Co-Design
colfax-intl.com
discuss
2 days ago
skidrow
2 points
4.
CUTLASS Tutorial: Efficient GEMM Kernel Designs with Pipelining
colfax-intl.com
discuss
2 days ago
skidrow
2 points
5.
Toward Better HIP Kernel Generation for AMD GPUs
stanford.edu
discuss
2 days ago
skidrow
2 points
6.
FP8 GEMM Optimization on AMD CDNA4 Architecture
amd.com
discuss
2 days ago
skidrow
2 points
7.
Occupancy Math on the AMD MI355X: A From-First-Principles Guide
indianspeedster.github.io
discuss
2 days ago
skidrow
2 points
8.
FP8 GEMM Optimization on AMD CDNA4 Architecture
amd.com
discuss
3 days ago
skidrow
1 point
9.
Occupancy Math on the AMD MI355X
indianspeedster.github.io
discuss
3 days ago
skidrow
1 point
10.
FP8 GEMM Optimization on AMD CDNA4 Architecture
amd.com
discuss
3 days ago
skidrow
1 point
11.
Occupancy Math on the AMD MI355X: A From-First-Principles Guide
indianspeedster.github.io
discuss
3 days ago
skidrow
1 point
12.
FP8 GEMM Optimization on AMD CDNA4 Architecture
amd.com
discuss
4 days ago
skidrow
4 points
13.
Occupancy Math on the AMD MI355X: A From-First-Principles Guide
indianspeedster.github.io
4 comments
4 days ago
skidrow
44 points
14.
FP8 GEMM Optimization on AMD CDNA4 Architecture
amd.com
discuss
5 days ago
skidrow
3 points
15.
Deep Dive into 4-Wave Interleave FP8 GEMM
amd.com
discuss
5 days ago
skidrow
3 points
16.
Occupancy Math on the AMD MI355X: A From-First-Principles Guide
indianspeedster.github.io
discuss
5 days ago
skidrow
3 points
17.
Creating custom kernels for the AMD MI300
huggingface.co
8 months ago
skidrow
2 points
18.
Implementing a Fast Tensor Core Matmul on the Ada Architecture
spatters.ca
8 months ago
skidrow
4 points
19.
Matrix Core Programming on AMD GPUs
salykova.github.io
5 comments
8 months ago
skidrow
116 points
20.
Implementing a Fast Tensor Core Matmul on the Ada Architecture
spatters.ca
8 months ago
skidrow
3 points
21.
Matrix Core Programming on AMD GPUs
salykova.github.io
8 months ago
skidrow
2 points
22.
Creating custom kernels for the AMD MI300
huggingface.co
8 months ago
skidrow
1 point
23.
Implementing a Fast Tensor Core Matmul on the Ada Architecture
spatters.ca
8 months ago
skidrow
2 points
24.
Matrix Core Programming on AMD CDNA3 and CDNA4 Architecture
salykova.github.io
3 comments
8 months ago
skidrow
24 points
25.
Creating custom kernels for the AMD MI300
huggingface.co
8 months ago
skidrow
2 points
26.
Implementing a Fast Tensor Core Matmul on the Ada Architecture
spatters.ca
8 months ago
skidrow
2 points
27.
Advanced Matrix Multiplication Optimization on Multi-Core Processors (2024)
salykova.github.io
3 comments
8 months ago
skidrow
85 points
28.
Creating custom kernels for the AMD MI300
huggingface.co
8 months ago
skidrow
2 points
29.
Introduction to Matrix Core Programming on AMD CDNA3 and CDNA4 Architecture
salykova.github.io
8 months ago
skidrow
2 points
30.
Creating custom kernels for the AMD MI300
huggingface.co
11 months ago
skidrow
2 points