HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
1.
Show HN: Evolved x86 AVX-512 kernels for NF4 LLM inference
github.com/Anuar81
discuss
4 months ago
Anuar81
2 points
2.
Show HN: Run Qwen3-Next-80B on 8GB GPU at 1tok/2s throughput
github.com/Mega4alik
17 comments
9 months ago
anuarsh
123 points
3.
Show HN: Run gpt-oss-20b on 8GB GPUs
github.com/Mega4alik
discuss
10 months ago
anuarsh
6 points
4.
Show HN: oLLM – LLM Inference for large-context tasks on consumer GPUs
github.com/Mega4alik
7 comments
10 months ago
anuarsh
3 points
5.
Show HN: Fine-tune Llama3-8B on 8GB GPU without quantization
github.com/Mega4alik
discuss
8 months ago
anuarsh
3 points