HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
91.
Show HN: KTransformers – 236B Model and 1M Context LLM Inference on Local Machines
github.com/kvcache-ai
3 comments
2 years ago
sssummer
20 points
92.
Show HN: Demon – open-source real-time music diffusion engine, 25Hz local GPU
daydreamlive.github.io
13 comments
a month ago
ryanontheinside
17 points
93.
Show HN: Finetune Llama-3.1 2x faster in a Colab
colab.research.google.com
2 comments
2 years ago
danielhanchen
16 points
94.
Show HN: Salad, a distributed cloud for AI (like Airbnb for GPUs)
4 comments
2 years ago
bobjmiles
15 points
95.
Show HN: KTransformers: 671B DeepSeek-R1 on a Single Machine – 286 tokens/s Prefill
github.com/kvcache-ai
discuss
a year ago
sssummer
14 points
96.
Show HN: Willow Inference Server: Optimized ASR/TTS/LLM for Willow/WebRTC/REST
github.com/toverainc
13 comments
3 years ago
kkielhofner
13 points
97.
Ask HN: Help me improve my C-like language, C3
7 comments
6 years ago
Nuoji
12 points
98.
Show HN: Lightweight Llama3 Inference Engine – CUDA C
github.com/abhisheknair10
discuss
a year ago
abhisheknair10
12 points
99.
Show HN: Automatic 1111, but as a Python Package
github.com/saketh12
discuss
2 years ago
saketh105
11 points
100.
Show HN: Coderive – Iterating through 1 Quintillion Inside a Loop in just 50ms
github.com/DanexCodr
13 comments
6 months ago
DanexCodr
8 points
101.
Show HN: On-prem unstructured data extraction with 4 lines of code
github.com/NanoNets
discuss
a year ago
souvik3333
8 points
102.
Show HN: Local GLaDOS
old.reddit.com
discuss
2 years ago
dnhkng
8 points
103.
Show HN: WaveletLM – wavelet-based, attention-free model with O(n log n) scaling
github.com/ramongougis
1 comment
2 months ago
anarmorarm
7 points
104.
Show HN: Serve 100 Large AI models on a single GPU with low impact to TTFT
github.com/leoheuler
1 comment
7 months ago
leonheuler
7 points
105.
Show HN: Federation of robots collaboratively train an object manipulation model
github.com/adap
discuss
a year ago
jafermarq
7 points
106.
Show HN: OS Megakernel that matches M5 Max Tok/W at 2x the Throughput on RTX 3090
github.com/Luce-Org
1 comment
2 months ago
GreenGames
6 points
107.
Show HN: Blink-Edit – Cursor-style next-edit predictions for Neovim (local LLMs)
github.com/BlinkResearchLabs
discuss
5 months ago
atemyipod
6 points
108.
Show HN: I'm tired of my LLM bullshitting. So I fixed it
9 comments
5 months ago
BobbyLLM
5 points
109.
Show HN: AI Council – multi-model deliberation that runs in the browser
github.com/prijak
1 comment
4 months ago
prijak
5 points
110.
Show HN: I built an AI movie making and design engine in Rust
github.com/storytold
1 comment
5 months ago
echelon
5 points
111.
Show HN: ClawMem – Open-source agent memory with SOTA local GPU retrieval
github.com/yoloshii
discuss
3 months ago
yoloshii
5 points
112.
TinyTTS: Ultra-light English TTS (9M params, 20MB), 8x CPU, 67x GPU
discuss
4 months ago
letrghieu
5 points
113.
Show HN: Clawbernetes – Replace kubectl with conversation (Rust)
github.com/clawbernetes
discuss
4 months ago
redclaw
5 points
114.
Show HN: Open-source fine-tuning in a Colab notebook
colab.research.google.com
discuss
2 years ago
danielhanchen
5 points
115.
Show HN: Self-hosted RAG with MCP support for OpenClaw
github.com/2dogsandanerd
2 comments
5 months ago
2dogsanerd
4 points
116.
Show HN: Pile Programming Language
github.com/sixfootbeard
2 comments
3 years ago
jhhh
4 points
117.
Show HN: NSED is public – Mixture-of-Models to Hit SOTA using self-hosted AI
github.com/peeramid-labs
discuss
4 months ago
t_peersky
4 points
118.
Show HN: ArtCraft AI crafting engine, written in Rust
github.com/storytold
discuss
5 months ago
echelon
4 points
119.
Show HN: HORenderer3: A C++ software renderer implementing OpenGL 3.3 pipeline
github.com/Hobanghann
discuss
5 months ago
zghdls
4 points
120.
Run 35B LLMs on Dual Pascal GPUs with QLoRA
discuss
9 months ago
rickesh_tn
4 points