HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
61.
▲
Show HN: L88 – A Local RAG System on 8GB VRAM (Need Architecture Feedback)
3 comments
4 months ago
adithyadrdo
12 points
62.
▲
Helios: 14B open source video model, real time at 19.5fps, runs on 6GB VRAM
github.com/PKU-YuanGroup
discuss
3 months ago
steveharing1
6 points
63.
▲
Show HN: Recurser lib reduces GPT2-XL VRAM usage by 25% and runs it on Colab
github.com/max-ng
1 comment
3 years ago
homo_sapiens
5 points
64.
▲
Show HN: A Vaadin Algebra and Calculus Solver Built with AI Assistance
1 comment
4 months ago
bellaOxmyx
4 points
65.
▲
Show HN: AudioGhost AI – Run Meta's Sam-Audio on Consumer GPUs (4GB-6GB VRAM)
github.com/0x0funky
1 comment
6 months ago
0x0funky
3 points
66.
▲
Shimmy v1.7.0: Running 42B MoE Models on Consumer GPUs with 99.9% VRAM Reduction
github.com/Michael-A-Kuykendall
1 comment
8 months ago
MKuykendall
3 points
67.
▲
Grinder12: 0.96-Bit Lossless Streaming KV-Cache (16.55x VRAM Savings)
github.com/ggml-org
discuss
a month ago
AMICLLC
3 points
68.
▲
Show HN: L88 – A Local RAG System on 8GB VRAM (Need Architecture Feedback)
github.com/Hundred-Trillion
discuss
4 months ago
adithyadrdo
3 points
69.
▲
Unsloth – Train LLMs 2x faster with 70% less VRAM
github.com/unslothai
discuss
6 months ago
jhack
3 points
70.
▲
Quansloth Using Google's Turboquant Breaks the "VRAM Wall" for Local LLMs
github.com/PacifAIst
1 comment
2 months ago
gunzfanatic
2 points
71.
▲
Show HN: I built a <400ms latency voice agent that runs on a 4GB VRAM GTX 1650
github.com/pheonix-delta
1 comment
4 months ago
shubham-coder
2 points
72.
▲
Show HN: A Vaadin 24, Spring algebra calculator with dynamic variable buttons
1 comment
7 months ago
bellaOxmyx
2 points
73.
▲
Dead Simple Web UI for Training Flux LoRA with Low VRAM (12GB/16GB/20GB) Support
github.com/cocktailpeanut
discuss
2 years ago
cocktailpeanut
2 points
74.
▲
Show HN: Parakeet LLM Demo (378M param. 8GB VRAM)
discuss
2 years ago
razodactyl
2 points
75.
▲
Adjust VRAM/RAM Split on Apple Silicon
github.com/ggerganov
1 comment
3 years ago
tosh
1 point
76.
▲
VDPAU-to-VAAPI accelerates Flash video on Intel GFX
github.com/i-rinat
discuss
13 years ago
ddalex
1 point
77.
▲
2.3x KV Cache Compression at 32k Context – Cut VRAM Costs by 50%
github.com/Jamie2111
discuss
a month ago
JamieObala
1 point
78.
▲
Show HN: VAAK (Voice-Activated Autonomous-Knowledge-System)
github.com/ayushmaanbhav
discuss
5 months ago
ayushmaanbhav
1 point
79.
▲
Show HN: QKV Core – Run 7B LLMs on 4GB VRAM via surgical memory alignment
github.com/QKV-Core
discuss
6 months ago
broxytr
1 point
80.
▲
Super Merryo Trolls: An Adventure from the Days Before VRAM
github.com/GBirkel
discuss
2 years ago
vatys
1 point
81.
▲
Rust Wishlist: functions with keyword args, default args, varargs
github.com/rust-lang
discuss
6 years ago
nurettin
1 point
82.
▲
Show HN: Forge – Guardrails take an 8B model from 53% to 99% on agentic tasks
github.com/antoinezambelli
252 comments
a month ago
zambelli
687 points
83.
▲
Show HN: InvokeAI, an open source Stable Diffusion toolkit and WebUI
github.com/invoke-ai
102 comments
4 years ago
sophrocyne
414 points
84.
▲
Show HN: Duplicate 3 layers in a 24B LLM, logical deduction .22→.76. No training
github.com/alainnothere
80 comments
3 months ago
xlayn
265 points
85.
▲
Launch HN: Deepsilicon (YC S24) – Software and hardware for ternary transformers
79 comments
2 years ago
areddyyt
189 points
86.
▲
Tell HN: Please Stop Using Imgur
34 comments
4 years ago
MzHN
69 points
87.
▲
Launch HN: General Instinct (YC P26) – Frontier models on edge devices
16 comments
17 days ago
guanming0717
63 points
88.
▲
Show HN: ZSE – Open-source LLM inference engine with 3.9s cold starts
github.com/Zyora-Dev
9 comments
4 months ago
zyoralabs
58 points
89.
▲
Show HN: I built a RISC-V emulator that runs DOOM
github.com/lalitshankarch
4 comments
2 months ago
Flex247A
50 points
90.
▲
Show HN: Local task classifier and dispatcher on RTX 3080
github.com/resilientworkflowsentinel
2 comments
5 months ago
Shubham_Amb
26 points