HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
631.
▲
Rcarmo/gte-go: Golang inference for the GTE Small embedding model
github.com/rcarmo
discuss
2 months ago
rcarmo
1 points
632.
▲
Show HN: JibarOS, a shared inference runtime for Android
github.com/Jibar-OS
discuss
2 months ago
rafaelvalle03
1 points
633.
▲
ORAC-NT MedChem Copilot that blocks synthetically infeasible molecules
github.com/Kretski
discuss
2 months ago
DREDREG
1 points
634.
▲
New ML inference language dropped today
github.com/m0at
discuss
3 months ago
sfffs
1 points
635.
▲
QuantumLeap: 2.3× faster MoE inference with intelligent expert caching
github.com/MartinCrespoC
discuss
3 months ago
ikharoz
1 points
636.
▲
Show HN: Mamba SSM in Rust – training and inference with custom CUDA kernels
github.com/silvermpx
discuss
3 months ago
silvermpx
1 points
637.
▲
Show HN: Go LLM inference with a Vulkan GPU back end that beats Ollama's CUDA
github.com/computerex
discuss
4 months ago
computerex
1 points
638.
▲
Speculative Speculative Decoding: Really, Really Fast LLM Inference
github.com/tanishqkumar
discuss
4 months ago
fizzbuzz07
1 points
639.
▲
Show HN: SAM 3 Inference on Modal in Under 10 Seconds
github.com/TheFloatingString
discuss
4 months ago
larryll
1 points
640.
▲
Show HN: oMLX – Native Mac inference server that persists KV cache to SSD
github.com/jundot
discuss
4 months ago
jundot
1 points
641.
▲
MicroGPT: train & inference in 243 lines of code
gist.github.com
discuss
4 months ago
RyanShook
1 points
642.
▲
MicroGPT - Train and inference a GPT in pure, dependency-free Python (200 lines)
gist.github.com
discuss
4 months ago
susam
1 points
643.
▲
Show HN: ARIA Protocol – P2P distributed 1-bit LLM inference at 120 tok/s on CPU
github.com/spmfrance-cloud
discuss
5 months ago
anthonymu
1 points
644.
▲
Show HN: ARIA – P2P distributed inference protocol for 1-bit LLMs on CPU
github.com/spmfrance-cloud
discuss
5 months ago
anthonymu
1 points
645.
▲
Show HN: Weed – Minimalist AI/ML inference and backpropagation in the style of Qrack
github.com/vm6502q
discuss
5 months ago
wrathfulspatula
1 points
646.
▲
PowerInfer: Fast LLM Inference on a Consumer-Grade GPU
github.com/Tiiny-AI
discuss
5 months ago
oldfuture
1 points
647.
▲
High Performance LLM Inference Operator Library from Tencent
github.com/Tencent
discuss
5 months ago
polyrand
1 points
648.
▲
Show HN: ResourceAI – Local LLM inference optimized for consumer iGPUs
discuss
5 months ago
Fenix46
1 points
649.
▲
Show HN: VelinScript 3.0 – a new language with bidirectional type inference
github.com/SkyliteDesign
discuss
5 months ago
SkyliteDesign
1 points
650.
▲
Fast_topk_batched: High-performance batched Top-K selection for CPU inference
github.com/RAZZULLIX
discuss
5 months ago
thunderbong
1 points
651.
▲
Show HN: Adaptive-K – Cut MoE inference costs 30-50% with entropy-guided routing
github.com/Gabrobals
discuss
5 months ago
Gabrielebalsamo
1 points
652.
▲
Inference-Time Constitutional AI
github.com/mdiskint
discuss
5 months ago
mdiskint37
1 points
653.
▲
WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference
github.com/Tencent
discuss
6 months ago
LoveMortuus
1 points
654.
▲
Show HN: Binfer, an experimental LLM inference engine in TypeScript and CUDA
github.com/bwasti
discuss
6 months ago
brrrrrm
1 points
655.
▲
TileRT: Tile-Based Runtime for Ultra-Low-Latency LLM Inference
github.com/tile-ai
discuss
7 months ago
simonpure
1 points
656.
▲
Pure Go hardware accelerated local inference on VLMs using llama.cpp
github.com/hybridgroup
discuss
8 months ago
deadprogram
1 points
657.
▲
Show HN: Serverless platform for inference of time-series foundation models
faim.it.com
discuss
8 months ago
ChernovAndrei
1 points
658.
▲
LitServe: Build custom AI inference engines
github.com/Lightning-AI
discuss
8 months ago
wfalcon
1 points
659.
▲
Yzma = embedding+inference on VLM/LLM/SLM/TLM in pure Go using llama.cpp
github.com/hybridgroup
discuss
8 months ago
deadprogram
1 points
660.
▲
Build your own AI model inference engines
github.com/Lightning-AI
discuss
9 months ago
wfalcon
1 points