HK
Heykuki News
Top
New
Best
Ask
Show
Jobs
511.
▲
Openpi-flash: Real-time inference engine for openpi
github.com/Hebbian-Robotics
discuss
2 months ago
kstonekuan
2 points
512.
▲
Show HN: Stateful Inference with 99% Token Savings
github.com/umbecanessa
discuss
2 months ago
wasnaga
2 points
513.
▲
Rcarmo/go-AI: A mildly sane inference API library for go
github.com/rcarmo
discuss
2 months ago
rcarmo
2 points
514.
▲
Show HN: Mimikos – Zero-config mock server that infers API behavior from OpenAPI
discuss
2 months ago
codeguruking
2 points
515.
▲
Kubernetes operator for deploying, serving, and improving LLM inference engines
github.com/cliver-project
discuss
2 months ago
LaSombra
2 points
516.
▲
Living Memory Inference
github.com/alash3al
discuss
2 months ago
alash3al
2 points
517.
▲
Swift package AI inference engine generated from Rust crate
github.com/ondeinference
discuss
3 months ago
kampak212
2 points
518.
▲
Open-source ZK proofs for ML inference – verify AI decisions cryptographically
github.com/OE-GOD
discuss
3 months ago
OE-GOD
2 points
519.
▲
AirLLM optimizes inference memory usage
github.com/lyogavin
discuss
4 months ago
nreece
2 points
520.
▲
Show HN: I wrote an LLM inference engine in pure Go – 48 tok/s zero dependencies
github.com/computerex
discuss
4 months ago
computerex
2 points
521.
▲
Show HN: Name-classifier – infers attributes about a person from a name
github.com/douglas-larocca
discuss
4 months ago
defgeneric
2 points
522.
▲
C inference for Qwen3-ASR 0.6B and 1.7B transcription models
github.com/antirez
discuss
4 months ago
Curiositry
2 points
523.
▲
Show HN: I built a unified inference layer for Document Processing Models
github.com/adithya-s-k
discuss
4 months ago
Adithya-Kolavi
2 points
524.
▲
Show HN: Evolved x86 AVX-512 kernels for NF4 LLM inference
github.com/Anuar81
discuss
4 months ago
Anuar81
2 points
525.
▲
OMLX – Ollama for MLX (LLM Inference Server for Apple Silicon)
github.com/jundot
discuss
4 months ago
fintechie
2 points
526.
▲
Show HN: Omni-NLI – A multi-interface server for natural language inference
discuss
5 months ago
habedi0
2 points
527.
▲
Show HN: EmbodIOS – AI Operating System with Kernel-Level Inference
github.com/dddimcha
discuss
5 months ago
dddimcha
2 points
528.
▲
Rig: Distributed LLM inference across machines in Rust
github.com/buyukakyuz
discuss
5 months ago
corrode2711
2 points
529.
▲
Tract: Self-contained, TensorFlow and ONNX inference
github.com/sonos
discuss
5 months ago
vishnukvmd
2 points
530.
▲
EmbodIOS - AI inference as the operating system (3.5s cold start)
github.com/dddimcha
discuss
5 months ago
dddimcha
2 points
531.
▲
HF-mem: CLI to estimate inference memory requirements for Hugging Face models
github.com/alvarobartt
discuss
6 months ago
handfuloflight
2 points
532.
▲
Mini-SGLang: A lightweight yet high-performance inference framework for LLM
github.com/sgl-project
discuss
6 months ago
limoce
2 points
533.
▲
Go apps can directly integrate llama.cpp for HW accelerated local inference
github.com/hybridgroup
discuss
7 months ago
deadprogram
2 points
534.
▲
Show HN: Olla – Lightweight LLM Proxy for Homelab and OnPrem AI Inference
discuss
10 months ago
thushanfernando
2 points
535.
▲
WebAssembly binding for llama.cpp – Enabling on-browser LLM inference
github.com/ngxson
discuss
a year ago
selvan
2 points
536.
▲
Show HN: Dwani.ai – multimodal inference API for Indian languages
dwani.ai
discuss
a year ago
gaganyatri
2 points
537.
▲
GPT4Free: "educational project" for free LLM inference from various services
github.com/xtekky
discuss
a year ago
bobbiechen
2 points
538.
▲
OmniPainter: Training-Free Stylized Text-to-Image Generation with Fast Inference
github.com/maxin-cn
discuss
a year ago
dvrp
2 points
539.
▲
GPU-enabled Llama 3 inference in Java from scratch
github.com/beehive-lab
discuss
a year ago
mikepapadim
2 points
540.
▲
BitNet 1.58bit GPU Inference Kernel
github.com/microsoft
discuss
a year ago
galeos
2 points