Search: github.com/tnfe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

331.

Pipeline-parallel LLM inference across GPUs on separate machines

github.com/leyten

4 days ago

5 points

332.

Show HN: FlashQwen – A from-scratch CUDA inference engine for Qwen3

8 days ago

5 points

333.

AI Agent that at inference time updates its harness and model weights

github.com/hexo-ai

23 days ago

5 points

334.

Show HN: Smile-Serve – Inference Server for ML, ONNX, and LLM

github.com/haifengl

2 months ago

5 points

335.

vLLM introduces memory optimizations for long-context inference

github.com/vllm-project

3 months ago

5 points

336.

Zinc – LLM inference engine written in Zig, running 35B models on $550 AMD GPUs

github.com/zolotukhin

3 months ago

5 points

337.

Show HN: Llmtop – Htop for LLM Inference Clusters (vLLM, SGLang, Ollama, llama)

github.com/InfraWhisperer

3 months ago

5 points

338.

MetalChat – Llama Inference for Apple Silicon

github.com/ybubnov

4 months ago

5 points

339.

Voxtral.c Voxtral Realtime 4B model inference as a C library

github.com/antirez

5 months ago

5 points

340.

llama2.zig: Inference Llama 2 in one file of pure Zig

github.com/cgbur

7 months ago

5 points

341.

T-Mac: Low-bit LLM inference on CPU/NPU with lookup table

github.com/microsoft

9 months ago

5 points

342.

Show HN: gline-rs – an inference engine for GLiNER models, in Rust

github.com/fbilhaut

a year ago

5 points

343.

Fast LLM Inference in Rust

github.com/EricLBuehler

2 years ago

5 points

344.

Fast and hackable PyTorch native transformer inference

github.com/pytorch-labs

3 years ago

5 points

345.

Lepton: An open-source library (Apache 2.0) for scaling model inference

github.com/leptonai

3 years ago

5 points

346.

Run LLaMA Inference on CPU, with Rust

github.com/rustformers

3 years ago

5 points

347.

Three-processor inference on AMD Ryzen AI 300

github.com/Peterc3-dev

3 months ago

4 points

348.

LangPatrol: A static analyzer for LLM prompts that catches bugs before inference

github.com/langpatrol

6 months ago

4 points

349.

Show HN: Inference Mixtral 8x7B in pure Rust

github.com/moritztng

2 years ago

4 points

350.

Show HN: Ggml.js – Serverless AI Inference on Browser with Web Assembly

rahuldshetty.github.io

3 years ago

4 points

351.

TensorSharp: Open-Source Local LLM Inference Engine

github.com/zhongkaifu

20 days ago

4 points

352.

Train and inference GPT in 243 lines of pure, dependency-free Python by Karpathy

gist.github.com

4 months ago

4 points

353.

PasLLM: An Object Pascal inference engine for LLM models

github.com/BeRo1985

7 months ago

4 points

354.

Distributed-Llama: Connect home devices into a cluster for LLM inference

github.com/b4rtaz

a year ago

4 points

355.

Practical Llama 3 inference in Java

github.com/mukel

2 years ago

4 points

356.

Llama.cpp speculative sampling: 2x faster inference for large models

github.com/ggerganov

3 years ago

4 points

357.

Zig GPT-2 inference engine

github.com/EugenHotaj

3 years ago

4 points

358.

Stable Diffusion inference locally on iOS / macOS using MPSGraph

github.com/mortenjust

4 years ago

4 points

359.

Pytype checks and infers types for your Python code

github.com/google

7 years ago

4 points

360.

Inferential database seeding in Clojure

michaeldrogalis.github.com

14 years ago

MichaelDrogalis

4 points