Search: github.com/tnfe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

631.

Rcarmo/gte-go: Golang inference for the GTE Small embedding model

github.com/rcarmo

2 months ago

1 points

632.

Show HN: JibarOS, a shared inference runtime for Android

github.com/Jibar-OS

2 months ago

1 points

633.

ORAC-NT MedChem Copilot that blocks synthetically infeasible molecules

github.com/Kretski

2 months ago

1 points

634.

New ML inference language dropped today

github.com/m0at

3 months ago

1 points

635.

QuantumLeap: 2.3× faster MoE inference with intelligent expert caching

github.com/MartinCrespoC

3 months ago

1 points

636.

Show HN: Mamba SSM in Rust – training and inference with custom CUDA kernels

github.com/silvermpx

3 months ago

1 points

637.

Show HN: Go LLM inference with a Vulkan GPU back end that beats Ollama's CUDA

github.com/computerex

4 months ago

1 points

638.

Speculative Speculative Decoding: Really, Really Fast LLM Inference

github.com/tanishqkumar

4 months ago

1 points

639.

Show HN: SAM 3 Inference on Modal in Under 10 Seconds

github.com/TheFloatingString

4 months ago

1 points

640.

Show HN: oMLX – Native Mac inference server that persists KV cache to SSD

github.com/jundot

4 months ago

1 points

641.

MicroGPT: train & inference in 243 lines of code

gist.github.com

4 months ago

1 points

642.

MicroGPT - Train and inference a GPT in pure, dependency-free Python (200 lines)

gist.github.com

4 months ago

1 points

643.

Show HN: ARIA Protocol – P2P distributed 1-bit LLM inference at 120 tok/s on CPU

github.com/spmfrance-cloud

5 months ago

1 points

644.

Show HN: ARIA – P2P distributed inference protocol for 1-bit LLMs on CPU

github.com/spmfrance-cloud

5 months ago

1 points

645.

Show HN: Weed – Minimalist AI/ML inference and backpropagation in the style of Qrack

github.com/vm6502q

5 months ago

wrathfulspatula

1 points

646.

PowerInfer: Fast LLM Inference on a Consumer-Grade GPU

github.com/Tiiny-AI

5 months ago

1 points

647.

High Performance LLM Inference Operator Library from Tencent

github.com/Tencent

5 months ago

1 points

648.

Show HN: ResourceAI – Local LLM inference optimized for consumer iGPUs

5 months ago

1 points

649.

Show HN: VelinScript 3.0 – a new language with bidirectional type inference

github.com/SkyliteDesign

5 months ago

1 points

650.

Fast_topk_batched: High-performance batched Top-K selection for CPU inference

github.com/RAZZULLIX

5 months ago

1 points

651.

Show HN: Adaptive-K – Cut MoE inference costs 30-50% with entropy-guided routing

github.com/Gabrobals

5 months ago

Gabrielebalsamo

1 points

652.

Inference-Time Constitutional AI

github.com/mdiskint

5 months ago

1 points

653.

WeDLM: Reconciling Diffusion Language Models with Standard Causal Attention for Fast Inference

github.com/Tencent

6 months ago

1 points

654.

Show HN: Binfer, an experimental LLM inference engine in TypeScript and CUDA

github.com/bwasti

6 months ago

1 points

655.

TileRT: Tile-Based Runtime for Ultra-Low-Latency LLM Inference

github.com/tile-ai

7 months ago

1 points

656.

Pure Go hardware accelerated local inference on VLMs using llama.cpp

github.com/hybridgroup

8 months ago

1 points

657.

Show HN: Serverless platform for inference of time-series foundation models

8 months ago

1 points

658.

LitServe: Build custom AI inference engines

github.com/Lightning-AI

8 months ago

1 points

659.

Yzma = embedding+inference on VLM/LLM/SLM/TLM in pure Go using llama.cpp

github.com/hybridgroup

8 months ago

1 points

660.

Build your own AI model inference engines

github.com/Lightning-AI

9 months ago

1 points