Search: github.com/tnfe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

541.

Kubernetes-native distributed LLM inference framework

github.com/llm-d

a year ago

2 points

542.

Show HN: Contextual AI Document Parser – Infer hierarchy for long, complex docs

a year ago

2 points

543.

Lambda calculus - compiler, type inference, and evaluator in less than 100 LOC

gist.github.com

a year ago

2 points

544.

Protobuf-ts-types: zero-codegen TypeScript type inference from protobuf messages

github.com/nathanhleung

a year ago

2 points

545.

Eagle-3 Speculative Decoding for LLM Inference (5.6x speedup)

github.com/SafeAILab

a year ago

2 points

546.

Show HN: Kernel-level LLM inference via /dev/llm0

github.com/randombk

a year ago

2 points

547.

Rust Type Inference Broke with Update to Deranged Crate

github.com/jhpratt

a year ago

2 points

548.

DeepDive: In-Depth Decryption of LLMs Construction and Inference from Scratch

github.com/therealoliver

a year ago

2 points

549.

Show HN: OptiLLMBench – Test how inference optimization tricks scale up LLMs

a year ago

2 points

550.

Deepseek.cpp: CPU inference for the DeepSeek family of LLMs in pure C++

github.com/andrewkchan

a year ago

2 points

551.

Jlama: LLM Inference Engine for Java

github.com/tjake

a year ago

2 points

552.

Show HN: EmbedAnything – Rust Powered Inference, Ingestion and Indexing

github.com/StarlightSearch

2 years ago

2 points

553.

JetStream: Throughput+memory optimized engine for LLM inference on XLA devices

github.com/google

2 years ago

2 points

554.

Duck-Lisp: optional free-form parenthesis inference

github.com/oitzujoey

2 years ago

2 points

555.

Jlama – a modern LLM inference engine for Java

github.com/tjake

2 years ago

2 points

556.

A minimal Python implementation of Hindley-Milner type inference

github.com/ethe

2 years ago

2 points

557.

Cake: a Rust framework for distributed inference of large models like LLama3

github.com/evilsocket

2 years ago

2 points

558.

Instant ONNX export for ML inference

github.com/Quantco

2 years ago

2 points

559.

Show HN: Model Gateway – bridging your apps with LLM inference endpoints

github.com/modelgw

2 years ago

2 points

560.

Llama3 Inference in Pure Java

github.com/mukel

2 years ago

2 points

561.

llm.f90: LLM Inference in Fortran

github.com/rbitr

2 years ago

2 points

562.

SGLang: Fast and Expressive LLM Inference with RadixAttention for 5x Throughput

github.com/skypilot-org

2 years ago

2 points

563.

Inference of Mamba models in pure C

github.com/kroggen

2 years ago

2 points

564.

Mamba LLM Inference on CPU

github.com/rbitr

3 years ago

2 points

565.

Official PR Reveals the Inference Code for Mixtral 8x7B

github.com/vllm-project

3 years ago

2 points

566.

Stable-fast for SD inference: Faster than AITemplate, On par with TensorRT

github.com/chengzeyi

3 years ago

2 points

567.

DeepSpeed-FastGen: High-Throughput for LLMs via MII and DeepSpeed-Inference

github.com/microsoft

3 years ago

2 points

568.

Show HN: Llama2 inference in one file of pure OCaml

github.com/jackpeck

3 years ago

2 points

569.

Tairov/llama2.mojo: Inference Llama 2 in one file of pure

github.com/tairov

3 years ago

2 points

570.

Llama2 Inference in pure Mojo

github.com/tairov

3 years ago

2 points