Search: github.com/kmoe | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

211.

OpenMoE – A family of open-sourced Mixture-of-Experts (MoE) LLMs

github.com/XueFuzhao

3 years ago

3 points

212.

Show HN: Tok/s on a 35B MoE model using a $100 AMD crypto APU and Vulkan

github.com/akandr

3 months ago

2 points

213.

Mistral: Light-weight library for mixture-of-experts (MoE) training

github.com/mistralai

3 years ago

2 points

214.

Show HN: Ported Cerebras REAP to MLX – Prune MoE Experts on a MacBook

github.com/egesabanci

21 days ago

2 points

215.

Live 204-node MoE visualization reveals emergent cognitive stratification

github.com/eriirfos-eng

a month ago

2 points

216.

Show HN: 35B MoE LLM and other models locally on an old AMD crypto APU (BC250)

github.com/akandr

3 months ago

2 points

217.

Dots.llm1: open-source MoE LLM with 142B total and 14B active parameters

github.com/rednote-hilab

a year ago

2 points

218.

Every Flop Counts: Scaling 300B MoE LLMs Without Premium GPUs [pdf]

github.com/inclusionAI

a year ago

2 points

219.

Lamini Memory Tuning: near-perfect fact recall via 1M-way MoE [pdf]

github.com/lamini-ai

2 years ago

2 points

220.

Show HN: SwiftLM – Qwen Chat on iPhone, 100B+ MoE on M5 Pro 64GB (Native Swift)

github.com/SharpAI

3 months ago

1 point

221.

DirectStorage LLM Weight Streaming: 4x faster loading, MoE expert streaming

github.com/kibbyd

4 months ago

1 point

222.

Micro-Expert-Router: Running Mixtral-Class MoE Models on NVMe SSDs Without a GPU

github.com/randyap8-wq

a month ago

1 point

223.

Why Gemma-4 26B MoE works in HuggingFace but breaks in prod inference engines

github.com/maeddesg

a month ago

1 point

224.

Has anyone else hit expert homogeneity collapse in small MoE models?

github.com/eriirfos-eng

a month ago

1 point

225.

ARCHE3-7B – Sparse MoE with SmartRouter and Foundation Curriculum Training

3 months ago

OpenSynapseLabs

1 point

226.

QuantumLeap: 2.3× faster MoE inference with intelligent expert caching

github.com/MartinCrespoC

3 months ago

1 point

227.

Show HN: Adaptive-K – Cut MoE inference costs 30-50% with entropy-guided routing

github.com/Gabrobals

5 months ago

Gabrielebalsamo

1 point

228.

Show HN: LLM Inference Performance Analytic Tool for MoE Models (DeepSeek/etc.)

github.com/kevinyuan

7 months ago

1 point

229.

DeepSeek-VL2: MoE Vision-Language Models for Advanced Multimodal Understanding [pdf]

github.com/deepseek-ai

2 years ago

1 point

230.

Aria: Open Multimodal Native MoE

github.com/rhymes-ai

2 years ago

1 point

231.

Yosoro – Moe Style Markdown NoteBook

github.com/IceEnd

8 years ago

1 point

232.

Show HN: Terminal-Bench-RL: Training long-horizon terminal agents with RL

github.com/Danau5tin

a year ago

125 points

233.

Show HN: Pica – Rust-based agentic AI infrastructure (open-source)

a year ago

63 points

234.

Launch HN: General Instinct (YC P26) – Frontier models on edge devices

18 days ago

63 points

235.

Show HN: LeanRL: Fast PyTorch RL with Torch.compile and CUDA Graphs

github.com/pytorch-labs

2 years ago

53 points

236.

Show HN: KTransformers – 236B Model and 1M Context LLM Inference on Local Machines

github.com/kvcache-ai

2 years ago

20 points

237.

Show HN: Run 500B+ Parameter LLMs Locally on a Mac Mini

github.com/opengraviton

3 months ago

17 points

238.

Show HN: Lemonade: Run LLMs Locally with GPU and NPU Acceleration

github.com/lemonade-sdk

10 months ago

15 points

239.

Show HN: KTransformers: 671B DeepSeek-R1 on a Single Machine – 286 tokens/s Prefill

github.com/kvcache-ai

a year ago

14 points

240.

Show HN: OpenGraviton – Run 500B+ parameter models on a consumer Mac Mini

opengraviton.github.io

4 months ago

13 points