Search: huggingface.co | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

331.

4-Bit Quantization and QLoRA

3 years ago

41 points

332.

Qwen/Qwen3.6-27B · Hugging Face

2 months ago

41 points

333.

BLOOMChat, a 176B parameter, Multi-lingual, fine tuned chat

3 years ago

40 points

334.

What's Going on with the Open LLM Leaderboard?

3 years ago

40 points

335.

DeepSeek-R1-Distill-Qwen-1.5B Surpasses GPT-4o in certain benchmarks

a year ago

39 points

336.

Show HN: Deep Learning Personas

personas.huggingface.co

9 years ago

39 points

337.

Kai-Fu Lee's Yi-34B uses exactly Llama's architecture except for 2 tensors renamed

3 years ago

vissidarte_choi

39 points

338.

Continuous batching (2025)

4 months ago

39 points

339.

Fully autonomous AI agents should not be developed

a year ago

38 points

340.

Zephyr 7B – Mistral Finetune that responds like ChatGPT

3 years ago

37 points

341.

Whisper Jax: Transcribe 1 hour of audio in under 15 seconds

3 years ago

36 points

342.

Qwen3-235B-A22B-Instruct-2507

a year ago

36 points

343.

MistralLite by Amazon Web Services

3 years ago

34 points

344.

The Ultra-Scale Playbook: Training LLMs on GPU Clusters

a year ago

33 points

345.

Mixtral-8x22B on HuggingFace

2 years ago

33 points

346.

Qwen3-Coder-30B-A3B-Instruct

a year ago

32 points

347.

General OCR Theory: Towards OCR-2.0 via a Unified End-to-End Model

2 years ago

31 points

348.

Anatomy of BoltzGen

6 months ago

31 points

349.

Zephyr 141B, a Mixtral 8x22B fine-tune, is now available in Hugging Chat

2 years ago

30 points

350.

2 years ago

30 points

351.

Reachy Mini – The Open-Source Robot for Today's and Tomorrow's AI Builders

a year ago

30 points

352.

Mistral 7B v0.2

2 years ago

29 points

353.

Mixture of Experts Explained

3 years ago

29 points

354.

TinyLlama at 2T of 3T

3 years ago

29 points

355.

Video2Game: Real-Time, Interactive, Realistic Environment from a Single Video

2 years ago

28 points

356.

Real-Time Latent Consistency Model

3 years ago

27 points

357.

grok-2 on Hugging Face

10 months ago

27 points

358.

Language Modeling Is Compression

3 years ago

27 points

359.

Llama-3.2-3B-Instruct-uncensored

2 years ago

26 points

360.

DeepSeek-V4 Technical Report [pdf]

2 months ago

26 points