Search: huggingface.co | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

511.

Nvidia releases 8B model with learned 8x KV cache compression

5 months ago

9 points

512.

Nvidia releases weights for Llama-3.1-Nemotron-70B-Instruct

2 years ago

9 points

513.

Stable Diffusion XL Inpainting model released

3 years ago

9 points

514.

Opentensor and Cerebras announce BTLM-3B-8K, a leading 3B param. language model

3 years ago

9 points

515.

Spaces ZeroGPU: Dynamic GPU Allocation for Spaces

2 years ago

9 points

516.

Perspectives for first principles prompt engineering

2 years ago

9 points

517.

ConvLLaVA: Hierarchical Backbones as Visual Encoder for Large Multimodal Models

2 years ago

9 points

518.

Argilla released Notux 8x7B - DPO fine-tune of Mixtral 8x7B

2 years ago

9 points

519.

LLM Arena. Mistral-small best open model. Gemini Pro beaten by 2 open models

3 years ago

9 points

520.

Meta-llama (Meta Llama 2)

3 years ago

9 points

521.

Summary of the Tokenizers

3 years ago

9 points

522.

Show HN: Sentiment Analysis on Encrypted Data with Homomorphic Encryption

4 years ago

9 points

523.

RunwayML fine tuned Stable Diffusion 1.5 model

4 years ago

9 points

524.

DeepSeek: Thinking with Visual Primitives [pdf]

2 months ago

9 points

525.

The Smol Training Playbook: The Secrets to Building World-Class LLMs

8 months ago

9 points

526.

Show HN: A Transformer model that preserves logical equivalence

a year ago

9 points

527.

Mistral-Large-Instruct-2411 – advanced dense Large Language Model (LLM) 123B

2 years ago

9 points

528.

MIT Researchers Unveil New Method to Improve LLM Inference Performance

2 years ago

9 points

529.

Aryn/deformable-detr-DocLayNet – open-source Layout Model

2 years ago

9 points

530.

AIMO (AI Math Olympiad) progress prize winning solution

2 years ago

9 points

531.

Mistral-7B-v0.3 released on HuggingFace

2 years ago

9 points

532.

Microsoft Phi-3 3.8B model with 128k Context

2 years ago

9 points

533.

The Stack v2: a 3B files in 600 programming languages dataset

2 years ago

9 points

534.

DeepSeek-Prover-V2-671B

a year ago

8 points

535.

NousResearch/Nous-Hermes-2-Llama-2-70B

2 years ago

8 points

536.

Gradio-Lite: Serverless Gradio Running in the Browser

3 years ago

8 points

537.

Show HN: Parley: The RPG where you Negotiate with Bandits

3 years ago

8 points

538.

8 days ago

8 points

539.

Reka Edge – 7B fast, efficient VLM (open-weights)

3 months ago

8 points

540.

Z-Image Turbo Released – 6B Parameter Text to Image Model

7 months ago

8 points