Search: baseten.co | Heykuki News

Heykuki News

Top New Best Ask Show Jobs

1.

High performance client for Baseten.co

github.com/basetenlabs

a year ago

7 points

2.

Show HN: Baseten – Build ML-powered applications

4 years ago

112 points

3.

Show HN: ChatLLaMA – A ChatGPT style chatbot for Facebook's LLaMA

chatllama.baseten.co

3 years ago

402 points

4.

Running GPT-OSS-120B at 500 tokens per second on Nvidia GPUs

10 months ago

247 points

5.

A guide to open-source LLM inference and performance

3 years ago

113 points

6.

DALL-E Mini – Generate images from a text prompt

4 years ago

52 points

7.

How we got Stable Diffusion XL inference to under 2 seconds

3 years ago

51 points

8.

Show HN: Free Stable Diffusion 2.0 hosted interface

4 years ago

25 points

9.

BaseTen: The fastest way to build ML-powered applications

5 years ago

20 points

10.

Show HN: Fine-tune generative models in 1 line of code

blueprint.baseten.co

3 years ago

16 points

11.

Show HN: Baseten Chains – Framework and SDK for Multi-Model AI Products

2 years ago

mikejulietbravo

9 points

12.

The Math Behind TurboQuant

3 months ago

8 points

13.

Hosted Stable Diffusion Demo

4 years ago

7 points

14.

Serving four million Riffusion requests in two days

4 years ago

5 points

15.

Try it yourself: Speech to text with Whisper

4 years ago

5 points

16.

How BaseTen is using “docs as code”

blog.baseten.co

4 years ago

5 points

17.

SDXL inference in under 2 seconds

3 years ago

3 points

18.

How We Built the Fastest Kimi K2.5 on Artificial Analysis

4 months ago

3 points

19.

Deploying Stable Diffusion in Production Using Truss

4 years ago

3 points

20.

Open Source Inference Engine Baseten Raises $40M from IVP, Spark and Greylock

2 years ago

mikejulietbravo

2 points

21.

Faster Mixtral inference with TensorRT-LLM and quantization

2 years ago

2 points

22.

Inference Engineering

4 months ago

2 points

23.

Show HN: Inference Engineering

4 months ago

2 points

24.

How to double tokens per second for Llama 3 with Medusa

2 years ago

2 points

25.

Show HN: Automatically Build Nvidia TRT-LLM Engines

2 years ago

mikejulietbravo

2 points

26.

FP8: Efficient model inference with 8-bit floating point numbers

2 years ago

2 points

27.

Code generation interactive demo (Salesforce Codegen mono 2B)

4 years ago

2 points

28.

Working at an early-stage company as an early-stage engineer

blog.baseten.co

5 years ago

2 points

29.

Inferless Joins Baseten

4 months ago

1 point

30.

Continual learning and the post monolith AI era

4 months ago

1 point