Show HN: Bifrost – open-source LLM Gateway (50x lower latency than LiteLLM)

3 points

6 months ago

We built Bifrost because we found existing Python-based gateways struggled with high concurrency in production. We wanted something that treated LLM infra like high-availability software.

We ran side-by-side benchmarks against LiteLLM on a single t3.medium instance (using a mock LLM with 1.5s fixed latency) to test pure gateway overhead.

The Results:

p99 Latency: 90.72s (LiteLLM) vs 1.68s (Bifrost)

Throughput: 44 req/sec (LiteLLM) vs 424 req/sec (Bifrost)

Memory: ~3x lower usage with Bifrost (written in Go).
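To make the ratios explicit, the arithmetic behind the figures above can be sketched as follows. The only inputs are the numbers in this post. Subtracting the 1.5s mock latency to get "overhead" is an assumption: it treats the mock delay as a floor that every request pays, and the constant and function names are illustrative.

```go
package main

import "fmt"

// Figures taken from the results above; all latencies are in seconds.
const (
	mockLatency = 1.5   // fixed latency of the mock LLM
	p99LiteLLM  = 90.72 // p99 latency through LiteLLM
	p99Bifrost  = 1.68  // p99 latency through Bifrost
	rpsLiteLLM  = 44.0  // throughput through LiteLLM, req/sec
	rpsBifrost  = 424.0 // throughput through Bifrost, req/sec
)

// overhead returns the p99 latency left after removing the mock's fixed delay.
// This assumes the mock delay is paid in full by every request.
func overhead(p99 float64) float64 {
	return p99 - mockLatency
}

func main() {
	fmt.Printf("p99 latency ratio:    %.1fx\n", p99LiteLLM/p99Bifrost)
	fmt.Printf("throughput ratio:     %.1fx\n", rpsBifrost/rpsLiteLLM)
	fmt.Printf("p99 overhead LiteLLM: %.2fs\n", overhead(p99LiteLLM))
	fmt.Printf("p99 overhead Bifrost: %.2fs\n", overhead(p99Bifrost))
}
```

On these numbers the p99 ratio works out to about 54x and the throughput ratio to about 9.6x, with roughly 0.18s of p99 spent in Bifrost beyond the mock delay.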

It’s a drop-in replacement (OpenAI-compatible) designed for teams that need semantic caching, failover, and observability without the gateway overhead.
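As a rough illustration of what "drop-in" means for an OpenAI-compatible gateway, the sketch below builds a standard chat completion request where only the base URL changes. The address `http://localhost:8080/v1`, the API key, and the model name are placeholders, not documented Bifrost defaults; check the project docs for the real address and path prefix.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// message and chatRequest mirror the basic OpenAI chat completion request body.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

type chatRequest struct {
	Model    string    `json:"model"`
	Messages []message `json:"messages"`
}

// newChatRequest builds an OpenAI-style chat completion request against baseURL.
// Switching between a provider and a gateway only changes baseURL.
func newChatRequest(baseURL, apiKey, model, prompt string) (*http.Request, error) {
	body, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: prompt}},
	})
	if err != nil {
		return nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/chat/completions", bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer "+apiKey)
	return req, nil
}

func main() {
	// Hypothetical gateway address, key, and model name, for illustration only.
	req, err := newChatRequest("http://localhost:8080/v1", "sk-placeholder", "gpt-4o-mini", "Hello")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

The same pattern applies to existing OpenAI SDK clients, which generally expose a base URL setting.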

We’d love to hear your feedback.

No comments
