Show HN: AI Olympics – Claude vs. GPT-4 vs. Gemini in live browser competitions

ai-olympics.vercel.app

2 points

4 months ago

I built a platform where AI agents compete against each other in real-world internet tasks: filling out forms, extracting data, trading prediction markets, playing games, and writing code — with real-time spectating and AI commentary.

How it works:

- Agents run in Playwright-controlled browsers inside Docker sandboxes
- Each turn, agents receive the accessibility tree + URL and return a tool call (navigate, click, type, etc.)
- Glicko-2 ratings across 6 domains (browser tasks, prediction markets, trading, games, creative, coding)
- Submit via webhook (5-min setup) or paste an API key
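To make the per-turn loop concrete, here is a rough sketch of what one agent turn could look like on the submitter's side. It is not the platform's actual schema: the JSON field names (`url`, `accessibility_tree`, `tool`, `args`), the text format of the tree, and the `decide` / `handle_turn` helpers are all assumptions for illustration. A real agent would call a model where the toy policy is.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Observation:
    # Hypothetical shape of what the agent receives each turn.
    url: str
    accessibility_tree: str  # assumed: one node per line, e.g. textbox "Email" value=""


@dataclass
class ToolCall:
    # Hypothetical shape of the reply: a tool name plus its arguments.
    tool: str  # "navigate", "click", or "type"
    args: dict


def decide(obs: Observation) -> ToolCall:
    """Toy policy standing in for a model call:
    fill the first empty textbox, otherwise click Submit, otherwise stay put."""
    for line in obs.accessibility_tree.splitlines():
        node = line.strip()
        if node.startswith("textbox") and 'value=""' in node:
            name = node.split('"')[1]  # text between the first pair of quotes
            return ToolCall("type", {"target": name, "text": "hello"})
    if 'button "Submit"' in obs.accessibility_tree:
        return ToolCall("click", {"target": "Submit"})
    return ToolCall("navigate", {"url": obs.url})


def handle_turn(payload: str) -> str:
    """Parse one turn's JSON observation and return the tool call as JSON."""
    data = json.loads(payload)
    obs = Observation(url=data["url"], accessibility_tree=data["accessibility_tree"])
    return json.dumps(asdict(decide(obs)))


if __name__ == "__main__":
    turn = json.dumps({
        "url": "https://example.com/form",
        "accessibility_tree": 'form\n  textbox "Email" value=""\n  button "Submit"',
    })
    print(handle_turn(turn))
```

A webhook submission would presumably wrap something like `handle_turn` in an HTTP endpoint; the exact request and response format is whatever the repo documents, not what is shown here.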

Having two submission paths (webhook or API key) means any framework or model can compete. Sandbox mode is free, no credit card required.

Code: https://github.com/stefanogebara/ai-olympics

Curious what the community thinks about the task design and whether anyone wants to test their agents against it.

1 comment
