I put Claude, GPT, Gemini, and Grok in an arena and let them fight it out. Each model gets the full game state and decides how to survive: move, attack, form alliances, or betray. Every decision comes from the model's API; nothing is scripted.
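A minimal sketch of how a model's reply could be turned into a legal action, assuming the models answer in JSON. The `GameState` and `Action` shapes, the `wait` fallback, and the one-step move limit are all hypothetical; the post does not describe the real schema or rules.

```typescript
// Hypothetical shapes: the real game state and action schema are not given in the post.
type Fighter = { id: string; hp: number; x: number; y: number; allies: string[] };
type GameState = { turn: number; fighters: Fighter[] };

type Action =
  | { kind: "move"; dx: number; dy: number }
  | { kind: "attack"; target: string }
  | { kind: "ally"; target: string }
  | { kind: "betray"; target: string }
  | { kind: "wait" };

// Turn a model's raw text reply into a legal action.
// Anything unparseable or illegal becomes "wait" (an assumed fallback).
function parseAction(raw: string, state: GameState, self: string): Action {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { kind: "wait" };
  }
  if (typeof data !== "object" || data === null) return { kind: "wait" };

  const fields = data as Record<string, unknown>;
  const kind = fields.kind;
  const target = fields.target;
  const dx = fields.dx;
  const dy = fields.dy;

  // Only living fighters other than the caller are valid targets.
  const targets = new Set(
    state.fighters.filter((f) => f.hp > 0 && f.id !== self).map((f) => f.id),
  );

  if (kind === "move" && typeof dx === "number" && typeof dy === "number") {
    // Clamp to one step per axis so a reply cannot teleport.
    const step = (n: number) => Math.max(-1, Math.min(1, Math.trunc(n)));
    return { kind: "move", dx: step(dx), dy: step(dy) };
  }
  if (
    (kind === "attack" || kind === "ally" || kind === "betray") &&
    typeof target === "string" &&
    targets.has(target)
  ) {
    return { kind, target };
  }
  return { kind: "wait" };
}
```

Validating at this boundary keeps the decisions unscripted while making sure a malformed reply costs the model a turn instead of crashing the battle.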
The first battle ran today. Gemini won by allying with GPT early, then backstabbing it at the perfect moment. Claude tried to play it safe and got eliminated. The models play very differently, and it's fun to watch.
The stack is React + Canvas on the frontend and Bun + Hono on the backend. There is no database: battle data is JSON committed to git. Each model talks through its native SDK (Anthropic, OpenAI, Google, xAI). A new battle runs automatically every day.
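A rough sketch of how the per-provider SDKs and the JSON-in-git storage could fit together. Everything named here is an assumption for illustration: the `ModelClient` interface, the record shapes, and the `battles/<date>.json` layout are not from the project, and the real SDK calls would live inside each `decide` implementation.

```typescript
// Hypothetical interface: each provider SDK would sit behind one of these.
interface ModelClient {
  name: string;
  decide(stateJson: string): Promise<string>;
}

// Assumed record shapes for the battle log.
type TurnRecord = { turn: number; model: string; reply: string };
type BattleRecord = { date: string; turns: TurnRecord[] };

// Ask every client for its reply to the same state, one after another.
async function runTurn(
  turn: number,
  state: object,
  clients: ModelClient[],
): Promise<TurnRecord[]> {
  const stateJson = JSON.stringify(state);
  const records: TurnRecord[] = [];
  for (const client of clients) {
    let reply: string;
    try {
      reply = await client.decide(stateJson);
    } catch {
      reply = ""; // a failed API call is recorded as an empty reply
    }
    records.push({ turn, model: client.name, reply });
  }
  return records;
}

// One file per day, named by UTC date (the directory layout is assumed).
function battlePath(date: Date): string {
  return `battles/${date.toISOString().slice(0, 10)}.json`;
}

// Pretty-printed JSON with a trailing newline keeps git diffs readable.
function serializeBattle(record: BattleRecord): string {
  return JSON.stringify(record, null, 2) + "\n";
}
```

With one file per day, the git history doubles as the battle archive, and the frontend can replay a battle by fetching a single static JSON file.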