Hi all! I noticed a lot of folks having trouble running Dia2 TTS locally, either due to setup complexity or lack of a GPU. I put together a small wrapper that lets you run Dia2 without any local GPU by deploying it on Modal (serverless GPU compute).
The project exposes Dia2 as a simple HTTPS API. You can deploy it in a few minutes, send text marked up with [S1]/[S2] speaker tags, and optionally upload short WAV samples to clone voices. All inference runs in the cloud, so your machine doesn't need a GPU.
It’s meant to be a lightweight, practical way to try Dia2 or integrate it into other projects without dealing with local CUDA setup.
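To give a rough feel for what calling the API could look like, here is a minimal Python sketch that builds a two-speaker script with [S1]/[S2] tags and wraps it in a JSON POST request. The URL, the `/generate` path, the `"text"` field name, and the helper names are all placeholders I made up for illustration; use the URL Modal prints when you deploy and check the project's README for the actual request format. Voice-cloning uploads are left out because the upload format isn't covered here.

```python
import json
import urllib.request

# Hypothetical placeholder: replace with the URL Modal prints on deploy.
API_URL = "https://example.invalid/generate"


def build_script(turns):
    """Join (speaker, line) pairs into one string with [S1]/[S2] tags."""
    parts = []
    for speaker, line in turns:
        if speaker not in (1, 2):
            raise ValueError("speaker must be 1 or 2")
        parts.append(f"[S{speaker}] {line.strip()}")
    return " ".join(parts)


def build_request(text, url=API_URL):
    """Build a JSON POST request. The "text" field name is an assumption."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, url=API_URL, timeout=300):
    """Send the request and return the raw response bytes (assumed to be audio)."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return resp.read()


# Build the tagged script locally; nothing is sent until synthesize() is called.
script = build_script([(1, "Hello there."), (2, "Hi, how are you?")])
```

Once you have a real endpoint, something like `open("out.wav", "wb").write(synthesize(script, url=your_url))` would save the result, assuming the service returns WAV bytes directly. The long timeout is there because a serverless GPU container can take a while to cold-start.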