prompt-to-proof is an open-source toolkit for two tasks: (1) measuring LLM streaming latency and throughput, and (2) running a small, reproducible code eval, with hash-chained receipts you can verify. It targets OpenAI-style /chat/completions endpoints, so it works with OpenAI or with a local vLLM or llama.cpp server.
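
To make "hash-chained receipts you can verify" concrete, here is a minimal sketch of the general technique. It is an illustration under stated assumptions, not the toolkit's actual receipt format: the field names (`payload`, `prev_hash`, `hash`), the all-zero genesis value, and the helpers `append_receipt` and `verify_chain` are all hypothetical. Each receipt stores the SHA-256 of its own payload combined with the previous receipt's hash, so editing any earlier record invalidates everything after it.

```python
import hashlib
import json

# Assumed placeholder "previous hash" for the first receipt in a chain.
GENESIS = "0" * 64


def _digest(payload: dict, prev_hash: str) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the same
    # payload always serializes, and therefore hashes, identically.
    body = json.dumps(
        {"payload": payload, "prev_hash": prev_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_receipt(chain: list, payload: dict) -> dict:
    # Link the new receipt to the hash of the last one (or to GENESIS).
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    receipt = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(payload, prev_hash),
    }
    chain.append(receipt)
    return receipt


def verify_chain(chain: list) -> bool:
    # Recompute every hash from the start; any edited payload or
    # broken link makes verification fail.
    prev_hash = GENESIS
    for receipt in chain:
        if receipt["prev_hash"] != prev_hash:
            return False
        if receipt["hash"] != _digest(receipt["payload"], prev_hash):
            return False
        prev_hash = receipt["hash"]
    return True
```

With this sketch, appending two receipts and calling `verify_chain` returns `True`; changing a field in the first receipt's payload afterwards makes it return `False`, because the stored hash no longer matches the recomputed one.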