My LLM optimization loop reward-hacked its own benchmark (and other lessons) [pdf]
github.com/CodeReclaimers | 1 point | CodeReclaimers | a month ago