Why SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
10 points by tedsanders 4 months ago