SWE-bench Verified no longer measures frontier coding capabilities (openai.com)
343 points | kmdupree | 2 months ago