Within hours I paused an ongoing Opus 4.7 benchmark, swapped the API keys, and ran the exact same methodology on ...
OpenAI Releases GPT-5.5, a Fully Retrained Agentic Model That Scores 82.7% on Terminal-Bench 2.0 and 84.9% on GDPval ...
Some results have been hidden because they may be inaccessible to you
Show inaccessible results