US startup Anthropic on Monday announced the launch of its new generative artificial intelligence model, Claude Sonnet 4.5, which it says is the ...
MMLU-Pro holds steady at 85.0 and AIME 2025 improves slightly to 89.3, while GPQA-Diamond dips from 80.7 to 79.9. Coding and agent benchmarks tell a similar story, with Codeforces ratings rising from ...
A federal judge has agreed to temporarily suspend the Trump administration's plan to eliminate hundreds of jobs at the agency ...