Benchmark Model - Search News

Google’s Latest Gemini 3.1 Pro Model Is a Benchmark Beast

Google just released its most capable Gemini 3.1 Pro AI model that beats all frontier models on Humanity's Last Exam and ...

Google debuts Gemini 3.1 Pro: New frontier model sets benchmark records

Google has unveiled Gemini 3.1 Pro, an upgraded AI model that outperforms its predecessor and competitors on major logic and ...

Decrypt

OpenAI Says Benchmark Used to Measure AI Coding Skill Is 'Contaminated'—Here's Why

OpenAI wants to retire the leading AI coding benchmark—and the reasons reveal a deeper problem with how the whole industry measures itself.

9don MSN

Google releases Gemini 3.1 Pro: Benchmark performance, how to try it

Google says that its most advanced thinking model yet outperforms Claude and ChatGPT on Humanity's Last Exam and other key ...

VentureBeat

Microsoft’s GRIN-MoE AI model takes on coding and math, beating competitors in key benchmarks

Join our daily and weekly newsletters for the latest updates and exclusive content on industry-leading AI coverage. Learn More Microsoft has unveiled a groundbreaking artificial intelligence model, ...

Business Wire

New MLPerf Training and HPC Benchmark Results Showcase 49X Performance Gains in 5 Years

SAN FRANCISCO--(BUSINESS WIRE)--Today, MLCommons® announced new results from two industry-standard MLPerf™ benchmark suites: MLPerf Training v3.1 The MLPerf Training benchmark suite comprises full ...

11don MSN

Anthropic releases Claude Sonnet 4.6: Benchmark performance, how to try it

Anthropic's latest flagship model, Claude Sonnet 4.6, is out now.

insideHPC

MLPerf Training and HPC Benchmark Show 49X Performance Gains in 5 Years

Today, MLCommons announced new results from two MLPerf benchmark suites: the MLPerf Training v3.1 suite, which measures the performance of training machine learning models; and the MLPerf HPC v.3.0 ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results