Reinforcement Learning Course

News

Microsoft’s new AI framework trains powerful reasoning models with a fraction of the cost

The rStar2-Agent framework boosts a 14B model to outperform a 671B giant, offering a path to state-of-the-art AI without ...

SJTU and ByteDance Join Forces to Launch RhymeRL: 2.6x Improvement in Reinforcement Learning Training Speed!

This similarity primarily arises from mainstream RL algorithms such as PPO/GRPO, which use gradient clipping mechanisms to ensure training stability. This mechanism smooths the model's evolutionary ...

19h

Conquering the 'Slowest Link' in Reinforcement Learning! Joint Efforts of Shanghai Jiao Tong University and ByteDance Boost RL Training Speed by 2.6 Times

However, behind this competition, a huge bottleneck quietly limits the speed of all players—compared to pre-training and ...

11d

CoreWeave to Acquire OpenPipe, Leader in Reinforcement Learning

CoreWeave, Inc. (NASDAQ: CRWV), the AI Hyperscaler™, today announced a definitive agreement to acquire OpenPipe Inc, a ...

The American Bazaar, 3d

Inside Thinking Machines Lab: Murati’s $12 billion AI startup tackles reproducibility

Mira Murati’s Thinking Machines Lab debuts with a $2B-backed project tackling nondeterminism in AI, aiming to deliver ...

Geeky Gadgets, 4mon

Why Reinforcement Learning Could Be AI’s Biggest Flaw Yet

What if the very techniques we rely on to make AI smarter are actually holding it back? A new study has sent shockwaves through the AI community by challenging the long-held belief that reinforcement ...

11d on MSN

CoreWeave acquires agent-training startup OpenPipe

CoreWeave hopes the YC-backed startup will help it expand up the stack and cash in on enterprises developing AI agents.
