News

In the Large Language Models (LLM) RL training, value-free methods like GRPO and DAPO have shown great effectiveness. The true potential lies in value-based methods, which allow more precise credit ...
Large language models are built on transformer architectures and power applications like chat, code generation, and search, but their growing scale with billions of parameters makes efficient ...
LLMs have revolutionized artificial intelligence, transforming various applications across industries. Autoregressive (AR) models dominate current text generation, with leading systems like GPT-4, ...
Robots are increasingly being developed for home environments, specifically to enable them to perform daily activities like cooking. These tasks involve a combination of visual interpretation, ...
Tactile sensing is a crucial modality for intelligent systems to perceive and interact with the physical world. The GelSight sensor and its variants have emerged as influential tactile technologies, ...
Large language models are often praised for their linguistic fluency, but a growing area of focus is enhancing their reasoning ability—especially in contexts where complex problem-solving is required.
LLMs have demonstrated strong general-purpose performance across various tasks, including mathematical reasoning and automation. However, they struggle in domain-specific applications where ...
Marine robotic platforms support various applications, including marine exploration, underwater infrastructure inspection, and ocean environment monitoring. While reliable perception systems enable ...
OpenAI’s GPT-4o represents a new milestone in multimodal AI: a single model capable of generating fluent text and high-quality images in the same output sequence. Unlike previous systems (e.g., ...
Large Multimodal Models (LMMs) have demonstrated remarkable capabilities when trained on extensive visual-text paired data, advancing multimodal understanding tasks significantly. However, these ...
Reinforcement Learning RL has become a widely used post-training method for LLMs, enhancing capabilities like human alignment, long-term reasoning, and adaptability. A major challenge, however, is ...
While the outputs of large language models (LLMs) appear coherent and useful, the underlying mechanisms guiding these behaviors remain largely unknown. As these models are increasingly deployed in ...