🛠 tool

Group Relative Policy Optimization (GRPO)

1 mention across 0 people

All mentions

Unknown speaker

paper · 2026-04-10

Mixed

“We systematically study this phenomenon across seven challenging real-world spatial reasoning benchmarks and find that it affects contemporary MRMs such as ViGoRL-Spatial, TreeVGR as well as our own models trained with standard Group Relative Policy Optimization (GRPO).”

Faithful GRPO: Enhancing Visual Spatial Reasoning in Multimodal Language Models