🛠 tool

SimpleRL

1 mention across 0 people

All mentions

Unknown speaker

paper · 2026-04-27

Warned against

“We present failure cases of symbolic evaluation in two popular frameworks, Lighteval and SimpleRL, and compare them to our approach, demonstrating clear improvements over commonly used methods.”

LLM-as-a-Judge for Math Reasoning Evaluation