Lighteval
1 mention across 0 people
All mentions
Unknown speaker
Warned against · paper · 2026-04-27
“We present failure cases of symbolic evaluation in two popular frameworks, Lighteval and SimpleRL, and compare them to our approach, demonstrating clear improvements over commonly used methods.”
LLM-as-a-Judge for Math Reasoning Evaluation