SummEval
1 mention across 0 people
Unknown speaker
Recommended paper · 2026-04-17
“We present a two-pronged diagnostic toolkit applied to SummEval”
Diagnosing LLM Judge Reliability with Conformal Prediction and Transitivity Anal