📄 paper · by Yang et al.

Yang et al. 2024

1 mention across 0 people

All mentions

Unknown speaker

paper · 2026-04-21

Recommended

“Cross-benchmark validation on 18 models using MMLU with verbalized confidence and on external data from Yang et al. (2024) confirms the screen transfers across benchmarks and probe formats.”

Borrowing Clinical Psychometrics to Validate LLM Confidence Signals Before Use