COMPOSITE-STEM
1 mention across 1 person
All mentions
Wes Roth
Recommended paper · 2026-04-18
“To help address this gap, we introduce COMPOSITE-STEM, a benchmark of 70 expert-written tasks in physics, biology, chemistry, and mathematics, curated by doctoral-level researchers... All tasks are open-sourced with contributor permission to support reproducibility and to promote additional research towards AI's acceleration of scientific progress in these domains.”
COMPOSITE-STEM: A New Benchmark for Evaluating AI in Scientific Discovery