absorb.md

COMPOSITE-STEM

1 mention from 1 person

Wes Roth
paper · 2026-04-18
Recommended

To help address this gap, we introduce COMPOSITE-STEM, a benchmark of 70 expert-written tasks in physics, biology, chemistry, and mathematics, curated by doctoral-level researchers... All tasks are open-sourced with contributor permission to support reproducibility and to promote additional research towards AI's acceleration of scientific progress in these domains.

COMPOSITE-STEM: A New Benchmark for Evaluating AI in Scientific Discovery