GPQA
1 mention across 1 person
All mentions
“We evaluate model performance on GPQA (Rein et al. 2024)”
Prompting LLMs with Threats or Tips Shows Limited Efficacy