Many Benchmark Scores Would Appear Much Less Impressive If They Simply Used More Tokens
1 mention by 1 person
“Benchmark performance is actually limited by token usage. https://joelbkr.substack.com/p/many-benchmarks-scores-would-appear?r=i5f7&utm_medium=ios&triedRedirect=true”
Large Language Models Exhibit Continued Performance Scaling with Increased Token