🛠 tool · by Nebius AI R&D team

SWE-rebench

1 mention from 1 person

All mentions

youtube · 2026-04-10

Recommended

“If you are using certain open-source models, maybe they may not have the same performance as Opus 4.5 or GPT 5.4 or GPT 5.3 Codex. And you should take that into account because, remember, every single company, not just the Chinese open-source companies, is incentivized to inflate their benchmarks so that they can get more attention, more downloads, more publicity. So please do bear that in mind. Always run your own benchmarks, run your own tests, test it for yourself and you see how well it perf”

Chinese AI Models Lag Western Counterparts in Novel Reasoning and Unseen Tasks ↗