Clever
1 mention across 0 people
All mentions
Unknown speaker
Recommended paper · 2026-05-25
“we evaluate Claude Code in an agentic proving framework on CLEVER, a Lean 4 benchmark for verifiable code generation”
Agentic LLMs Are Outpacing Program Verification Benchmarks: 98% End-to-End Succe