GPT-5 is Good, Actually: The Agony and Ecstasy of Public Benchmarks

An attempt to explain why benchmarks are either bad or secret, and why the bar charts don’t matter so much.

August 17, 2025 · 17 min · 3488 words · Shane Caldwell