Anthropic just made headlines with Claude Opus 4.5, the first AI to score above 80% on SWE-bench Verified, beating even the top human engineers on this coding benchmark.
It also outperforms GPT-5.1 and Google Gemini 3 Pro in side-by-side tests for automated programming. Beyond the scores, Opus 4.5 rolls out safer, longer-running agents and new workflows for enterprise. Is this the true arrival of AI that can out-code us, and what does it mean for developers?