Cooper @peakcooper
Friday, November 28, 2025 import

Tweet

Claude Opus 4.5: full review

This is the best model release in a long, long time when it comes to programming. It blows my mind how good it is. I have not seen this big of an improvement since the original release of gpt-4-0314.

The main improvement is they've finally taught it how to 'think' correctly. It no longer makes gruesome logic errors in its thinking. Problems like "Okay, I'll run tests now. <Tests fail> Great! The tests pass." are no longer a thing. This generalizes to basically ALL logic when it comes to thinking about code - it extremely rarely, if ever, makes mistakes.

The next big milestone: It no longer writes slop code! This is huge. With Codex, you can get it to write code that works. But it writes awful code - useless functions, bad abstractions, etc. This sucks, because it works short term, but long term the model will back itself into a corner where it can no longer work with the code it wrote itself. Not the case with Opus. Not only does it write elegant code, but it also knows how to refactor slop code into non-slop code. It deeply understands the codebase and can figure out elegant solutions that aren't just 'mechanical' refactorings.

It's very autonomous and independent. When it encounters issues, it will, by itself, create minimal reproducible examples, try to bisect where the error comes from, then fix it without getting stuck in rabbit holes. Even if the error is in some unrelated part of the code -- code that it didn't even write itself!!

It also DOES EXACTLY WHAT YOU SAY, WITHOUT CUTTING CORNERS! This is huge!!! Using Codex is basically a game of whack-a-mole where it understands what you want it to do, but the task is too difficult, so it reward-hacks its way into a shit solution that you don't want. Opus actually tackles the problem and solves it properly even if it's difficult.

The long context understanding is pretty much perfect. Paired with the compaction mechanism available in Claude Code by default, you can basically have an infinitely long conversation where it understands everything inside it, with no degradation.

In terms of design, research, and coming up with novel ideas, it's better, but not quite expert-human-level. It can propose solutions that I would consider good design, but it can't quite 'think with portals' yet. Still, a good improvement over what we had before, which was basically non-existent.

All of the above I've gathered from testing it over the past few days, where the task was to write an interpreter for a language that we were designing on the fly. It's a very niche design, similar to Self and Smalltalk, except we're building the language inside the language itself. This leads to extremely difficult scenarios where you're trying to define how functions work -- inside the language -- when you don't have functions yet! And it still does a magnificent job. Sometimes, I don't even fully understand what I'm asking it to do, but Opus does, and it does a good job.

TL;DR: It's the Sonnet 3.5 of 2025. Try it. Do it now.