bhusalmanish.com.np | ksl
Manish Bhusal makes a practitioner’s case for why Claude keeps winning developer preference despite competitors regularly topping coding benchmarks. His core argument is that raw code generation accounts for maybe 40% of what matters – the rest is file selection, multi-step task coherence, knowing when to stop, and not breaking surrounding code. Gemini handles isolated tasks well but tends to loop and lose context mid-workflow. Codex is closing the gap on agentic work. Anthropic’s edge, Bhusal argues, comes from optimizing specifically for the process of coding rather than benchmark-shaped problem solving, while Google spreads its training focus across too many use cases. The split between benchmark performance and daily usability keeps widening across every major model comparison developers actually run.
