theredbeard.io | ksl
Lars de Ridder built Context Lens to intercept and analyze how Claude Code, Codex CLI, and Gemini CLI actually use their context windows when given the same Express.js debugging task. The differences are striking. Opus 4.6 averaged 27K tokens per run and used git history heavily but with surgical precision, while Gemini 2.5 Pro consumed 258K tokens on average, sometimes dumping hundreds of commits into a single query. Codex ignored git entirely and solved the problem in 34 seconds using ripgrep and sed. Every tool passed all 1,246 tests, so output quality was equivalent; the gap in cost and efficiency comes entirely from investigation strategy. The finding that tool-side context management is essentially nonexistent tracks with what developers building on these agents have been noticing: the models work despite the tooling, not because of it.
