blog.google | ksl
Google released Gemini 3.1 Pro with a verified 77.1% on ARC-AGI-2, more than doubling its predecessor’s reasoning performance in roughly three months. SWE-Bench Verified lands at 80.6% and GPQA Diamond at 94.3%, all at the same price as Gemini 3 Pro, which makes it effectively a free upgrade for every API user. The .1 naming is new for Google, which previously used .5 for mid-cycle refreshes, and the faster cadence points to a tighter release loop. Deep Think reasoning, once limited to a separate model variant, now ships inside the main Pro tier. OpenAI and Anthropic have been on similar rapid-iteration cycles, but Google matching frontier benchmarks without a price increase puts real pressure on how long premium reasoning can justify premium pricing.
