blog.google | ksl
Google released Gemini 3.1 Flash Live in preview, a multimodal voice model built for low-latency audio and video streaming with tool use for AI agents. It scored 90.8% on ComplexFuncBench Audio, nearly 20% above the previous generation, and set a record on Audio MultiChallenge. The model handles over 90 languages, filters background noise from speech more reliably, and doubles conversation context length in Gemini Live on Android and iOS. Google is using it to roll out Search Live to more than 200 countries. The push toward voice-first interfaces is accelerating across all major labs, with OpenAI, Anthropic, and Google each shipping dedicated audio models within the same quarter. Latency and noise robustness are becoming the benchmarks that matter.
