Key Takeaways
- A hidden model selector in Google App v17.18.22 reveals seven previously unreported AI model options for Gemini Live voice conversations. My testing confirms they produce measurably different responses.
- Four models can access the user’s location and provide live weather data; three cannot. One model, “Capybara,” identifies itself as “Gemini 3.1 Pro” rather than the standard Flash Live model.
- The feature is currently hidden behind a server-side flag ahead of Google I/O 2026 and appears to be an internal testing tool. The infrastructure for switchable voice AI models is already in place.
May 11 Update below: I’ve tested each of the models and compared them. This article was originally published on May 9.
Google has been testing multiple unknown AI models within Gemini Live, hinting at a possible upgrade to the voice-controlled chatbot in time for Google I/O 2026 later this month. The code reveals some clues about the company’s voice AI plans and why it could matter for the millions of people who use Gemini.
While investigating an unrelated unreleased feature in the Google app code, a previously unseen settings cog button caught my eye. Enabling the button in the code revealed a new model selector menu that allows the user to change the AI model Gemini Live uses to power its voice chats.
The menu, currently hidden behind a server-side flag, reveals seven AI model options I haven’t seen before, including two with the codenames “Capybara” and “Nitrogen,” plus a specialized personalization model. The displayed menu options are as follows:
- Default
- A2A_Rev25_RC2
- A2A_Rev25_RC2_Thinking
- A2A_Rev23_P13n
- A2A_Nitrogen_Rev23
- A2A_Capybara
- A2A_Capybara_Exp
- A2A_Native_Input
Of these, “A2A_Rev25_RC2” and “A2A_Rev25_RC2_Thinking” appeared overnight on May 8, showing that Google now has two new audio-to-audio models at the Release Candidate 2 stage, nearing production readiness. The presence of a Thinking model is particularly interesting, as it suggests a variant with enhanced reasoning capabilities may soon be available.
What The Code Reveals
Currently, Gemini Live uses only one model: Gemini 3.1 Flash Live, a native input model designed to process raw audio and video streams directly. The existence of multiple new models strongly suggests that Google is trying out some alternatives internally before a public release.
No public information exists for these models, but there are a few small clues in the naming. Here, “A2A” most likely stands for Audio-to-Audio, Google’s term for models that process speech and audio directly, rather than converting them to text first.
The P13n Variant
In the screenshot above we see a model labeled “P13n,” a common abbreviation of the word “personalization” that keeps the first and last letters and replaces the 13 letters in between with a count. This hints at a specialized Gemini variant with additional personalization and behavioral features baked directly into the model.
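For readers unfamiliar with this style of abbreviation, here is a minimal Python sketch of how such a label is formed. The function name `numeronym` is my own, chosen for illustration; nothing here comes from Google's code.

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as: first letter + number of inner letters + last letter."""
    if len(word) <= 3:
        # Too short for the abbreviation to save anything; return unchanged.
        return word
    return f"{word[0]}{len(word) - 2}{word[-1]}"


print(numeronym("personalization"))       # p13n
print(numeronym("internationalization"))  # i18n
```

“Personalization” has 15 letters, so 13 sit between the “p” and the “n,” which matches the label in the menu apart from its capital “P.”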
Why This Matters
While the regular Gemini interface offers users a choice between Fast, Thinking and Pro models, Gemini Live currently offers no such option.
Switchable models would allow the company to provide a more powerful voice assistant to customers willing to pay for it, or perhaps allow users to trade Gemini Live’s snappy responses for more thoughtful ones that take a little longer.
What We Don’t Know
Neither “Capybara” nor “Nitrogen” appears in any prior Google documentation. However, terms like “Rev25” and “Exp” suggest that the company has already been through several revisions of these models and likely has both stable and experimental versions of the Capybara model under test.
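The naming clues discussed so far can be made explicit by splitting each menu label into its parts. The Python sketch below is purely illustrative: the `ModelLabel` class and `parse_label` function are hypothetical names of my own, and the reading of “Rev” as a revision number and “RC” as a release candidate number is an inference from the labels, not something Google has confirmed.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ModelLabel:
    raw: str
    prefix: Optional[str] = None             # "A2A", assumed to mean Audio-to-Audio
    revision: Optional[int] = None           # from "RevNN" (assumed revision number)
    release_candidate: Optional[int] = None  # from "RCN" (assumed release candidate)
    tags: List[str] = field(default_factory=list)  # everything else, e.g. codenames


def parse_label(label: str) -> ModelLabel:
    """Split an underscore-separated menu label into its recognizable parts."""
    parts = label.split("_")
    parsed = ModelLabel(raw=label)
    if parts[0] == "A2A":
        parsed.prefix = parts.pop(0)
    for part in parts:
        if m := re.fullmatch(r"Rev(\d+)", part):
            parsed.revision = int(m.group(1))
        elif m := re.fullmatch(r"RC(\d+)", part):
            parsed.release_candidate = int(m.group(1))
        else:
            parsed.tags.append(part)
    return parsed


# The labels exactly as they appear in the hidden menu.
MENU = [
    "Default", "A2A_Rev25_RC2", "A2A_Rev25_RC2_Thinking", "A2A_Rev23_P13n",
    "A2A_Nitrogen_Rev23", "A2A_Capybara", "A2A_Capybara_Exp", "A2A_Native_Input",
]

for entry in MENU:
    print(parse_label(entry))
```

Laid out this way, the pattern is easier to see: two labels carry revision 25 with a release candidate marker, two carry revision 23, and the Capybara entries carry no revision number at all.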
The list of available models is delivered by Google’s servers, meaning the company can add or remove models without an app update.
We don’t know at this point whether Google is about to bring model selection to Gemini Live users, or whether this is purely an internal testing tool. The model selector interface, as it stands, remains unpolished and isn’t ready for release.
My testing confirms that the selected model name is transmitted to Google’s servers when a voice session begins, but it remains unclear whether functionally different models are actually served in response.
May 11 Update: New and functionally different models confirmed! My test results reveal important distinctions between Google’s unreleased live Gemini models.
My testing confirms that switching models produces different behavior, although the exact nature of the differences, and whether they reflect distinct backend models or alternate configurations of the same model, remain unclear.
What My Testing Reveals
Since publishing this article, I’ve run a series of twelve tests on each of the new models to see whether they change the AI’s behavior. My results confirm that the models do indeed produce measurably different responses.
My most striking finding is that four of the models are able to access the user’s location when asked for a live weather update, whereas the remaining three cannot, instead prompting the user for their location before responding. This shows that, in their current implementation, some models have access to personal information and others don’t.
One model, “Capybara,” identified itself as “Gemini 3.1 Pro” rather than as “Gemini 3.1 Flash Live,” the model Google typically uses for interactive voice chats. Three of the models promised to remember personal information when asked, while three others refused.
Notably, two models, including Nitrogen, picked up on a deliberately false claim I made during testing, while the others accepted it without question. This suggests different levels of accuracy checking across the range of models.
These are more than cosmetic differences. Google appears to be road-testing distinct capabilities across its voice AI lineup ahead of Google I/O 2026.
Google I/O 2026 begins May 19. I’ll be keeping an eye on the model selector and updating this article with any developments, including whether users gain access to alternative models (as is the case with the regular Gemini app). For now, the infrastructure is in place and the list of models is growing.