Google Gemini Rate Limiting

Resolved
Feb 03 at 01:16pm PST

Google is still resolving capacity issues on their end, but we have put a mitigation in place that resolves this for gemini-2.5-flash. Please switch to this model when using Google. If you require another model, reach out to support@vapi.ai.

We are continuing to monitor and work with the Google team to resolve the issue.

Updated
Jan 27 at 09:47am PST

We are still seeing rate limiting issues for Google LLMs and are looking into another fix we can implement to mitigate it.

This is likely caused by regional exhaustion of Google Vertex AI resources rather than us hitting an org-level quota.

Updated
Jan 26 at 10:50pm PST

Google has confirmed the underlying issue is resolved. We’re continuing to deploy a mitigation to ensure this doesn’t impact customers if it recurs.

Updated
Jan 25 at 08:03pm PST

Google has not resolved the issue on their side. We have requested an updated timeline.

Our team is working on implementing fallbacks for the services impacted by the Google degradation (the query tool).

Updated
Jan 22 at 10:47am PST

Google is again reporting issues with the Vertex AI API that are impacting both our default and fallback Gemini clients.

Consider using a different model. We are working with the provider to resolve this and will update here.

Updated
Jan 21 at 11:30pm PST

We have confirmed with the provider that the issue is on their end. We have implemented fallbacks that should help mitigate this issue going forward.

We apologize for any disruption to service as a result of this issue.

Updated
Jan 21 at 12:11pm PST

Google/Gemini Service Degradation - Immediate Workarounds
We're experiencing intermittent rate limiting from Google affecting several Vapi features. We're working with Google to resolve this. In the meantime, there are immediate workarounds for affected features.

Model (LLM): Gemini models may intermittently fail.
- Workaround one - switch to a different model (e.g., GPT 4.1).
- Workaround two - obtain an API key from Google and use that.
  - Vapi Dashboard → Settings → Integrations → Custom Credentials
Transcriber (STT): Gemini-based transcription may intermittently fail.
- Workaround one - switch your primary or fallback transcriber to a different model (e.g., Deepgram Nova 3).
- Workaround two - obtain an API key from Google and use that.
Voicemail detection: may intermittently fail if "provider" is set to "google".
- Workaround one - switch to the "openai" or "twilio" provider (if using Twilio telephony).
- Workaround two - turn off voicemailDetection and switch to the Voicemail tool.
Query tool: may intermittently fail since it relies on Google infrastructure.
- Workaround one (high effort) - switch to a custom knowledgebase.
- Workaround two (high effort) - use a function tool to replicate the behavior.
Structured Outputs: Gemini models may intermittently fail.
- Workaround - switch to a different model provider: OpenAI or Anthropic.
Speech-to-Speech: Gemini models may intermittently fail.
- Workaround - switch to a different model provider: OpenAI.
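
Because the failures listed above are intermittent, callers who make their own requests to a rate-limited provider might also retry with exponential backoff before giving up. The following is a minimal sketch of that general pattern, not Vapi's or Google's implementation. The names `RateLimited` and `retry_with_backoff` are hypothetical, and `call` stands in for whatever request function your own code uses. Retrying is unlikely to help during prolonged capacity exhaustion, so it complements the workarounds above rather than replacing them.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error: the provider rejected the request due to rate limiting."""


def retry_with_backoff(
    call: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on RateLimited with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the rate-limit error
            # Double the delay on each attempt; jitter spreads out retries
            # so many clients do not hit the provider at the same moment.
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)
            sleep(delay)
    raise AssertionError("unreachable")


if __name__ == "__main__":
    attempts = {"count": 0}

    def flaky_request() -> str:
        # Stand-in for a real request: fails twice, then succeeds.
        attempts["count"] += 1
        if attempts["count"] < 3:
            raise RateLimited("429")
        return "ok"

    print(retry_with_backoff(flaky_request, base_delay=0.01))
```

The `sleep` parameter is injectable only so the delay can be replaced in tests. In real use the default `time.sleep` applies.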

Created
Jan 21 at 11:40am PST

We are hitting rate limits again with our Google Gemini models. We are working with the vendor to resolve this issue. Please consider using another model at this time or implementing fallbacks.
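
For readers implementing fallbacks in their own client code, the idea can be sketched as trying providers in order and moving to the next one when a request is rate limited. This is a minimal illustration under stated assumptions, not Vapi's fallback mechanism. `RateLimitError`, `call_with_fallback`, and the two stub functions are hypothetical stand-ins for real provider clients.

```python
from typing import Callable, Sequence, Tuple


class RateLimitError(Exception):
    """Hypothetical error: a provider rejected the request due to rate limiting."""


def call_with_fallback(
    prompt: str,
    providers: Sequence[Tuple[str, Callable[[str], str]]],
) -> Tuple[str, str]:
    """Try each (name, call) pair in order; return (name, reply) from the first success."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimitError as exc:
            # Record the failure and fall through to the next provider.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


def gemini_stub(prompt: str) -> str:
    # Stand-in for a Gemini request that is currently rate limited.
    raise RateLimitError("429 resource exhausted")


def openai_stub(prompt: str) -> str:
    # Stand-in for a request to a different provider that succeeds.
    return f"echo: {prompt}"


if __name__ == "__main__":
    name, reply = call_with_fallback(
        "hello", [("google", gemini_stub), ("openai", openai_stub)]
    )
    print(name, reply)  # prints "openai echo: hello"
```

Only rate-limit errors trigger the fallback here. Other exceptions propagate, so unrelated bugs are not masked by silently switching providers.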