Google Gemini Rate Limiting

Resolved
Feb 3, 2026 at 9:16pm UTC

Google is still resolving capacity issues on their end, but we have put a mitigation in place to resolve this for gemini-2.5-flash. Please switch to this model when using Google. If you require another model, reach out to support@vapi.ai.

We are continuing to monitor and work with the Google team to resolve the issue.

Updated
Jan 27, 2026 at 5:47pm UTC

We are still seeing rate limiting issues for Google LLMs and are looking into another fix we can implement to mitigate it.

This is likely caused by regional exhaustion of Google Vertex AI resources rather than us hitting an org-level quota.

Updated
Jan 27, 2026 at 6:50am UTC

Google has confirmed the underlying issue is resolved. We’re continuing to deploy a mitigation to ensure this doesn’t impact customers if it recurs.

Updated
Jan 26, 2026 at 4:03am UTC

Google has not resolved the issue on their side, and we have requested an updated timeline.

Our team is working on implementing fallbacks for the services impacted by the Google degradation (the query tool).

Updated
Jan 22, 2026 at 6:47pm UTC

Google is again reporting issues with the Vertex AI API that are impacting both our default and fallback Gemini clients.

Consider using a different model. We are working with the provider to resolve this and will update here.

Updated
Jan 22, 2026 at 7:30am UTC

We have confirmed with the provider that the issue is on their end. We have implemented fallbacks that should help mitigate this issue going forward.

We apologize for any disruption to service as a result of this issue.

Updated
Jan 21, 2026 at 8:11pm UTC

Google/Gemini Service Degradation - Immediate Workarounds
We're experiencing intermittent rate limiting from Google affecting several Vapi features. We're working with Google to resolve this. In the meantime, there are immediate workarounds for affected features.

Model (LLM): Gemini models may intermittently fail.
- Workaround one - switch to a different model (e.g., GPT 4.1).
- Workaround two - obtain an API key from Google and use that.
  - Vapi Dashboard → Settings → Integrations → Custom Credentials
Transcriber (STT): Gemini-based transcription may intermittently fail.
- Workaround one - switch your primary or fallback transcriber to a different model (e.g., Deepgram Nova 3).
- Workaround two - obtain an API key from Google and use that.
Voicemail detection: may intermittently fail if "provider" is set to "google".
- Workaround one - switch to the "openai" provider, or to the "twilio" provider if using Twilio telephony.
- Workaround two - turn off voicemailDetection and switch to the Voicemail tool.
Query tool: may intermittently fail since it relies on Google infrastructure.
- Workaround one (high effort) - switch to a custom knowledge base.
- Workaround two (high effort) - use a function tool to replicate the behavior.
Structured Outputs: Gemini models may intermittently fail.
- Workaround - switch to a different model provider: OpenAI or Anthropic.
Speech-to-Speech: Gemini models may intermittently fail.
- Workaround - switch to a different model provider: OpenAI.
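
For teams applying the model, transcriber, and voicemail detection workarounds across many assistants, the swap can be scripted. The following is a minimal sketch only, not the Vapi API. It assumes an assistant configuration held as a plain dictionary with "model", "transcriber", and "voicemailDetection" entries that each carry a "provider" field. The dictionary layout, the helper name apply_google_workarounds, and the identifiers "gpt-4.1", "deepgram", and "nova-3" are assumptions for illustration; check them against your own configuration before use.

```python
import copy

# Assumed replacement settings; the exact identifiers are illustrative guesses.
MODEL_FALLBACK = {"provider": "openai", "model": "gpt-4.1"}
TRANSCRIBER_FALLBACK = {"provider": "deepgram", "model": "nova-3"}


def apply_google_workarounds(assistant, uses_twilio=False):
    """Return a copy of an assistant config with Google-backed settings swapped.

    The config layout (a dict with "model", "transcriber", and
    "voicemailDetection" entries) is an assumption, not a documented schema.
    """
    updated = copy.deepcopy(assistant)  # leave the caller's config untouched

    model = updated.get("model") or {}
    if model.get("provider") == "google":
        model.update(MODEL_FALLBACK)

    transcriber = updated.get("transcriber") or {}
    if transcriber.get("provider") == "google":
        transcriber.update(TRANSCRIBER_FALLBACK)

    voicemail = updated.get("voicemailDetection") or {}
    if voicemail.get("provider") == "google":
        # "twilio" only applies when the call runs over Twilio telephony.
        voicemail["provider"] = "twilio" if uses_twilio else "openai"

    return updated
```

The helper returns a modified copy rather than editing in place, so the original configuration can be restored once Google capacity recovers.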

Created
Jan 21, 2026 at 7:40pm UTC

We are hitting rate limits again with our Google Gemini models. We are working with the vendor to resolve this issue. Please consider using another model at this time or implementing fallbacks.
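
For readers implementing fallbacks in their own code, the general pattern can be sketched as below. This is an illustration under stated assumptions, not Vapi's or Google's implementation: RateLimitError, complete_with_fallbacks, and the two stub functions are hypothetical names, and each provider is modeled as a plain callable that either returns text or raises when rate limited.

```python
class RateLimitError(Exception):
    """Hypothetical error a provider callable raises when it is rate limited."""


def complete_with_fallbacks(prompt, providers):
    """Try each (name, call) pair in order and return the first success.

    Returns a (provider_name, result) tuple. Raises RuntimeError if every
    provider in the list is rate limited.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimitError as exc:
            # Record the failure and move on to the next provider.
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))


# Stub providers standing in for real clients (assumptions for the demo).
def gemini_stub(prompt):
    raise RateLimitError("rate limited")


def other_stub(prompt):
    return f"echo: {prompt}"


name, text = complete_with_fallbacks(
    "hello", [("gemini", gemini_stub), ("other", other_stub)]
)
print(name, text)  # prints "other echo: hello"
```

Only rate-limit errors trigger the fallback here; other exceptions propagate, so genuine bugs are not masked by a silent provider switch.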