Google Gemini Rate Limiting
Resolved
Feb 03 at 01:16pm PST
Google is still resolving capacity issues on their end, but we have put a mitigation in place to resolve this for gemini-2.5-flash. Please switch to this model when using Google; if you require another model, reach out to support@vapi.ai.
We are continuing to monitor and work with the Google team toward a full resolution.
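For reference, a minimal sketch of switching an existing assistant over to gemini-2.5-flash via the API is below. The endpoint path, field names, and placeholder values (ASSISTANT_ID, VAPI_API_KEY) are assumptions based on Vapi's public API conventions, so please confirm them against the current docs before use.

```typescript
// Minimal sketch: point an assistant's LLM at gemini-2.5-flash.
// Assumptions: PATCH https://api.vapi.ai/assistant/{id} with a Bearer key,
// and a model object shaped as { provider, model }. The placeholders below
// are hypothetical and must be replaced with your own values.
const ASSISTANT_ID = "YOUR_ASSISTANT_ID";
const VAPI_API_KEY = process.env.VAPI_API_KEY ?? "";

async function switchToGeminiFlash(): Promise<void> {
  const res = await fetch(`https://api.vapi.ai/assistant/${ASSISTANT_ID}`, {
    method: "PATCH",
    headers: {
      Authorization: `Bearer ${VAPI_API_KEY}`,
      "Content-Type": "application/json",
    },
    // Move the assistant's model to the mitigated Gemini variant.
    body: JSON.stringify({
      model: { provider: "google", model: "gemini-2.5-flash" },
    }),
  });
  if (!res.ok) throw new Error(`Assistant update failed: ${res.status}`);
}

switchToGeminiFlash().catch(console.error);
```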
Updated
Jan 27 at 09:47am PST
We are still seeing rate-limiting issues for Google LLMs and are looking into another fix we can implement to mitigate them.
This is likely caused by regional exhaustion of Google Vertex AI resources rather than by us hitting an org-level quota.
Updated
Jan 26 at 10:50pm PST
Google has confirmed the underlying issue is resolved. We’re continuing to deploy a mitigation to ensure this doesn’t impact customers if it recurs.
Updated
Jan 25 at 08:03pm PST
Google has not yet resolved the issue on their side; we have requested an updated timeline.
Our team is working on implementing fallbacks for the services impacted by the Google degradation (the query tool).
Updated
Jan 22 at 10:47am PST
Google is again reporting issues with the Vertex AI API that are impacting both our default and fallback Gemini clients.
Consider using a different model. We are working with the provider to resolve this and will update here.
Updated
Jan 21 at 11:30pm PST
We have confirmed with the provider that the issue is on their end. We have implemented fallbacks that should help mitigate this issue going forward.
We apologize for any disruption to service as a result of this issue.
Updated
Jan 21 at 12:11pm PST
Google/Gemini Service Degradation - Immediate Workarounds
We're experiencing intermittent rate limiting from Google that affects several Vapi features. We're working with Google to resolve this. In the meantime, there are immediate workarounds for the affected features; a rough configuration sketch for the most common switches follows the list below.
- Model (LLM): Gemini models may intermittently fail.
  - Workaround one - switch to a different model (e.g., GPT 4.1).
  - Workaround two - obtain an API key from Google and use that.
    - Vapi Dashboard → Settings → Integrations → Custom Credentials
- Transcriber (STT): Gemini-based transcription may intermittently fail.
  - Two workarounds - switch your primary or fallback transcriber to a different model (e.g., Deepgram Nova 3), or obtain an API key from Google and use that.
- Voicemail detection: may intermittently fail if "provider" is set to "google".
  - Two workarounds - switch to the "openai" or "twilio" provider (if using Twilio telephony), or turn off voicemailDetection and switch to the Voicemail tool.
- Query tool: may intermittently fail since it relies on Google infrastructure.
  - Two workarounds (both high effort) - switch to a custom knowledge base or use a function tool to replicate the behavior.
- Structured Outputs: Gemini models may intermittently fail.
  - Workaround - switch to a different model provider: OpenAI or Anthropic.
- Speech-to-Speech: Gemini models may intermittently fail.
  - Workaround - switch to a different model provider: OpenAI.
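For reference, here is a rough sketch of what the non-Gemini fallback configuration above can look like as an assistant config object. The field names and values (model, transcriber, voicemailDetection, and the provider/model strings) are illustrative and drawn from the workaround list, so verify them against the current API reference before applying; the object can be applied through the dashboard or the same kind of PATCH /assistant/{id} call sketched in the Feb 03 update at the top of this page.

```typescript
// Hedged sketch of a non-Gemini fallback configuration, per the workarounds above.
// Field names and enum values are assumptions drawn from this list; verify them
// against the current Vapi API reference before applying.
const geminiFallbackConfig = {
  // Model workaround: move the LLM off Gemini (e.g., to GPT 4.1).
  model: { provider: "openai", model: "gpt-4.1" },
  // Transcriber workaround: switch the primary transcriber to Deepgram Nova 3.
  transcriber: { provider: "deepgram", model: "nova-3" },
  // Voicemail detection workaround: move off the "google" provider.
  voicemailDetection: { provider: "openai" },
};

// Print the payload so it can be pasted into your own update script or request.
console.log(JSON.stringify(geminiFallbackConfig, null, 2));
```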
Created
Jan 21 at 11:40am PST
We are hitting rate limits again with our Google Gemini models. We are working with the vendor to resolve this issue. Please consider using another model at this time or implementing fallbacks.