Degraded

Calls not connecting for `weekly` channel

Jan 13 at 08:31am PST
Affected services
Vapi API

Resolved
Jan 13 at 08:49am PST

TL;DR: Scaler failed and we didn't have enough workers

Root Cause

During a deployment to the weekly channel, the Redis IP addresses changed. Our scaling system could no longer find the queue, which left us stuck at a fixed number of workers instead of scaling up as needed. We resolved the issue by temporarily moving traffic to our daily environment.
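To illustrate the failure mode, here is a minimal sketch of a scaling decision that fails loudly when the queue cannot be read, rather than quietly staying at the current worker count. It is not our actual scaler: `desired_workers`, `read_queue_depth`, `QueueUnreachable`, and `jobs_per_worker` are hypothetical names chosen for this example, and the queue read is passed in as a plain callable so the sketch does not assume any particular Redis client.

```python
import math
from typing import Callable


class QueueUnreachable(Exception):
    """Raised when the scaler cannot read the queue depth (hypothetical name)."""


def desired_workers(
    read_queue_depth: Callable[[], int],
    jobs_per_worker: int,
    min_workers: int,
    max_workers: int,
) -> int:
    """Return the worker count needed for the current queue depth.

    A connection failure is re-raised as QueueUnreachable so the caller
    can alert on it instead of silently keeping a fixed worker count.
    """
    try:
        depth = read_queue_depth()
    except OSError as exc:  # ConnectionError and socket errors subclass OSError
        raise QueueUnreachable("scaler could not reach the queue") from exc

    needed = math.ceil(depth / jobs_per_worker)
    # Clamp to the configured floor and ceiling.
    return max(min_workers, min(max_workers, needed))
```

With this shape, a stale address shows up as an exception on the first scaling cycle after the deploy, not as missing capacity once traffic rises.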

Timeline

Jan 11, 5:12 PM: Deploy started
Jan 13, 6:00 AM: Calls started failing due to scaling issues
Jan 13, 8:45 AM: Resolved by moving traffic to daily
Jan 13, 11:00 AM: Full service restored

Changes We've Implemented

  • Added load testing on every deploy
  • Added better monitoring for scaling errors
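As a rough illustration of what monitoring for scaling errors can look like, the sketch below counts consecutive failed scaling cycles and signals when an alert should fire. It is an assumption-laden example, not a description of our monitoring: `ScalerHealth`, `record`, and the `alert_after` threshold of 3 are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ScalerHealth:
    """Tracks consecutive scaling-cycle failures (hypothetical helper)."""

    alert_after: int = 3  # assumed threshold; tune for the scaling interval
    consecutive_failures: int = 0

    def record(self, success: bool) -> bool:
        """Record one scaling cycle; return True when an alert should fire."""
        if success:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        return self.consecutive_failures >= self.alert_after
```

A scaling loop would call `record(True)` after each successful cycle and `record(False)` after each failed one, paging someone when it returns `True`.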

If working on realtime distributed systems excites you, consider applying: https://jobs.ashbyhq.com/vapi/295f5269-1bb5-4740-81fa-9716adc32ad5

Created
Jan 13 at 08:31am PST

We're investigating. We'll update ASAP.