Call degradation on Weekly channel
Resolved
Feb 25, 2026 at 5:50pm UTC
Incident report:
Impact:
A service disruption affected call reliability on the Weekly channel. Some calls ended unexpectedly with worker-not-available or worker-died end reasons.
Timeline (all times PT):
- 8:07 AM - We detected a burst of call failures across the platform.
- 8:16 AM - Automated monitoring alert fired. We acknowledged and began investigating.
- 8:42 AM - We scoped the impact across affected accounts.
- 8:47 AM - The issue self-resolved. We identified the root cause as resource contention in our call processing infrastructure during a traffic spike.
- 9:18 AM - We completed an initial root cause analysis and identified an underlying bottleneck in our call queue infrastructure.
- 11:13 AM - A related issue resurfaced due to cascading effects from the earlier contention. We began investigating immediately.
- 11:25 AM - We published a status page to notify customers.
- 11:38 AM - We confirmed the root cause as CPU contention between infrastructure components.
- 11:39 AM - We applied a mitigation. Call queue metrics began recovering.
- 11:45 AM - We updated the status page with the identified cause and fix.
- 11:46 AM - Error rates began declining. We continued active monitoring.
- 1:11 PM - We declared resolution on the status page.
- ~1:35 PM - A brief secondary spike occurred during an infrastructure resource adjustment. We responded immediately.
- 3:21 PM - All systems fully stabilized.
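The timeline above is in PT, while the status page updates are stamped in UTC. The following is a minimal sketch for lining the two up. It assumes that "PT" on the incident date (Feb 24, 2026, taken from the update timestamps) means Pacific Standard Time, UTC-8. The helper name `pt_to_utc` is hypothetical and used only for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: "PT" on Feb 24, 2026 is Pacific Standard Time (UTC-8).
PST = timezone(timedelta(hours=-8), "PST")


def pt_to_utc(clock: str, year: int = 2026, month: int = 2, day: int = 24) -> datetime:
    """Convert a timeline entry such as '11:25 AM' to an aware UTC datetime."""
    # strptime yields a naive time on a placeholder date; attach the real date and zone.
    local = datetime.strptime(clock, "%I:%M %p").replace(
        year=year, month=month, day=day, tzinfo=PST
    )
    return local.astimezone(timezone.utc)


# Timeline entries that correspond to status page activity.
for clock in ("11:25 AM", "11:45 AM", "1:11 PM"):
    print(clock, "PT ->", pt_to_utc(clock).strftime("%b %d, %Y at %H:%M UTC"))
```

Under that assumption, 11:25 AM, 11:45 AM, and 1:11 PM PT map to 19:25, 19:45, and 21:11 UTC, which line up with the Created, first Updated, and resolution entries on the status page.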
Action Items
Enforce resource limits across processing components and improve infrastructure isolation for critical call processing.
Note
A full root cause analysis is underway and will be available upon request. We sincerely apologize for the disruption and thank you for your patience.
Affected services
Updated
Feb 24, 2026 at 11:21pm UTC
At ~1:35pm PT, roughly 211 more calls dropped on the Weekly cluster. The team is investigating, but we do not see any ongoing degradation of service after 2pm PT.
We will share an updated incident report with the full timeline and action items today. Internally, we are working on a more in-depth root cause analysis to understand why our systems failed and what actions we will take to make our platform more stable.
Updated
Feb 24, 2026 at 9:11pm UTC
The issue is now resolved.
Updated
Feb 24, 2026 at 7:45pm UTC
We identified an issue causing a small number of calls to terminate unexpectedly with worker-not-available or worker-died end reasons. A fix has been deployed and error rates are declining. We are continuing to monitor and will provide another update once the issue is fully resolved.
Created
Feb 24, 2026 at 7:25pm UTC
We are seeing degraded calls on the Weekly channel. The team is looking into the issue and will share updates here.