Call degradation on Weekly channel

Resolved
Feb 25, 2026 at 5:50pm UTC

Incident report:

Impact:

A service disruption affected call reliability on the Weekly channel. Some calls ended unexpectedly with worker-not-available or worker-died end reasons.

Timeline (all times PT):

8:07 AM - We detected a burst of call failures across the platform.
8:16 AM - Automated monitoring alert fired. We acknowledged and began investigating.
8:42 AM - We scoped the impact across affected accounts.
8:47 AM - The issue self-resolved. We identified the root cause as resource contention in our call processing infrastructure during a traffic spike.
9:18 AM - We completed an initial root cause analysis and identified an underlying bottleneck in our call queue infrastructure.
11:13 AM - A related issue resurfaced due to cascading effects from the earlier contention. We began investigating immediately.
11:25 AM - We published a status page incident to notify customers.
11:38 AM - We confirmed the root cause as CPU contention between infrastructure components.
11:39 AM - We applied a mitigation. Call queue metrics began recovering.
11:45 AM - We updated the status page with the identified cause and fix.
11:46 AM - Error rates began declining. We continued active monitoring.
1:11 PM - We declared resolution on the status page.
~1:35 PM - A brief secondary spike in dropped calls occurred during an infrastructure resource adjustment. We responded immediately.
3:21 PM - All systems fully stabilized.

Action Items

Enforce resource limits across processing components and improve infrastructure isolation for critical call processing.

Note

A full root cause analysis is underway and will be available upon request. We sincerely apologize for the disruption and thank you for your patience.

Updated
Feb 24, 2026 at 11:21pm UTC

At ~1:35pm PT, roughly 211 more calls dropped on the Weekly cluster. The team is investigating, but we do not see any ongoing degradation of service after 2pm PT.

We will share an incident report update with the full timeline and action items today. Internally, we are working on a more in-depth root cause analysis to understand why our systems failed and what actions we will take to make our platform more stable.

Updated
Feb 24, 2026 at 9:11pm UTC

The issue is now resolved.

Updated
Feb 24, 2026 at 7:45pm UTC

We identified an issue causing a small number of calls to terminate unexpectedly with worker-not-available or worker-died end reasons. A fix has been deployed, and error rates are declining. We are continuing to monitor and will provide another update once the issue is fully resolved.

Created
Feb 24, 2026 at 7:25pm UTC

We are seeing degraded calls on the Weekly channel. The team is looking into the issue and will share updates here.
