Back to overview
Degraded

API is degraded

Oct 09 at 09:18am PDT
Affected services
Vapi API

Resolved
Oct 09 at 09:24am PDT

We're back.

RCA:
* At 9:15am PT: We were alerted by a big spike in request aborted.
* By 9:20am: We identified the root cause was head of line blocking on the API pods (some requests were taking too long, blocking other requests)
* By 9:25am: We scaled and restarted the api pods. Everything reverted to normal.

Action Items:
* We'll be setting a hard query timeout and returning timeout on ones that exceed. Eg. GET /assistant?limit=1000. (statementtimeout)
* We'll be making API pods aware of the health of their own DB connection, so it can restart gracefully.
* We'll be lowering how long each API pod can hold a DB connection so it can't monopolize time (idle
timeout).

Created
Oct 09 at 09:18am PDT

We're investigating.