Updates to DB are failing
Resolved
Jan 21 at 05:23am PST
TL;DR
A configuration error caused the production database to switch to read-only mode, blocking write operations and eventually leading to an API outage. Restarting the database restored service.
Timeline
5:03:04am: A SQL client connected to the production database via the connection pooler, which inadvertently set the database to read-only.
5:05am: Write operations began failing.
5:18am: The API went down due to accumulated errors.
~5:23am: The team initiated a database restart.
5:25am: The database restarted.
5:33am: Service was fully restored.
Impact
Write operations were blocked for 30 minutes.
The API experienced a 15-minute outage.
Root Cause
A direct connection from a SQL client, configured in read-only mode, propagated this setting across all sessions through the connection pooler. This disabled updates, inserts, and deletes, eventually leading to API failure.
Changes we've made
Disable Replication Jobs: Halt the replication jobs suspected of triggering the issue.
Escalate Support: The support case is escalated to the relevant team with a 24-hour follow-up.
Enhance Auditing: Enable and configure detailed audit logging (DDL and role operations) to help trace future incidents.
Restrict Direct Access: Eliminate direct production database connections by updating the access credentials.
If working on realtime distributed systems excites you, consider applying: https://jobs.ashbyhq.com/vapi/295f5269-1bb5-4740-81fa-9716adc32ad5
Affected services
Vapi API
Created
Jan 21 at 05:20am PST
We are investigating.
Affected services
Vapi API