The incident has been resolved. A full post-mortem will come shortly.
Feb 6, 13:00 PST
The system has started to recover. We're now seeing p95 latencies of 25 minutes for delayed events, and new data should be flowing in real time.
During the outage, a number of our destination containers were crashing, which resulted in duplicate messages being processed and delivered. Approximately 3% of traffic was delivered multiple times to downstream tools from 9am to 11am PT.
Feb 6, 10:55 PST
We're seeing delays of up to 30 minutes, but the system has started to stabilize. We will keep you updated as the situation progresses.
Feb 6, 10:25 PST
We're currently seeing delays of up to 15 minutes in delivery to downstream tools. Data is not being lost, and our on-call engineers are working to resolve the issue.
Feb 6, 09:50 PST