Dec 6, 17:08 UTC
We are no longer seeing any impact on ingested datapoints, and we are now marking this incident as resolved.
Between 18:01 UTC and 19:10 UTC, we experienced delays of up to 3 minutes in processing ingested metrics, which resulted in gaps at the leading edge of graphs.
From 19:10 UTC until 21:05 UTC, the processing rate of our ingestion pipeline improved, but we still saw occasional delay spikes of up to one minute in processing incoming datapoints.
After 21:05 UTC, our ingestion layer recovered and regular operation resumed.
We will publish a postmortem for this incident before the end of this week.
Dec 5, 21:50 UTC
Our aggregation layer continues to work through the backlog of affected datapoints.
We will provide another update in one hour or when we have more information.
Dec 5, 20:24 UTC
Our aggregation layer has returned to a healthy state.
Leading edge data has recovered and we continue to work through backlogs of affected datapoints.
Dec 5, 18:44 UTC
The configuration change has been reverted and we are seeing recovery in our aggregation layer.
Delays remain in the processing of datapoints.
Dec 5, 18:33 UTC
As of 18:01 UTC, a configuration change has caused an issue in our aggregation layer.
This is resulting in delays in processing datapoints.
We are reverting the change now.
Dec 5, 18:25 UTC