Storage Layer for 5s Resolution Unstable
Incident Report for Hosted Graphite
Resolved
All buffers for storage writers have been completely replayed. The system is back in full health.
Posted Apr 10, 2020 - 08:51 UTC
Update
Only customers with 5s resolution storage may still see sporadic missing datapoints when viewing the 0-1h time range.
The majority of customers will not see any impact.
90% of the storage writer buffers have been fully replayed.
Posted Apr 10, 2020 - 08:31 UTC
Monitoring
The storage layer has been stabilized in the last 2 hours, and datapoints held in the buffers of the storage writers are being replayed. We are monitoring the storage layer and the replay to ensure that there are no issues.
Posted Apr 10, 2020 - 06:52 UTC
Identified
We've identified that the storage layer for 5s resolution is unstable. We are taking the necessary actions to stabilize it. In the meantime, datapoints are being held in the buffers of the storage writers. Datapoints for the 0-1 hour time period, which relies on 5s resolution storage, may be missing. Alerts based on a single metric, i.e., not reliant on Graphite functions, will function without degradation. Alerts reliant on Graphite functions may be delayed.
Posted Apr 10, 2020 - 03:46 UTC
This incident affected: Graph rendering, Ingestion, and Alerting.