Missing Data for a Percentage of Metrics
Incident Report for Hosted Graphite
Resolved
All datapoints from the impacted aggregation server have been successfully written out to our persistent storage backend.
Posted Feb 13, 2017 - 16:49 UTC
Monitoring
All datapoints have now been restored and normal service has resumed. We will resolve this incident when the datapoints from the impacted server have been written out to our persistent storage backend.
Posted Feb 13, 2017 - 15:49 UTC
Update
The replay has been completed for the long-term resolution (1 hour), so graphs over long periods have been repaired. We're replaying the missing data for the other resolutions, which will repair shorter-term graphs over the next couple of hours.
Posted Feb 13, 2017 - 13:27 UTC
Identified
We have identified a failure in one of our aggregation servers that resulted in the loss of data from its leading edge cache. This affects leading edge reads for approximately 2% of all metrics, at all resolutions.
Posted Feb 13, 2017 - 13:00 UTC