Elevated render times and reduced ingestion traffic
Incident Report for Hosted Graphite
Resolved
We have reverted a previous configuration change, and service is now back to normal.

Our ingestion pipeline was dropping approximately 16% of incoming traffic from 18:13 to 18:28 (UTC). The elevated render times proved to be a false alarm, as the 99th percentile remained unchanged throughout the incident.

We have traced the root cause of the issue back to a configuration change made while testing improvements to our DNS automation.
Our DNS automation worked as expected, but the configuration change had side effects we did not anticipate, which caused traffic from one of our load balancers to be rejected by our ingestion pipeline.
We now understand the side effects of this kind of change, and we will add extra safeguards around it.
Posted Dec 02, 2016 - 18:44 UTC
Investigating
We're currently experiencing elevated response times and seeing reduced levels of traffic across our ingestion pipeline.
Posted Dec 02, 2016 - 18:30 UTC