All Systems Operational
Website: Operational (99.87% uptime over the past 90 days)
Graph rendering: Operational (99.68% uptime over the past 90 days)
Ingestion: Operational (99.8% uptime over the past 90 days)
Alerting: Operational (99.86% uptime over the past 90 days)
www.hostedgraphite.com uptime
Interface health: TCP
Interface health: UDP
Interface health: StatsD
Interface health: HTTP API
Interface health: carbon relay (pickle)
Graph render time (95th percentile)
Interface health: Heroku integration
AWS connectivity (US-East-1)
AWS connectivity (US-West-1)
Past Incidents
Aug 19, 2018

No incidents reported today.

Aug 18, 2018
Resolved - As of 12:50 UTC all services have returned to normal operation. The root cause of the connectivity issues was a hardware fault in a router linecard. This issue is now resolved.
Aug 18, 12:52 UTC
Update - We are still experiencing intermittent network connectivity issues. Graphs may return partial renders and the processing of some alerts may still be delayed, while datapoints are being ingested as normal. We continue to work with our provider to resolve this issue.
Aug 18, 12:21 UTC
Update - We continue to experience intermittent connectivity issues. To mitigate the impact, we have switched our health checking to an approach that is more tolerant of network faults while we work with our provider to identify the root cause.
Aug 18, 11:51 UTC
Update - We are still experiencing connectivity issues which may result in partial graph renders, delayed alerts and delays in processing ingested datapoints. We continue to work with our provider to resolve this issue, and we are working to identify the affected network paths to mitigate the issue in the interim.
Aug 18, 11:29 UTC
Identified - As of 11:01 UTC we are experiencing network connectivity issues across several of our services. Ingestion, alerting and graph rendering are all affected by these connectivity issues. We are working with our provider to resolve these issues.
Aug 18, 11:11 UTC
Aug 17, 2018
Resolved - At 15:17 UTC, we identified an incorrect configuration change that affected our database health checks, which left our site unavailable for lack of a healthy database. The change has been rolled back, the website and API are now fully operational, and the incident has been resolved.
Aug 17, 15:25 UTC
Investigating - At 15:10 UTC, the website started experiencing database issues. We're working to resolve the issue.
Aug 17, 15:18 UTC
Aug 16, 2018

No incidents reported.

Aug 15, 2018

No incidents reported.

Aug 14, 2018

No incidents reported.

Aug 13, 2018

No incidents reported.

Aug 12, 2018

No incidents reported.

Aug 11, 2018
Resolved - This incident has been resolved.
Aug 11, 04:45 UTC
Monitoring - We have worked through our backlog of data and there are no more delays present. We will continue to monitor this incident.
Aug 11, 04:30 UTC
Investigating - From 03:40 UTC to 04:05 UTC we experienced a delay in the processing of metrics. This will cause gaps in the leading edge for render requests covering this time frame. We are currently investigating the issue.
Aug 11, 04:08 UTC
Aug 10, 2018

No incidents reported.

Aug 9, 2018
Resolved - As of 13:30 UTC, we have increased capacity in our render layer, and requests to our Render API are no longer failing.
Aug 9, 13:57 UTC
Investigating - We are experiencing increased load on our render layer, resulting in up to 0.25% of requests to our Render API failing. We are increasing capacity in order to deal with this.
Aug 9, 13:20 UTC
Aug 8, 2018

No incidents reported.

Aug 7, 2018
Resolved - As of 15:05 UTC, we have completed the isolation of the requests which were causing high load on our Render API. These requests are no longer impacting our API and the load has returned to normal.
Aug 7, 15:12 UTC
Identified - We have identified the cause of the high load as a subset of requests, and we are working to isolate these requests.
Aug 7, 14:55 UTC
Investigating - Since 14:15 UTC, we have seen increased render times and the Render API has been intermittently unavailable due to high load. We are working to mitigate this.
Aug 7, 14:24 UTC
Resolved - As of 08:15 UTC, our Ingestion and Render layers have stabilised and the connectivity issues are fully resolved.
Aug 7, 08:27 UTC
Update - As of 07:35 UTC, we have been informed by our hosting provider that the issue with the router has been fixed. We are monitoring the situation as we confirm that connectivity is restored while we deal with any remaining impact of the incident.
Aug 7, 07:53 UTC
Identified - As of 07:15 UTC, we are experiencing a network outage which is impacting both our Ingestion layer and Render layer. Our hosting provider has identified a fault in a specific router and we are attempting to work around the affected network path.
Aug 7, 07:28 UTC
Aug 6, 2018

No incidents reported.

Aug 5, 2018

No incidents reported.