{"id":7054,"date":"2013-04-18T14:12:15","date_gmt":"2013-04-18T14:12:15","guid":{"rendered":"http:\/\/cloudcomputing.sys-con.com\/node\/2618268"},"modified":"2013-04-18T14:12:15","modified_gmt":"2013-04-18T14:12:15","slug":"amazon-outage","status":"publish","type":"post","link":"https:\/\/icloud.pe\/blog\/amazon-outage\/","title":{"rendered":"Amazon Outage"},"content":{"rendered":"<p>You don\u2019t have to be a pre-cog to find and deal with infrastructure and application problems; you just need good monitoring.  We had quite a day Monday during the EC2 EBS availability incident.  Thanks to some early alerts\u2014which started coming in about 2.5 hours before AWS started reporting problems\u2014our ops team was able to intervene and make sure that our customers\u2019 data was safe and sound. I\u2019ll start with screenshots of what we saw and experienced, then get into what metrics to watch and alert on in your environment, as well as how to do so in TraceView.<br \/>\n10:30 AM EST: Increased disk latency, data pipeline backup<br \/>\nAround 10am, we started to notice that writes weren\u2019t moving through our pipeline as smoothly as before.  Sure enough, pretty soon we started seeing alerts about elevated DB load and disk latency.  Here\u2019s what it looked like:<\/p>\n<p><a href=\"http:\/\/cloudcomputing.sys-con.com\/node\/2618268\" >read more<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>You don\u2019t have to be a pre-cog to find and deal with infrastructure and application problems; you just need good monitoring.  We had quite a day Monday during the EC2 EBS availability incident.  
Thanks to some early alerts\u2014which started coming in a&#8230;<\/p>\n","protected":false},"author":3,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[],"tags":[],"class_list":["post-7054","post","type-post","status-publish","format-standard","hentry"],"_links":{"self":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts\/7054","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/users\/3"}],"replies":[{"embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/comments?post=7054"}],"version-history":[{"count":0,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/posts\/7054\/revisions"}],"wp:attachment":[{"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/media?parent=7054"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/categories?post=7054"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/icloud.pe\/blog\/wp-json\/wp\/v2\/tags?post=7054"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}