Amazon Web Services (AWS) has announced Fault Injection Simulator, a fully managed service for running controlled experiments on AWS.
Primarily used in chaos engineering, fault injection experiments subject applications to sudden stress, allowing engineering teams to observe how systems respond and implement improvements accordingly.
According to AWS, the new Fault Injection Simulator makes it easier for teams to uncover blind spots, performance bottlenecks, and other weaknesses that conventional tests miss.
The tool comes with pre-built experiment templates that enable teams to gradually or simultaneously impair distinct applications’ performance in a production environment. For convenience, the simulator also provides controls and guardrails so teams can automatically roll back or stop the experiment when specific conditions are met.
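To make the idea of a template with a guardrail concrete, here is a minimal sketch in Python. It only builds a dictionary and does not call AWS. The field names and the `aws:ec2:stop-instances` action follow the service's public API as commonly documented, but the helper name, the tag, and the ARNs are hypothetical placeholders, so check the shape against the official reference before relying on it.

```python
import json


def build_experiment_template(role_arn, alarm_arn,
                              tag_key="chaos-ready", tag_value="true"):
    """Return a dict shaped like a Fault Injection Simulator template request.

    Assumption: the keys mirror the experiment template API; the helper
    itself, the default tag, and the example ARNs are illustrative only.
    """
    return {
        "description": "Stop one tagged EC2 instance, restart it after 5 minutes",
        "roleArn": role_arn,
        # Guardrail: the experiment is halted if this CloudWatch alarm fires.
        "stopConditions": [
            {"source": "aws:cloudwatch:alarm", "value": alarm_arn}
        ],
        # Which resources may be impaired: one instance carrying the tag.
        "targets": {
            "oneInstance": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {tag_key: tag_value},
                "selectionMode": "COUNT(1)",
            }
        },
        # What fault to inject into the selected target.
        "actions": {
            "stopInstance": {
                "actionId": "aws:ec2:stop-instances",
                "parameters": {"startInstancesAfterDuration": "PT5M"},
                "targets": {"Instances": "oneInstance"},
            }
        },
    }


if __name__ == "__main__":
    template = build_experiment_template(
        role_arn="arn:aws:iam::123456789012:role/example-fis-role",
        alarm_arn=("arn:aws:cloudwatch:us-east-1:123456789012:"
                   "alarm:example-high-error-rate"),
    )
    print(json.dumps(template, indent=2))
```

A dictionary like this could then be passed to an SDK call that creates the template. Keeping the stop condition next to the action makes the guardrail hard to forget.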
What’s more, the simulator allows teams to create disruptive experiments across a range of AWS services, including Amazon EC2, Amazon EKS, Amazon ECS, and Amazon RDS. Teams can also run “GameDay scenarios or stress-test their most critical applications on AWS at scale,” said AWS.
For best results, AWS recommends that enterprises integrate the simulator into their continuous delivery pipeline. Running experiments as a regular pipeline step lets teams keep surfacing production vulnerabilities, improving application performance, observability, and resiliency.
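One way a delivery pipeline might use such experiments is as a gate: start an experiment, wait for it to finish, and fail the stage unless it completed cleanly. The sketch below assumes a client object exposing `start_experiment` and `get_experiment` in the shape of the AWS SDK for Python. The function name, polling defaults, and the set of terminal states are assumptions for illustration, not AWS's recommended implementation.

```python
import time
import uuid

# Assumed terminal experiment states; verify against the service documentation.
TERMINAL_STATES = {"completed", "stopped", "failed"}


def run_experiment_gate(fis_client, template_id, poll_seconds=15, max_polls=120):
    """Start an experiment and return True only if it ends as 'completed'.

    `fis_client` is any object with start_experiment/get_experiment methods,
    for example one created with boto3.client("fis") (assumed, not shown).
    """
    started = fis_client.start_experiment(
        clientToken=str(uuid.uuid4()),  # idempotency token for the request
        experimentTemplateId=template_id,
    )
    experiment_id = started["experiment"]["id"]

    for _ in range(max_polls):
        response = fis_client.get_experiment(id=experiment_id)
        status = response["experiment"]["state"]["status"]
        if status in TERMINAL_STATES:
            # 'stopped' usually means a stop condition fired; treat as failure.
            return status == "completed"
        time.sleep(poll_seconds)

    # Timed out waiting: fail the gate rather than assume success.
    return False
```

A pipeline step could call this function and exit non-zero when it returns `False`, so a tripped stop condition blocks the release.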
“With a few clicks in the console, teams can run complex scenarios with common distributed system failures happening in parallel or building sequentially over time, enabling them to create the real world conditions necessary to find hidden weaknesses,” said AWS.
“nClouds is adding advanced chaos engineering capabilities and service offerings to our DevOps practice that will improve the resiliency of distributed service architectures we build for our customers and prove regulatory compliance,” comments Marius Ducea, VP of DevOps practice at nClouds.
“AWS Fault Injection Simulator has a deep level of fault injection that will enable us to create failure scenarios that more accurately reflect real-world events. With this capability, we expect to have an even better perspective on the expected time to recovery during real events.”