
How can Chaos Monkey testing help with microservices?

Resilience testing isn't just for infrastructure. Architects can adopt this disaster recovery testing strategy to build more reliable microservice applications.

Chaos Monkey is a popular open source tool developed by Netflix that takes reliability testing to new heights.

The idea behind Chaos Monkey testing is to deliberately kill random nodes across the system at regular intervals to assess whether the system can survive these failures. The same principle can help evaluate the reliability of microservice-based applications, but rather than intentionally killing nodes, architects should focus on interrupting services.
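As a rough illustration of interrupting a service rather than killing a node, the sketch below wraps a service call so that it fails at random with a chosen probability. It is not Chaos Monkey's own API; the names `with_chaos`, `ServiceInterrupted`, `get_price` and `survival_rate` are hypothetical and exist only for this example.

```python
import random


class ServiceInterrupted(Exception):
    """Raised when the chaos wrapper simulates a service outage."""


def with_chaos(service, failure_rate, rng=None):
    """Return a version of `service` that is randomly interrupted.

    `failure_rate` is the probability (0.0 to 1.0) that any single call
    is interrupted. `rng` can be a seeded random.Random for repeatable runs.
    """
    rng = rng or random.Random()

    def wrapped(*args, **kwargs):
        if rng.random() < failure_rate:
            raise ServiceInterrupted(f"{service.__name__} interrupted by chaos test")
        return service(*args, **kwargs)

    return wrapped


def get_price(item_id):
    """Hypothetical stand-in for a real microservice call."""
    return {"item": item_id, "price": 9.99}


def survival_rate(call, attempts):
    """Fraction of calls that completed despite injected interruptions."""
    succeeded = 0
    for item_id in range(attempts):
        try:
            call(item_id)
            succeeded += 1
        except ServiceInterrupted:
            pass
    return succeeded / attempts


if __name__ == "__main__":
    flaky_get_price = with_chaos(get_price, failure_rate=0.2, rng=random.Random(7))
    print(f"survival rate: {survival_rate(flaky_get_price, 1000):.2%}")
```

In a real test, the caller of the wrapped service is what gets evaluated: does it degrade gracefully, or does the interruption ripple outward?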

As one service fails, other dependent services could stall or fail in a ripple effect. This delivers a bad user experience. There are many ways to deal with this issue, and when used in combination, they can help architects design more resilient systems.

Fallback services matter

Replicas and fallbacks are not just for infrastructure components, but for mission-critical services as well. Consider creating an alternate, bare-bones service that can take on the load if the default service fails. This is especially useful for services such as billing and transaction processing for e-commerce applications.

Setting up a fallback service could be as simple as routing traffic away from the failed service. Before flipping the Chaos Monkey testing switch, though, you need to ensure backup services for critical processes are ready to take over.
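A minimal sketch of that routing idea, assuming both services are reachable as plain Python callables: if the default service raises an error, the call is handed to the bare-bones alternate. The billing functions here are hypothetical placeholders, not a real payment API.

```python
def call_with_fallback(primary, fallback, *args, **kwargs):
    """Try the default service first; route to the fallback if it fails."""
    try:
        return primary(*args, **kwargs)
    except Exception:
        # Any failure in the default service sends the request to the
        # bare-bones alternate instead of surfacing the error to the user.
        return fallback(*args, **kwargs)


def primary_billing(order_id, amount):
    """Hypothetical default billing service, simulated here as being down."""
    raise ConnectionError("billing service unavailable")


def fallback_billing(order_id, amount):
    """Hypothetical bare-bones alternate: accept the charge for later processing."""
    return {"order_id": order_id, "amount": amount, "status": "queued"}


if __name__ == "__main__":
    print(call_with_fallback(primary_billing, fallback_billing, "A-100", 25.0))
```

Catching every exception keeps the example short; a production version would likely limit the fallback to connection and timeout errors so that genuine bugs are not hidden.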

Building resilient infrastructure

With modern container stacks, it is now possible -- even easy -- to automatically restart failed containers and set up autoscaling container clusters. This automation helps ensure resiliency in the infrastructure layer.

At the networking level, shorter timeout limits mean requests are rerouted to a fallback service quickly after a failure, so callers spend less time waiting on a service that is down.
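The effect of a short timeout can be sketched with Python's standard `concurrent.futures` module. The service functions and the timeout values below are assumptions chosen for illustration; real limits depend on the latency your services normally show.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout

# Shared worker pool used to run service calls with a deadline.
_pool = ThreadPoolExecutor(max_workers=4)


def call_with_timeout(primary, fallback, timeout_s, *args):
    """Call `primary`, but reroute to `fallback` if it exceeds `timeout_s`.

    Note: the slow call keeps running in its worker thread until it
    finishes; this sketch only stops waiting for it.
    """
    future = _pool.submit(primary, *args)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return fallback(*args)


def slow_inventory(item_id):
    """Hypothetical service that has stalled."""
    time.sleep(0.5)
    return {"item": item_id, "source": "primary"}


def fast_inventory(item_id):
    """Hypothetical healthy service."""
    return {"item": item_id, "source": "primary"}


def cached_inventory(item_id):
    """Hypothetical fallback that answers from a cache."""
    return {"item": item_id, "source": "fallback"}


if __name__ == "__main__":
    print(call_with_timeout(slow_inventory, cached_inventory, 0.05, "sku-1"))
    print(call_with_timeout(fast_inventory, cached_inventory, 1.0, "sku-1"))
```

This sketch handles only timeouts. An exception raised by the primary service would still propagate, so it is usually paired with error-based fallback routing.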

Having persistent data storage for containers is also essential. When a container fails, its stored data is lost unless you configure persistent storage volumes. Persistent storage ensures that, even if a service fails, it can be easily resumed with the old, stored data.
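To make the point concrete, here is a small sketch of a service that writes its state to a path after every change and reloads it on startup. The `CounterService` class and the JSON file layout are hypothetical; in a container setup, the assumption is that `state_path` sits on a mounted persistent volume rather than in the container's own filesystem.

```python
import json
import os
import tempfile


class CounterService:
    """Toy service whose state survives a restart via a file on disk."""

    def __init__(self, state_path):
        self.state_path = state_path
        self.count = self._load()

    def _load(self):
        # Resume from the stored state if a previous instance left one behind.
        if os.path.exists(self.state_path):
            with open(self.state_path) as f:
                return json.load(f)["count"]
        return 0

    def increment(self):
        self.count += 1
        # Write to a temporary file, then swap it in, so a crash in the
        # middle of a write does not leave a half-written state file.
        tmp_path = self.state_path + ".tmp"
        with open(tmp_path, "w") as f:
            json.dump({"count": self.count}, f)
        os.replace(tmp_path, self.state_path)
        return self.count


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as volume:
        path = os.path.join(volume, "state.json")
        first = CounterService(path)
        first.increment()
        first.increment()
        del first  # simulate the container failing
        restarted = CounterService(path)
        print("count after restart:", restarted.count)
```

Without the stored file, the restarted instance would begin again from zero, which is the data loss the paragraph above describes.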

When data is still lost despite these efforts, a disaster recovery (DR) tool can help you regain access to it after a mass failure. DR tools and services are worth purchasing if you need true data resilience.

Whether it's the service layer or the infrastructure layer, you can build more resilient applications by employing the Chaos Monkey testing principle in your microservice applications.

This was last published in February 2018
