Cloud computing and virtualization enable software teams to scale the number of instances a system hosts in response to changes in workload. Simply adding or removing a component like a microservice, however, doesn't automatically change how work distributes. To control work distribution, you need load balancing.
A microservices load-balancing strategy depends mostly on the way you develop the microservices and how the service discovery works. Expect your load balancer to distribute work in a quasi-random way and not necessarily account for the current state of all instances.
Stateless or stateful?
Microservices load balancing revolves around this question: What is work? Are your workloads single, independently handled messages, or are they strings of messages, where the entire string must go to the same place to process correctly? The first case calls for stateless load balancing; the second calls for stateful load balancing. With microservices, the goal is often to achieve stateless behavior.
Any instance of a stateless microservice can process any message because you process each message without regard for what came before or what will come after. If dealing with a stateful microservice, opt for a state-sensitive load-balancing technique or an approach that preserves whatever state control technique that microservice uses. It's possible to use stateless load balancing for microservices that have back-end state control, but if a microservice is truly stateful at the development level, then it probably needs stateful load balancing.
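The difference between the two routing styles can be shown in a few lines. The following is a minimal sketch, not a production design: the instance names, the `session_id` field and both function names are assumptions made for illustration. The stateless router hands each message to the next instance in rotation, while the stateful router hashes a session key so that every message in the same string lands on the same instance.

```python
import hashlib
import itertools

# Hypothetical instance names, used only for this illustration.
INSTANCES = ["svc-a", "svc-b", "svc-c"]

_rotation = itertools.cycle(INSTANCES)


def route_stateless(message: dict) -> str:
    """Any instance can handle any message, so take the next one in rotation."""
    return next(_rotation)


def route_stateful(message: dict) -> str:
    """Hash the (assumed) session_id so a whole string of messages shares one instance."""
    digest = hashlib.sha256(message["session_id"].encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(INSTANCES)
    return INSTANCES[index]
```

One limitation of this simple modulo hashing is that changing the number of instances remaps most sessions to different instances. Consistent hashing is a common way to reduce that disruption.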
And then there is the distribution issue. Distributed software approaches largely replaced traditional approaches that included hardware load balancers located in front of or within a data center. The reason for this shift toward the distributed approach is that it's difficult to make hardware systems work for public, hybrid or multi-cloud hosting.
Hardware load balancers focus on traffic-based work distribution, usually by examining the header of the packets. The hardware approach typically presents two main problems. First, traffic inspection can impose a significant processing load on the device, particularly if the packets arrive via HTTPS and must be decrypted before inspection. Second, you can't scale, redeploy or redistribute hardware to accommodate failures or traffic changes. These complications make hardware load balancers hard to apply to microservices, so most microservices planners moved to another approach.
Balancing the options
Software load balancers generally fit into two categories. The first option distributes work from clients to microservices instances by integrating with the current work distribution process. The second creates a new load-balancing path behind any API brokerage or service bus.
Access to microservices typically occurs through an API broker that delivers an instance address. These brokers require that the microservice register with the broker when loaded, and if the broker supports registration of multiple instances, it can perform load balancing among them. The processes used to detect instance failures keep the load-balancing process up to date with available resources. You can also access a microservice through a service bus, in which case the service bus implementation can support multiple instances of microservices and manage them in the same way. This only works, of course, if your API broker or service bus implementation supports multiple instances.
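To make the register-then-resolve flow concrete, here is a toy broker, sketched under stated assumptions. The `Broker` class, its method names and the heartbeat-timeout mechanism are all hypothetical and do not represent any particular product's API. Instances register on load, refresh their entry with heartbeats, and the broker hands out only the instances it has heard from recently.

```python
import itertools
import time


class Broker:
    """Toy API broker: instances register, send heartbeats and are handed out in rotation."""

    def __init__(self, heartbeat_timeout=5.0):
        self.heartbeat_timeout = heartbeat_timeout
        self._last_seen = {}  # service name -> {address: time of last heartbeat}
        self._counter = itertools.count()

    def register(self, service, address, now=None):
        """Record an instance, or refresh its timestamp if it is already known."""
        now = time.monotonic() if now is None else now
        self._last_seen.setdefault(service, {})[address] = now

    # In this sketch a heartbeat simply refreshes the registration timestamp.
    heartbeat = register

    def resolve(self, service, now=None):
        """Return the address of one live instance, rotating among the live ones."""
        now = time.monotonic() if now is None else now
        live = [
            address
            for address, seen in self._last_seen.get(service, {}).items()
            if now - seen <= self.heartbeat_timeout
        ]
        if not live:
            raise LookupError(f"no live instances registered for {service!r}")
        return live[next(self._counter) % len(live)]
```

The optional `now` parameter exists only so the timing behavior can be exercised without waiting. An instance that stops sending heartbeats silently drops out of the rotation once its timestamp ages past the timeout.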
If you don't access microservices via a broker or service bus, the microservice is most likely referenced by a URL whose hostname a domain name system (DNS) server resolves. Most DNS servers allow multiple IP addresses for a given hostname, and they step through the list of addresses in round-robin fashion as they answer resolution requests. This process distributes work evenly across instances, provided that no work source caches the IP addresses.
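The caching caveat is easy to demonstrate with a simulation. The sketch below does not query a real DNS server; `RoundRobinResolver` and `send_requests` are hypothetical stand-ins, and the assumption that a client uses the first address in each answer is a simplification.

```python
from collections import Counter, deque


class RoundRobinResolver:
    """Simulated DNS server that rotates its address list after every query."""

    def __init__(self, addresses):
        self._addresses = deque(addresses)

    def resolve(self):
        answer = list(self._addresses)
        self._addresses.rotate(-1)  # the next query sees a different first address
        return answer


def send_requests(resolver, count, cache_answer):
    """Count how many requests each address receives from a single work source."""
    hits = Counter()
    cached = None
    for _ in range(count):
        if cached is None or not cache_answer:
            cached = resolver.resolve()[0]  # assume the client uses the first address
        hits[cached] += 1
    return hits
```

With `cache_answer=False`, requests spread evenly over the addresses. With `cache_answer=True`, the work source resolves once and sends every request to the same address, which defeats the rotation.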
The external route
The second option for microservices work distribution is to add a component or set of components to provide external load balancing. Software instances of traffic-based load balancers can provide this type of work distribution, especially if deployed in a hierarchical structure where a master load balancer sends work to subordinate elements. When there are multiple components, each one is often placed within a cloud provider network or data center to manage the microservices instances hosted there. This distributes work and also reduces the performance impact of the load-balancing process.
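The hierarchical arrangement can be sketched as two small classes. This is an illustration only: the class names, site names and instance names are invented, and plain rotation is assumed at both levels, although a real deployment might weight sites by capacity.

```python
import itertools


class LocalBalancer:
    """Subordinate balancer that manages the instances hosted in one site."""

    def __init__(self, site, instances):
        self.site = site
        self._cycle = itertools.cycle(instances)

    def pick(self):
        return next(self._cycle)


class MasterBalancer:
    """Top-level balancer that chooses a site and delegates the instance choice."""

    def __init__(self, local_balancers):
        self._cycle = itertools.cycle(local_balancers)

    def pick(self):
        local = next(self._cycle)
        return local.site, local.pick()
```

Because the master only chooses among sites, it never needs to track individual instances. Each subordinate balancer keeps that detail local to its own cloud or data center.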
Another microservices load-balancing mechanism is to use a software component that refers to a service registry, such as Consul, etcd or Eureka. Unlike a traffic-based load balancer that works on the packet flow, this registry-driven system regulates the microservices instance delivered when a caller tries to locate the service via the registry.
The registry-driven model, exemplified by Ribbon, Netflix's client-side load balancer, is today's best approach to microservices load balancing if you already have, or plan to install, a compatible service registry and if your microservices have proper state control. Ribbon can also provide a kind of stateful load balancing if the calling process makes the service instance connection at the start of a transaction and keeps the association through the rest of the message flow.
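The idea of choosing an instance at the start of a transaction and keeping it can be sketched as follows. This is not Ribbon's API, which is a Java library. The `REGISTRY` dictionary, `RegistryClient` and `Transaction` are hypothetical names, and random selection is assumed purely for simplicity.

```python
import random

# Hypothetical registry contents: service name -> registered instance addresses.
REGISTRY = {"inventory": ["10.1.0.5:9000", "10.1.0.6:9000", "10.1.0.7:9000"]}


class RegistryClient:
    """Client-side balancer that picks an instance from the registry's list."""

    def __init__(self, registry, rng=None):
        self._registry = registry
        self._rng = rng or random.Random()

    def choose(self, service):
        instances = self._registry.get(service)
        if not instances:
            raise LookupError(f"no instances registered for {service!r}")
        return self._rng.choice(instances)


class Transaction:
    """Pins one instance when the transaction starts and reuses it for every message."""

    def __init__(self, client, service):
        self.instance = client.choose(service)
        self.sent = []

    def send(self, message):
        # A real client would make a network call here; this sketch only
        # records which instance the message would be routed to.
        self.sent.append((self.instance, message))
        return self.instance
```

Each new `Transaction` consults the registry once, so load still spreads across instances from one transaction to the next, while every message within a single transaction goes to the pinned instance.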
IBM and Google sponsor a microservices mesh of distributed services that can deploy anywhere. This model, called Istio, uses a sidecar -- an attached network proxy -- which provides the connection for each microservice. The proxy-based data plane can then use distributed policies to run and monitor connectivity, and it also provides load balancing. For true cloud-native microservices, Istio -- or something like it -- is the way to go.