This is the WSO2 Elastic Load Balancer documentation version 2.0.1.

Load Balancing Basics

This section describes the basic terminology and architectural concepts implemented in the WSO2 ELB.

A load balancer is mainly used to distribute the load of incoming traffic amongst a set of backend worker nodes. This set of worker nodes can be either statically configured or dynamically discovered.

In static mode, new nodes cannot be added to the predefined set of worker nodes at runtime. Dynamic load balancers, such as the WSO2 ELB, also support adding and removing worker nodes at runtime, without having to know the IP addresses and other connection details of the backend nodes beforehand.
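The difference between the two modes can be sketched in a few lines of Python. This is an illustration only, not the WSO2 ELB implementation: the class names StaticPool and DynamicPool, the join and leave methods, and the sample addresses are all hypothetical.

```python
class StaticPool:
    """Hypothetical pool whose membership is fixed at start-up."""

    def __init__(self, nodes):
        # A tuple cannot be modified, so no node can join at runtime.
        self._nodes = tuple(nodes)

    def members(self):
        return list(self._nodes)


class DynamicPool:
    """Hypothetical pool whose membership can change at runtime."""

    def __init__(self):
        # Starts empty: no backend addresses are known beforehand.
        self._nodes = []

    def join(self, address):
        # Called when a worker node announces itself.
        if address not in self._nodes:
            self._nodes.append(address)

    def leave(self, address):
        # Called when a worker node shuts down or is terminated.
        if address in self._nodes:
            self._nodes.remove(address)

    def members(self):
        return list(self._nodes)


pool = DynamicPool()
pool.join("10.0.0.1:9763")
pool.join("10.0.0.2:9763")
pool.leave("10.0.0.1:9763")
print(pool.members())
```

How a real dynamic load balancer learns about joining and leaving nodes (for example, through a group membership protocol) is outside the scope of this sketch.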

The load balancing algorithm specifies the load balancing policy, that is, how the load is distributed across multiple backend worker nodes.
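As an illustration of what a load balancing algorithm looks like, the sketch below implements round robin, one common policy. The text above does not name a specific algorithm, so the choice of round robin, the class name RoundRobin, and the choose method are assumptions made for this example.

```python
class RoundRobin:
    """Hypothetical round-robin policy: hand out nodes in turn."""

    def __init__(self):
        self._next = 0  # index of the node that is next in line

    def choose(self, nodes):
        # The node list is passed in on every call so that the policy
        # keeps working when nodes are added or removed at runtime.
        if not nodes:
            raise ValueError("no worker nodes available")
        node = nodes[self._next % len(nodes)]
        self._next += 1
        return node


algorithm = RoundRobin()
workers = ["node-a", "node-b", "node-c"]
for _ in range(4):
    print(algorithm.choose(workers))
```

The fourth call wraps around to the first node again, which is the defining property of round robin.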

Elastic Load Balancer

An elastic load balancer such as the WSO2 ELB, in addition to carrying out its traditional load balancing functionality, is also responsible for monitoring the load and starting up new worker nodes or terminating existing ones, depending on the load. It scales up the system when the load increases and scales down when the load decreases. This behavior is known as auto-scaling.

In a typical architecture, load balancing and auto-scaling are handled by two logically distinct components. It may even be possible to deploy the load balancer component and the auto-scaler component separately.
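The auto-scaling decision can be sketched as a small function that is independent of request routing, in line with the separation described above. This is a minimal sketch under stated assumptions: the load metric (requests in flight), the threshold values, and all parameter names are hypothetical and do not reflect actual WSO2 ELB settings.

```python
def scaling_decision(requests_in_flight, node_count,
                     high_per_node=100, low_per_node=25,
                     min_nodes=1, max_nodes=10):
    """Return +1 to start a node, -1 to terminate one, 0 to do nothing.

    All thresholds are hypothetical example values.
    """
    # Always keep at least the minimum number of nodes running.
    if node_count < min_nodes or node_count == 0:
        return 1
    per_node = requests_in_flight / node_count
    # Scale up when each node carries too much load.
    if per_node > high_per_node and node_count < max_nodes:
        return 1
    # Scale down when nodes are mostly idle, but never below the minimum.
    if per_node < low_per_node and node_count > min_nodes:
        return -1
    return 0


print(scaling_decision(500, 2))  # 250 requests per node: scale up
print(scaling_decision(20, 4))   # 5 requests per node: scale down
print(scaling_decision(120, 2))  # 60 requests per node: no change
```

Keeping a gap between the high and low thresholds reduces the chance that the system repeatedly starts and terminates nodes when the load hovers around a single value.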

Auto-scaling capability is essential, especially in a Cloud-based deployment architecture, to use the Cloud's capabilities, such as multi-tenancy and elasticity, to their fullest potential.

Session Affinity

Stateful applications inherently do not scale well. Therefore, architects minimize server-side state in order to gain better scalability. State replication induces a large performance overhead on the system. As a solution to the problem of deploying stateful applications in clusters, session-affinity-based load balancing has been introduced.

Session affinity ensures that, when a client sends a session ID, the load balancer forwards all requests containing that session ID to the same backend worker node, irrespective of the specified load balancing algorithm. This may appear to defeat the purpose of load balancing. However, before a session is created, the request is dispatched to the worker node that is next in line according to the algorithm, and the session is established with that worker node.
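The interplay between session affinity and the load balancing algorithm can be sketched as follows. This is an illustration only: the class name StickyBalancer, the route and bind methods, and the use of round robin for requests without a session are assumptions, not the WSO2 ELB API.

```python
class StickyBalancer:
    """Hypothetical balancer with session affinity over round robin."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._next = 0
        self._sessions = {}  # session ID -> worker node

    def route(self, session_id=None):
        # A known session ID always goes back to its worker node,
        # bypassing the load balancing algorithm.
        if session_id in self._sessions:
            return self._sessions[session_id]
        # Otherwise the request goes to the node that is next in line.
        node = self._nodes[self._next % len(self._nodes)]
        self._next += 1
        return node

    def bind(self, session_id, node):
        # Called once the worker node has established a session.
        self._sessions[session_id] = node


lb = StickyBalancer(["node-a", "node-b"])
first = lb.route()            # no session yet: next node in line
lb.bind("session-1", first)   # session established with that node
print(lb.route("session-1"))  # same node as `first`
print(lb.route())             # the algorithm moves on to the other node
```

Because only requests that carry a known session ID bypass the algorithm, new clients are still spread across the worker nodes.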

Service-Aware Load Balancing

In a real production environment, most of the processing happens not at the load balancer level but at the backend worker nodes. As a result, a typical load balancer is designed to front a large number of backend worker nodes. In a traditional deployment, one load balancer may front a cluster of homogeneous worker nodes. One load balancer is generally capable of handling multiple such clusters, routing traffic to the correct cluster while balancing load according to the algorithm specified for that cluster.

In Cloud deployments, a cluster of homogeneous worker nodes is called a Cloud service. A load balancer that fronts multiple Cloud services, such as the WSO2 ELB, is typically called a service-aware load balancer.
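A service-aware load balancer first selects the cluster and then applies that cluster's own algorithm. The sketch below assumes that the requested host name identifies the Cloud service and that every cluster uses round robin; both are assumptions for illustration, and the class names and host names are hypothetical.

```python
class Cluster:
    """Hypothetical cluster of homogeneous worker nodes."""

    def __init__(self, nodes):
        self._nodes = list(nodes)
        self._next = 0

    def next_node(self):
        # Each cluster keeps its own round-robin position.
        node = self._nodes[self._next % len(self._nodes)]
        self._next += 1
        return node


class ServiceAwareBalancer:
    """Hypothetical balancer that fronts several service clusters."""

    def __init__(self, clusters_by_host):
        self._clusters = dict(clusters_by_host)

    def route(self, host):
        # Step 1: pick the cluster that serves the requested host.
        cluster = self._clusters.get(host)
        if cluster is None:
            raise LookupError(f"no service cluster for host {host!r}")
        # Step 2: balance the load within that cluster.
        return cluster.next_node()


lb = ServiceAwareBalancer({
    "appserver.example.com": Cluster(["as-1", "as-2"]),
    "esb.example.com": Cluster(["esb-1"]),
})
print(lb.route("appserver.example.com"))
print(lb.route("esb.example.com"))
print(lb.route("appserver.example.com"))
```

Traffic for one service never reaches the nodes of another service, and each cluster advances through its own nodes independently.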

Tenant-Aware Load Balancing

When a typical Cloud deployment scales, it requires tenant partitioning. For a single Cloud service, there will be multiple clusters, and each of these service clusters will handle a subset of the tenants in the system. In such a tenant-partitioned deployment, the load balancers themselves need to be tenant-aware in order to route requests to the proper tenant clusters. In a Cloud environment, a tenant-aware load balancer should also be service-aware, since it is the service clusters that are partitioned according to the tenants.
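Tenant-aware routing adds a second lookup key to the service lookup. The sketch below assumes that tenants are identified by numeric IDs and that each service cluster is assigned a contiguous range of tenant IDs; the partitioning scheme, the class name TenantAwareBalancer, and the host names are hypothetical and chosen only to illustrate the idea.

```python
class TenantAwareBalancer:
    """Hypothetical balancer that routes by service and by tenant."""

    def __init__(self):
        # service host -> list of (first_id, last_id, worker nodes)
        self._partitions = {}
        # (service host, first_id) -> round-robin position
        self._positions = {}

    def add_partition(self, host, first_id, last_id, nodes):
        entry = (first_id, last_id, list(nodes))
        self._partitions.setdefault(host, []).append(entry)

    def route(self, host, tenant_id):
        # Service awareness: only look at clusters of the requested service.
        for first_id, last_id, nodes in self._partitions.get(host, []):
            # Tenant awareness: pick the cluster that owns this tenant.
            if first_id <= tenant_id <= last_id:
                key = (host, first_id)
                position = self._positions.get(key, 0)
                self._positions[key] = position + 1
                return nodes[position % len(nodes)]
        raise LookupError(f"no cluster for tenant {tenant_id} on {host!r}")


lb = TenantAwareBalancer()
lb.add_partition("appserver.example.com", 1, 100, ["as-a1", "as-a2"])
lb.add_partition("appserver.example.com", 101, 200, ["as-b1"])
print(lb.route("appserver.example.com", 7))    # first partition
print(lb.route("appserver.example.com", 150))  # second partition
print(lb.route("appserver.example.com", 42))   # first partition, next node
```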