You cluster services in production environments to scale up applications, to achieve high availability, or both. By scaling up, the application can support more user requests; through high availability, the service remains available even when a few servers are down.

You use a load balancer to distribute requests among the nodes in a cluster. The nodes that receive this incoming traffic are a set of backend worker nodes, in a cluster with or without worker/manager separation. These worker nodes are either pre-defined (static) or discovered dynamically. In the static mode, you cannot add new nodes to the pre-defined set of worker nodes at runtime. In the dynamic mode, you can add nodes to the load balancer at runtime, without knowing the IPs and other connection details of the backend nodes beforehand.

Among the many varieties of load balancers are hardware, DNS, transport-level (e.g., HTTP-level, like Apache or Tomcat), and application-level (e.g., Synapse) load balancers. High-level load balancers, like application-level load balancers, operate with more information about the messages they route and therefore provide more flexibility, but they also incur more overhead. The choice of a load balancer is a trade-off between performance and flexibility.

There are many algorithms for distributing the load between servers. Random or round-robin distribution are simple approaches. More sophisticated algorithms take runtime properties of the system, such as a machine's load or the number of pending requests, into consideration. The distribution can also be controlled by application-specific requirements like sticky sessions. However, with a reasonably diverse set of users, simple approaches tend to perform as well as complex ones, and should therefore be considered first.
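
As a rough sketch of the two simple approaches named above, the following Java snippet picks a worker by round-robin or at random. The SimpleBalancer class and the worker address strings are assumptions made for illustration; they are not part of any WSO2 product.

    import java.util.List;
    import java.util.concurrent.ThreadLocalRandom;
    import java.util.concurrent.atomic.AtomicInteger;

    // Illustrative sketch of the two simple distribution algorithms.
    // Each worker is represented by its address, e.g. "10.0.0.1:9763".
    public class SimpleBalancer {
        private final List<String> workers;
        private final AtomicInteger next = new AtomicInteger(0);

        public SimpleBalancer(List<String> workers) {
            this.workers = workers;
        }

        // Round-robin: cycle through the workers in order.
        public String roundRobin() {
            return workers.get(Math.floorMod(next.getAndIncrement(), workers.size()));
        }

        // Random: pick any worker with equal probability.
        public String random() {
            return workers.get(ThreadLocalRandom.current().nextInt(workers.size()));
        }
    }

For example, new SimpleBalancer(List.of("10.0.0.1:9763", "10.0.0.2:9763")).roundRobin() would return the two addresses alternately.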

In WSO2 Carbon-based products, cluster messages based on Axis2 clustering are used to identify a node that joins or leaves the cluster.
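
The exact Axis2 clustering API is beyond the scope of this page, but conceptually, a dynamic load balancer consumes such join/leave messages to keep its worker list current. The listener below is purely illustrative; the class and method names are assumptions, not the real Axis2 clustering interfaces.

    import java.util.List;
    import java.util.concurrent.CopyOnWriteArrayList;

    // Illustrative only: keeps the load balancer's worker list in sync
    // with join/leave cluster messages. The real Axis2 clustering API
    // uses different types; these names are placeholders.
    public class WorkerMembership {
        private final List<String> workers = new CopyOnWriteArrayList<>();

        // Invoked when a cluster message announces a joining worker node.
        public void memberJoined(String hostAndPort) {
            workers.add(hostAndPort);
        }

        // Invoked when a cluster message announces a leaving worker node.
        public void memberLeft(String hostAndPort) {
            workers.remove(hostAndPort);
        }

        public List<String> currentWorkers() {
            return workers;
        }
    }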

The following are some key aspects of load balancing.



Session affinity

Stateful applications inherently do not scale well. Therefore, architects minimize server-side state to gain better scalability. State replication induces a huge performance overhead on the system. Instead of deploying stateful applications in a cluster, you can use session-affinity-based load balancing.

Session affinity ensures that, when a client sends a session ID, the load balancer forwards all requests containing that session ID to the same backend worker node, irrespective of the specified load balancing algorithm. This may look like it defeats the purpose of load balancing. However, before the session is created, the request is dispatched to the worker node that is next in line, and a session is established with that worker node.
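
A minimal sketch of this behaviour, assuming the session ID has been extracted from each request (for example, from a JSESSIONID cookie), is shown below. The map-based bookkeeping is an illustration, not the Carbon implementation.

    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicInteger;

    // Sketch of session-affinity routing: the first request of a session
    // is dispatched next-in-line (round-robin); later requests carrying
    // the same session ID are pinned to the same worker node.
    public class StickyBalancer {
        private final List<String> workers;
        private final Map<String, String> sessionToWorker = new ConcurrentHashMap<>();
        private final AtomicInteger next = new AtomicInteger(0);

        public StickyBalancer(List<String> workers) {
            this.workers = workers;
        }

        public String route(String sessionId) {
            if (sessionId == null) {
                return nextWorker();            // no session yet: next-in-line
            }
            // Pin the session to a worker on first sight; reuse it afterwards.
            return sessionToWorker.computeIfAbsent(sessionId, id -> nextWorker());
        }

        private String nextWorker() {
            return workers.get(Math.floorMod(next.getAndIncrement(), workers.size()));
        }
    }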

Service-aware load balancing

Service-awareness is cost-effective not only in the cloud, but also on-premise. In addition, a single load balancer can balance incoming requests across clusters of different services, such as Application Servers, Business Process Servers, and Mashup Servers.

In a real production environment, most of the processing happens not at the load balancer, but at the backend worker nodes. As a result, a typical load balancer is designed to front a large number of backend worker nodes. In a traditional deployment, one load balancer may front a cluster of homogeneous worker nodes. One load balancer is generally capable of handling multiple such clusters, routing traffic to the correct cluster while balancing load according to the algorithm specified for that cluster.

In Cloud deployments, a cluster of homogeneous worker nodes is called a Cloud service. A load balancer that fronts multiple Cloud services is typically called a service-aware load balancer.
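
As a sketch, a service-aware load balancer can be modelled as a map from a service (identified here by its URL context) to the cluster that hosts it, reusing the SimpleBalancer sketch from earlier. The routing key and the error handling are assumptions for illustration.

    import java.util.Map;

    // Sketch: route a request to the Cloud service (cluster of homogeneous
    // workers) that owns its URL context, then balance within that cluster.
    public class ServiceAwareBalancer {
        private final Map<String, SimpleBalancer> serviceToCluster;

        public ServiceAwareBalancer(Map<String, SimpleBalancer> serviceToCluster) {
            this.serviceToCluster = serviceToCluster;
        }

        public String route(String urlContext) {
            SimpleBalancer cluster = serviceToCluster.get(urlContext);
            if (cluster == null) {
                throw new IllegalArgumentException("Unknown service: " + urlContext);
            }
            return cluster.roundRobin();        // balance within the chosen cluster
        }
    }

For instance, the map might send one URL context to an Application Server cluster and another to a Business Process Server cluster; the actual routing keys depend on the deployment.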

Tenant-aware load balancing

Tenant-awareness allows the load balancer to provide a scalable approach for balancing the load across a set of tenants sharing a collection of worker nodes. Tenants can also be partitioned in various ways.

When a typical Cloud deployment scales, it requires tenant partitioning. A single Cloud service can have multiple clusters, and each of these service clusters can handle a subset of the tenants in the system. In such a tenant-partitioned deployment, the load balancers themselves need to be tenant-aware in order to route requests to the correct tenant clusters. In a Cloud environment, a tenant-aware load balancer should also be service-aware, since it is the service clusters that are partitioned according to the tenants.
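
As a closing sketch, tenant-aware routing can be layered on top of the service-aware map: a request is routed first by service, then by its tenant's partition, and finally balanced within the chosen cluster. The modulo-based partitioning below is just one illustrative scheme, again reusing the SimpleBalancer sketch.

    import java.util.List;
    import java.util.Map;

    // Sketch: a tenant-aware (and therefore service-aware) balancer. A
    // request is routed by service, then by the tenant's partition, and
    // finally balanced within the chosen cluster.
    public class TenantAwareBalancer {
        // service -> ordered list of tenant-partition clusters for that service
        private final Map<String, List<SimpleBalancer>> partitions;

        public TenantAwareBalancer(Map<String, List<SimpleBalancer>> partitions) {
            this.partitions = partitions;
        }

        public String route(String service, int tenantId) {
            List<SimpleBalancer> clusters = partitions.get(service);
            if (clusters == null) {
                throw new IllegalArgumentException("Unknown service: " + service);
            }
            // Pick the tenant's partition, here by tenant ID modulo cluster count.
            return clusters.get(Math.floorMod(tenantId, clusters.size())).roundRobin();
        }
    }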