Elastic Load Balancer

Auto scaling decision maker

Location of auto scaling decision making task

The auto scaling decision making task currently resides on the WSO2 Elastic Load Balancer.

Basis for auto scaling

The current default implementation (ServiceRequestsInFlightAutoscaler) considers the number of requests in-flight as the basis for making auto scaling decisions. In the default algorithm we follow the paradigm “scale up early and scale down slowly”.

Decision making variables

There are a few auto scaling decision making variables. All of the vital ones are configurable in the loadbalancer.conf file.

autoscaler_task_interval (t)
This refers to the time period between two iterations of the autoscaling decision making task. When configuring this value, consider the time that a service instance takes to join the Elastic Load Balancer (ELB). This variable is stated in milliseconds and the default value is 30000 ms.
max_requests_per_second (Rps)
This refers to the number of requests a service instance can withstand per second. It is recommended that you determine this value for each service instance and for different scenarios. A good way to estimate it is to load test a similar service instance. The default value is 100.
rounds_to_average (r)
This refers to the number of ‘autoscaling decision making’ task iterations that need to take place before an autoscaling decision is made. The default value is 10.
alarming_upper_rate (AUR)
Instead of waiting till the service instance reaches its maximum request capacity (where the alarming_upper_rate = 1), the system will be scaled up when it reaches the request capacity that corresponds to the alarming_upper_rate. This value should be 0<AUR<=1. The default value is 0.7.
alarming_lower_rate (ALR)
This refers to the lower bound of the alarming rate. This variable can be used as an indicator to decide when the system should be scaled down. This value should be 0<ALR<=1. The default value is 0.2.
scale_down_factor (SDF)
This refers to the factor that is used to slow down the scaling down process. To reduce scaling down caused by false-positive events, the scaling down process needs to be carried out slowly. This value should be 0<SDF<=1. The default value is 0.25.
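
The variables and their documented ranges can be summarized in code. The following is a minimal sketch, not part of WSO2 ELB: the class name AutoscalerParams and its fields are hypothetical, and the requirement that t, Rps and r be positive is an assumption added for illustration. Only the 0 < value <= 1 checks and the default values come from the descriptions above.

```java
/** Hypothetical holder for the autoscaling variables; not a WSO2 ELB class. */
final class AutoscalerParams {
    final long taskIntervalMs;       // autoscaler_task_interval (t), in milliseconds
    final int maxRequestsPerSecond;  // max_requests_per_second (Rps)
    final int roundsToAverage;       // rounds_to_average (r)
    final double alarmingUpperRate;  // alarming_upper_rate (AUR)
    final double alarmingLowerRate;  // alarming_lower_rate (ALR)
    final double scaleDownFactor;    // scale_down_factor (SDF)

    AutoscalerParams(long t, int rps, int r, double aur, double alr, double sdf) {
        // Assumption: the interval, capacity and round count must be positive.
        if (t <= 0 || rps <= 0 || r <= 0) {
            throw new IllegalArgumentException("t, Rps and r must be positive");
        }
        this.taskIntervalMs = t;
        this.maxRequestsPerSecond = rps;
        this.roundsToAverage = r;
        this.alarmingUpperRate = requireRate("alarming_upper_rate", aur);
        this.alarmingLowerRate = requireRate("alarming_lower_rate", alr);
        this.scaleDownFactor = requireRate("scale_down_factor", sdf);
    }

    /** Default values as listed in this document. */
    static AutoscalerParams defaults() {
        return new AutoscalerParams(30000L, 100, 10, 0.7, 0.2, 0.25);
    }

    /** Enforces the documented range 0 < value <= 1. */
    private static double requireRate(String name, double value) {
        if (!(value > 0.0 && value <= 1.0)) {
            throw new IllegalArgumentException(name + " must satisfy 0 < value <= 1, got " + value);
        }
        return value;
    }

    public static void main(String[] args) {
        AutoscalerParams p = AutoscalerParams.defaults();
        System.out.println("t=" + p.taskIntervalMs + " ms, Rps=" + p.maxRequestsPerSecond
                + ", r=" + p.roundsToAverage);
    }
}
```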

Method of calculating the number of requests in-flight

The ELB tracks the number of requests that arrive for each service cluster. A token is added against the relevant service cluster for each incoming request, and the corresponding token is removed when the message has left the ELB or when the message expires. The number of requests in-flight for a cluster is therefore the number of tokens currently held against it.
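
The token bookkeeping described above can be illustrated as follows. This is a sketch under stated assumptions, not the ELB's actual code: the class InFlightTracker, its method names, and the use of an explicit expiry timestamp per token are all hypothetical.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical token tracker; illustrates the idea only, not the ELB implementation. */
final class InFlightTracker {
    // cluster id -> (token id -> expiry time in epoch milliseconds)
    private final ConcurrentMap<String, ConcurrentMap<String, Long>> tokens =
            new ConcurrentHashMap<>();

    /** Adds a token when a request for the given cluster arrives. */
    void addToken(String clusterId, String tokenId, long expiresAtMs) {
        tokens.computeIfAbsent(clusterId, k -> new ConcurrentHashMap<>())
              .put(tokenId, expiresAtMs);
    }

    /** Removes the token when the message has left the load balancer. */
    void removeToken(String clusterId, String tokenId) {
        ConcurrentMap<String, Long> clusterTokens = tokens.get(clusterId);
        if (clusterTokens != null) {
            clusterTokens.remove(tokenId);
        }
    }

    /** Drops expired tokens, then returns the number of requests in-flight. */
    int inFlight(String clusterId, long nowMs) {
        ConcurrentMap<String, Long> clusterTokens = tokens.get(clusterId);
        if (clusterTokens == null) {
            return 0;
        }
        clusterTokens.values().removeIf(expiry -> expiry <= nowMs);
        return clusterTokens.size();
    }

    public static void main(String[] args) {
        InFlightTracker tracker = new InFlightTracker();
        long now = System.currentTimeMillis();
        tracker.addToken("appserver", "msg-1", now + 60000L);
        tracker.addToken("appserver", "msg-2", now + 60000L);
        tracker.removeToken("appserver", "msg-1");
        System.out.println("in-flight: " + tracker.inFlight("appserver", now));
    }
}
```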

Decision making functions

The minimum number of instances and the maximum number of instances of service clusters are always respected. The system always maintains at least the minimum number of service instances and will not scale beyond its maximum limit.

Calculations

The actual load is calculated from the number of requests in-flight as follows:

Average requests in-flight for a particular service cluster (avg) = total number of requests in-flight * (1/r)
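
As an illustration of this average, the sketch below keeps the most recent r in-flight readings for one cluster and divides their total by r. The class InFlightAverage is hypothetical, and the sliding-window behaviour (dropping the oldest reading once r readings are held) is an assumption; the ELB may collect and reset its readings differently.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical average over the last r task iterations for one service cluster. */
final class InFlightAverage {
    private final int rounds;                          // rounds_to_average (r)
    private final Deque<Integer> samples = new ArrayDeque<>();

    InFlightAverage(int rounds) {
        if (rounds <= 0) {
            throw new IllegalArgumentException("rounds must be positive");
        }
        this.rounds = rounds;
    }

    /** Records the in-flight count seen in one task iteration. */
    void record(int inFlight) {
        if (samples.size() == rounds) {
            samples.removeFirst();                     // assumption: sliding window
        }
        samples.addLast(inFlight);
    }

    /** True once r readings are available, so a decision can be made. */
    boolean isReady() {
        return samples.size() == rounds;
    }

    /** avg = total number of requests in-flight * (1/r). */
    double average() {
        long total = 0;
        for (int sample : samples) {
            total += sample;
        }
        return (double) total / rounds;
    }

    public static void main(String[] args) {
        InFlightAverage avg = new InFlightAverage(3);
        avg.record(100);
        avg.record(200);
        avg.record(300);
        System.out.println("ready=" + avg.isReady() + ", avg=" + avg.average());
    }
}
```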

Scaling Up

The following calculation gives the maximum number of requests that a service instance is expected to withstand. This, in turn, is used to decide when to scale up:

Number of maximum requests that a service instance can withstand over an autoscaler task interval (maxRpt) = (Rps) * (t/1000) * (AUR)

We then decide to scale up if:

avg > maxRpt * (number of running instances of this service cluster)
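
As a worked example with the default values, maxRpt = 100 * (30000/1000) * 0.7 = 2100, so a cluster with two running instances is scaled up once avg exceeds 4200. The sketch below encodes the two formulas above; the class and method names are hypothetical and are not taken from ServiceRequestsInFlightAutoscaler.

```java
/** Hypothetical helper that applies the scale-up formulas from this section. */
final class ScaleUpCheck {

    /** maxRpt = Rps * (t / 1000) * AUR, with t given in milliseconds. */
    static double maxRpt(int rps, long taskIntervalMs, double alarmingUpperRate) {
        return rps * (taskIntervalMs / 1000.0) * alarmingUpperRate;
    }

    /** Scale up if avg > maxRpt * (number of running instances). */
    static boolean shouldScaleUp(double avg, double maxRpt, int runningInstances) {
        return avg > maxRpt * runningInstances;
    }

    public static void main(String[] args) {
        double limit = maxRpt(100, 30000L, 0.7);       // default values from this document
        System.out.println("maxRpt = " + limit);
        System.out.println("scale up at avg 4500 with 2 instances: "
                + shouldScaleUp(4500.0, limit, 2));
        System.out.println("scale up at avg 4000 with 2 instances: "
                + shouldScaleUp(4000.0, limit, 2));
    }
}
```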

Scaling down

The following calculation gives an imaginary lower bound value. This, in turn, is used to decide when to scale down:

Imaginary lower bound value (minRpt) = (Rps) * (t/1000) * (ALR) * (SDF)

We then decide to scale down if:

avg < minRpt * (number of running instances of this service cluster - 1)
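
As a worked example with the default values, minRpt = 100 * (30000/1000) * 0.2 * 0.25 = 150, so a cluster with three running instances is scaled down only when avg falls below 150 * 2 = 300. The sketch below encodes these formulas. The class and method names are hypothetical, and the explicit check against the minimum instance count is an assumption about how the minimum is enforced.

```java
/** Hypothetical helper that applies the scale-down formulas from this section. */
final class ScaleDownCheck {

    /** minRpt = Rps * (t / 1000) * ALR * SDF, with t given in milliseconds. */
    static double minRpt(int rps, long taskIntervalMs, double alarmingLowerRate,
                         double scaleDownFactor) {
        return rps * (taskIntervalMs / 1000.0) * alarmingLowerRate * scaleDownFactor;
    }

    /** Scale down if avg < minRpt * (running instances - 1), never below the minimum. */
    static boolean shouldScaleDown(double avg, double minRpt, int runningInstances,
                                   int minInstances) {
        if (runningInstances <= minInstances) {
            return false;                              // assumption: minimum is enforced here
        }
        return avg < minRpt * (runningInstances - 1);
    }

    public static void main(String[] args) {
        double bound = minRpt(100, 30000L, 0.2, 0.25); // default values from this document
        System.out.println("minRpt = " + bound);
        System.out.println("scale down at avg 250 with 3 instances: "
                + shouldScaleDown(250.0, bound, 3, 1));
        System.out.println("scale down at avg 350 with 3 instances: "
                + shouldScaleDown(350.0, bound, 3, 1));
    }
}
```

Note that with a single running instance the right-hand side of the condition is zero, so a non-negative average never triggers a scale down.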

Plugging in your own implementation

Follow the steps below to plug in your own implementation:

Write your own Java implementation that implements the org.apache.synapse.task.Task and org.apache.synapse.ManagedLifecycle interfaces.
Wrap the implementation class in an OSGi bundle and deploy it in WSO2 ELB.

Point to that class in the loadbalancer section of the {ELB_HOME}/repository/conf/loadbalancer.conf file as follows:

loadbalancer {
…....
# autoscaling decision making task
autoscaler_task  org.wso2.carbon.mediator.autoscale.lbautoscale.task.ServiceRequestsInFlightAutoscaler;
…...
}