Providing autoscaling-related parameters
A set of parameters is provided to calibrate the elasticity of the system. These parameters are set in the following manner, based on the type of service:
- WSO2 Carbon services
Define the autoscaling-related parameters before WSO2 Elastic Load Balancer (ELB) starts up. (Dynamic configuration for Carbon servers will be supported soon.)
- Non-Carbon services
Define the autoscaling parameters when subscribing to a Cartridge.
Description of autoscaling-related parameters
The configuration of the autoscaling-related parameters varies based on the type of service. For Carbon services, the parameters are configured in the loadbalancer.conf file, while for non-Carbon services they are configured at the time of subscription.
- autoscaler_task_interval (t)
The time period between two iterations of the ‘autoscaling decision making’ task. When configuring this value, consider the time that a service instance takes to join the ELB. The value is in milliseconds and the default is 30000 ms.
- max_requests_per_second (Rps)
The number of requests per second that a service instance can withstand. It is recommended to calibrate this value for each service instance and, if needed, for different scenarios. Load testing a similar service instance is an ideal way to estimate this value. The default value is 100.
- rounds_to_average (r)
An autoscaling decision will be made only after this many iterations of the ‘autoscaling decision making’ task. The default value is 10.
- alarming_upper_rate (AUR)
The upper bound of the alarming rate, which provides a hint on when to scale up the system. The system is scaled up when it reaches the request capacity corresponding to alarming_upper_rate, without waiting until the service instances reach their maximum request capacity. This value should satisfy 0 < AUR <= 1 and the default is 0.7.
- alarming_lower_rate (ALR)
The lower bound of the alarming rate, which provides a hint on when to scale down the system. This value should satisfy 0 < ALR <= 1 and the default is 0.2.
- scale_down_factor (SDF)
This factor is used to slow down the scale-down process; scaling down slowly reduces the chance of scaling down due to false-positive events. This value should satisfy 0 < SDF <= 1 and the default is 0.25.
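To see how these parameters interact: with max_requests_per_second = 100 and alarming_upper_rate = 0.7, the autoscaler treats roughly 70 requests per second per instance, averaged over rounds_to_average iterations, as the point to scale up. For a Carbon service, these parameters go into loadbalancer.conf; the snippet below is only an illustrative sketch using the default values above, and the surrounding entries (such as the enable_autoscaler flag) and the exact placement of each parameter may differ between ELB versions.

loadbalancer {
    # run the autoscaling decision-making task every 30 seconds
    enable_autoscaler        true;
    autoscaler_task_interval 30000;
}

services {
    defaults {
        # per-instance capacity and the averaging window
        max_requests_per_second 100;
        rounds_to_average       10;
        # scale up at 70% of capacity; consider scaling down below 20%
        alarming_upper_rate     0.7;
        alarming_lower_rate     0.2;
        # slow the scale-down decision to avoid reacting to false positives
        scale_down_factor       0.25;
    }
}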
Service instances
Setting a lower limit on service instances
It is possible to set a lower limit on the number of service instances that are maintained in the system at any given time. This is done by setting the min_app_instances parameter for a service cluster; the autoscaler ensures that the system does not scale down below this value, even when there are no considerable service requests in-flight.
Setting an upper limit on service instances
The user can set the max_app_instances parameter for any service cluster to control the maximum number of instances that the autoscaler can start. The autoscaler ensures that the system does not scale up above the specified limit, even under a high load of in-flight requests. Setting the max_app_instances parameter is particularly useful when you pay for each instance that is started.
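For Carbon services, both limits sit alongside the other per-service parameters in loadbalancer.conf. The following is a minimal sketch of the relevant defaults block (continuing the sketch above); the values 1 and 5 are examples only.

services {
    defaults {
        # keep at least 1 instance running and never start more than 5
        min_app_instances 1;
        max_app_instances 5;
    }
}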
Autoscaling test
The following is a simple autoscaling test based on a PHP Cartridge:
- Load the Stratos2 CLI tool and subscribe to the PHP Cartridge as follows:
stratos>subscribe <cartridge-type> <alias> --policy <policy-name>
Example:
stratos>subscribe php nirmalphp --policy elastic
Execution of the above command results in starting up a PHP service instance with a Git repository.
- Push a PHP application to the Git repository that was created for you. For testing purposes, you can add a PHP application that does nothing other than sleep for 30 seconds; a minimal example is shown after these steps. Seconds after committing your app, you should be able to access it.
- Write a small JMeter test script to load your PHP application.
After a while, you should notice that the nodes scale up (provided you load the PHP application heavily enough). You should also notice the extra nodes scaling back down once the load test is over.
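The test application itself can be a single PHP page that sleeps for 30 seconds before responding, which keeps requests in-flight long enough for the autoscaler to notice the load. A minimal sketch (the file name index.php is only an example):

<?php
// index.php: hold each request for 30 seconds so that concurrent requests
// stay in-flight and push the per-instance load towards the scale-up threshold.
sleep(30);
echo "Done sleeping.";
?>

Any JMeter test plan that fires a few hundred concurrent requests at this page is sufficient; running it in non-GUI mode (for example, jmeter -n -t <your-test-plan>.jmx) makes it easier to keep the load steady for several autoscaler iterations.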
Sample Configurations
Please see Sample Autoscaling Configurations.