Download the latest Performance Test Results for WSO2 API Manager 2.1.0 from here. If you need any additional information, please contact us.


The following sections analyze the results of the WSO2 API Manager performance tests carried out in the Amazon EC2 environment.

Deployment

The following are the details of the WSO2 API-M 2.1.0 deployment that was used for the tests.

  • Four (4) EC2 t2.xlarge instances are used for the deployment. Each instance has 4 vCPUs and 16GB of memory.
  • The operating system is Ubuntu 16.04.2 LTS. 
  • Apache JMeter version 3.2 is used in this deployment.
    Apache JMeter is the preferred load testing tool at WSO2. Because the tests use a high number of concurrent users, we increased the number of sockets available on the instance that runs JMeter.
    • The following commands were run on the JMeter instance.

      sudo sysctl -w net.ipv4.ip_local_port_range="1025 65535"
      sudo sysctl -w net.ipv4.tcp_tw_reuse=1
    • The first command widens the local port range available for outgoing client connections.

    • The second command allows sockets in the TIME_WAIT state to be reused. Both are safe settings to use on the client side.
      For more information, see https://vincent.bernat.im/en/blog/2014-tcp-time-wait-state-linux

  • The ulimit (the maximum number of open file descriptors) was increased on all servers, as sketched below.
    For more information, see Tuning Performance.
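A minimal sketch of raising the open-file-descriptor limit follows; the value 65535 is illustrative and not necessarily the exact value used in these tests.

      # Check the current limit on open file descriptors
      ulimit -n
      # Raise the limit for the current shell; persistent changes are
      # usually made in /etc/security/limits.conf (illustrative value)
      ulimit -n 65535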

Backend Service

The backend service used for testing was developed using Netty. Since there can be up to 3000 concurrent users, we used 3000 threads for Netty in order to avoid any bottlenecks in the backend. The Netty server also has a parameter that can be used to simulate delays in the backend service by simply specifying a sleep time in seconds.
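A hypothetical invocation of such a backend is sketched below; the jar name and both option names are illustrative and do not describe the actual interface of the service used in these tests.

      # Start the echo backend with a large worker pool and a simulated
      # 1-second delay (jar and option names are hypothetical)
      java -jar netty-echo-service.jar --worker-threads 3000 --sleep-time 1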

Test Scenario

The test scenario focuses on invoking an API through API Manager, which in turn calls the backend service that echoes the request back. Tests were done using 100, 200, 300, 1000, 2000, and 3000 concurrent users.
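As an illustration, a single invocation of such an API might look like the following; the API context, payload, and access token are hypothetical, while 8280 is the default HTTP port of the API-M gateway.

      # Invoke a hypothetical 'echo' API through the API Manager gateway
      # (context, payload, and token are placeholders)
      curl -X POST http://gateway-host:8280/echo/1.0 \
           -H "Authorization: Bearer <access-token>" \
           -d '{"hello":"world"}'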

Measuring Performance

Two key performance metrics are used to measure the performance, namely Latency and Throughput. Throughput measures the number of messages that a server processes during a specific time interval (e.g., per second). The throughput is calculated using the following equation.

Throughput = Number of requests / Time to complete the requests
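For example, with illustrative numbers, if 60,000 requests complete in 120 seconds, then

Throughput = 60000 requests / 120 seconds = 500 requests/second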

Latency measures the end-to-end processing time for an operation. Every operation has its own latency. Therefore, we are interested in how latency behaves. In order to see this behavior, we must have the complete distribution of latencies.

Performance Testing Tool

In Apache JMeter, we specified the number of concurrent users, ran the test, and obtained the results presented under the Performance Test Results section. The terminology used in that section is explained in the list below.
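For reference, such a test run can be launched from the JMeter instance in non-GUI mode as sketched below; the test plan name, property name, and user count are assumptions rather than the exact files used in these tests.

      # Run JMeter in non-GUI mode with a configurable number of users
      # (apim_test.jmx and the 'users' property are hypothetical)
      jmeter -n -t apim_test.jmx -Jusers=1000 -l results.jtl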

  • Error Count - The number of request errors recorded.
  • Error % - The percentage of requests that resulted in errors.
  • Average - The average response time of a set of results.
  • Min - The shortest time taken for a request.
  • Max - The longest time taken for a request.
  • 90th Percentile - 90% of the requests took no more than this time. The remaining samples took at least as long.
  • 95th Percentile - 95% of the requests took no more than this time. The remaining samples took at least as long.
  • 99th Percentile - 99% of the requests took no more than this time. The remaining samples took at least as long.
  • Throughput - The throughput, measured in requests per second.
  • Received KB/sec - The throughput, measured in kilobytes received per second.
  • Sent KB/sec - The throughput, measured in kilobytes sent per second.


Think time is the delay between two requests. For example, a think time of 1 second means that when there are 2000 concurrent users, each user sends the next request 1 second after receiving the response to the previous request. In a realistic scenario, there is always some think time between requests, and with it the performance numbers obtained are much better.

Measuring latency using a single request (e.g., sending a cURL request while the load test is running) may not be useful. This is why we keep the latencies of each request and get statistics from the complete distribution of latencies.
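As an illustration, percentiles can be computed directly from the JMeter results file. A minimal sketch, assuming the default CSV JTL format in which the second column is the elapsed time in milliseconds; results.jtl is a hypothetical file name.

      # Compute the 90th percentile of response times from a CSV JTL file
      tail -n +2 results.jtl | cut -d, -f2 | sort -n | \
          awk '{a[NR]=$1} END {print "p90:", a[int(NR*0.90)], "ms"}'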

In addition to the above details, we also collected the following details for every test carried out for a given number of concurrent users.

  • Load Average - The load average of the system, taken from the sar (System Activity Report) output; a sar sketch follows this list. We took the maximum load average reported during the test period.
  • GC Throughput - The percentage of time that the application was not busy with garbage collection (GC); see the formula given below.
  • Total heap usage - Max memory usage in total reserved heap.
  • Allocated Max - Max memory allocated for the heap by the Java Virtual Machine (JVM).
  • Heap Usage - Total heap usage as a percentage of maximum allocated heap.
  • Max heap after full GC - Maximum size of live objects used by the application.
  • Max heap after full GC % - The maximum size of live objects as a percentage of maximum allocated heap.
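As a sketch, the load average can be sampled during a test run using sar; the 60-second interval below is illustrative.

      # Report the run queue length and load averages every 60 seconds
      sar -q 60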

The last six details were obtained from the GC logs produced by WSO2 API Manager.
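For clarity, the GC throughput can be expressed with the following formula; the numbers in the example are illustrative.

GC Throughput = (Total run time - Total GC pause time) / Total run time x 100%

For example, if GC pauses total 12 seconds over a 600-second test, the GC throughput is (600 - 12) / 600 x 100% = 98%.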

The following are the GC flags that were used.

-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:"$CARBON_HOME/repository/logs/gc.log"

The process memory was also obtained for each test. However, we did not include those values, as the JVM operates within an already reserved heap area.

Performance Tuning of WSO2 API Manager

The heap size of WSO2 API Manager was increased to 4GB; this is the only change made to WSO2 API Manager that had an impact on performance.

When doing the performance tests, the heap memory was initially set to 2GB, as this is the recommended heap size for WSO2 API Manager. However, when the tests were done with large numbers of concurrent users using the 2GB heap, the observed GC throughput was less than 90%. We recommend that the GC throughput be maintained above 90%; therefore, we increased the heap memory to 4GB, as sketched below, and ran a similar set of tests.
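A sketch of the corresponding JVM options follows; in WSO2 products these flags are typically set in the product startup script (e.g., $CARBON_HOME/bin/wso2server.sh), although the exact location may vary.

-Xms4g -Xmx4g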

We also increased the socket timeouts in WSO2 API Manager. However, this was not needed, as the performance results of WSO2 API Manager never showed latencies greater than 60 seconds, which is the default socket timeout.

The number of worker threads used was 400 (the default core pool size), and the number of threads may grow up to 500 (the default maximum); see the sketch below.
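For reference, these pool sizes correspond to the pass-through HTTP transport settings; a sketch of the relevant entries, assuming they are read from repository/conf/passthru-http.properties as in other WSO2 products (verify against your API-M 2.1.0 distribution).

      # Default worker pool sizes for the pass-through HTTP transport
      # (property names assumed; verify against the distribution)
      worker_pool_size_core=400
      worker_pool_size_max=500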

Performance Test Results

Scenario 1: Echo API in WSO2 API Manager

There is no delay in the backend service. This test was done mainly to see the maximum throughput of the WSO2 API Manager in the EC2 environment.

Scenario 2: Echo API in WSO2 API Manager and 1 second Think-Time

The test mentioned in Scenario 1 was repeated with a 1 second think-time. This means that each user (in JMeter) waits one second after receiving a response before sending the next request. In a realistic scenario, there will always be some think-time in between requests. This test was done to understand the performance of WSO2 API Manager when there is a think time.

There is no delay in the backend service.

Comparison - Echo API with no think-time vs. 1sec think-time

The throughput with a 1 second think-time is much less than the throughput without the think-time, because adding a think-time reduces the arrival rate of the requests.

Request latencies are better because the arrival rate of requests is lower.

The load average is also much better due to the lower arrival rate of requests.

Conclusion

We analyzed the performance of WSO2 API Manager using 100, 200, 300, 1000, 2000, and 3000 concurrent users, with backend delays of 1, 5, and 30 seconds.

Except for increasing the heap size of API Manager from the default 2GB to 4GB, no other specific optimization techniques (i.e., performance tuning) were used to optimize the performance of WSO2 API Manager.

When there is no think-time added after a request, each user sends the next request as soon as the response to the previous request is received; for example, this holds for each of the 2000 concurrent users in that test. In a realistic scenario, there will always be some think-time in between requests, and with it the latency numbers obtained are much better. The load average is also much better with think-time.
