Performance Tuning
This section describes recommended configurations for tuning the performance of WSO2 Product SP. It assumes that you have set up WSO2 Product SP on a server running Unix/Linux, which is recommended for a production deployment.
Important
- Performance tuning requires you to modify important system files, which affect all programs running on the server. We recommend that you familiarize yourself with these files using the Unix/Linux documentation before editing them.
- The parameter values we discuss below are just examples. They might not be the optimal values for the specific hardware configurations in your environment. We recommend that you carry out load tests on your environment to tune the product accordingly.
OS-Level Settings
To optimize network and OS performance, configure the following settings in the /etc/sysctl.conf file of Linux. These settings specify a larger port range, a more effective TCP connection timeout value, and a number of other important parameters at the OS level. Note that net.ipv4.tcp_tw_recycle was removed in Linux kernel 4.12, so omit that line on newer kernels.
    net.ipv4.tcp_fin_timeout = 30
    fs.file-max = 2097152
    net.ipv4.tcp_tw_recycle = 1
    net.ipv4.tcp_tw_reuse = 1
    net.core.rmem_default = 524288
    net.core.wmem_default = 524288
    net.core.rmem_max = 67108864
    net.core.wmem_max = 67108864
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 65536 16777216
    net.ipv4.ip_local_port_range = 1024 65535
When the lower bound of the local port range is set to 1024, other processes may pick ports that are already used by WSO2 servers. Therefore, it is advisable to raise the lower bound to a value that is sufficient for production, e.g., 10000.
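The effect of the lower bound can be checked with a short script. The following is a minimal sketch, not part of the product: the helper names and the list of listening ports are hypothetical and only illustrate the overlap check.

```python
def parse_port_range(value):
    """Parse a 'low high' string as used by net.ipv4.ip_local_port_range."""
    low, high = (int(part) for part in value.split())
    if not 0 < low <= high <= 65535:
        raise ValueError(f"invalid port range: {value!r}")
    return low, high


def ports_in_ephemeral_range(port_range, server_ports):
    """Return the server ports that fall inside the ephemeral port range."""
    low, high = port_range
    return sorted(p for p in server_ports if low <= p <= high)


# Hypothetical listening ports, used only for illustration.
server_ports = [9443, 9763, 7611, 7711]

# With a lower bound of 1024, every example port can be taken by another process.
print(ports_in_ephemeral_range(parse_port_range("1024 65535"), server_ports))
# With a lower bound of 10000, none of the example ports overlap.
print(ports_in_ephemeral_range(parse_port_range("10000 65535"), server_ports))
```

On a running Linux host, the current range can be read from /proc/sys/net/ipv4/ip_local_port_range and passed to parse_port_range as a string.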
To alter the number of open files allowed for system users, configure the following settings in the /etc/security/limits.conf file of Linux.
    * soft nofile 4096
    * hard nofile 65535
Optimal values for these parameters depend on the environment.
To alter the maximum number of processes your user is allowed to run at a given time, configure the following settings in the /etc/security/limits.conf file of Linux (be sure to include the leading * character). Each Carbon server instance you run requires up to 1024 threads (with the default thread pool configuration). Therefore, you need to increase the nproc value by 1024 for each Carbon server (both hard and soft).
    * soft nproc 20000
    * hard nproc 20000
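The nproc arithmetic described above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the starting limit of 4096 is an assumed value, not a recommendation.

```python
# From the text: up to 1024 threads per Carbon server with the default thread pool.
THREADS_PER_CARBON_SERVER = 1024


def required_nproc(base_nproc, carbon_servers):
    """Return the nproc limit after reserving threads for each Carbon server."""
    if base_nproc < 0 or carbon_servers < 0:
        raise ValueError("values must be non-negative")
    return base_nproc + THREADS_PER_CARBON_SERVER * carbon_servers


def limits_conf_lines(nproc):
    """Format the soft and hard nproc entries for /etc/security/limits.conf."""
    return [f"* soft nproc {nproc}", f"* hard nproc {nproc}"]


# base_nproc=4096 is a hypothetical starting limit, used only for illustration.
for line in limits_conf_lines(required_nproc(4096, 3)):
    print(line)
```

For three Carbon servers on top of the assumed base of 4096, the sketch reserves 3 x 1024 additional threads and applies the same value to both the soft and the hard limit.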
JVM settings
The recommended JVM memory allocation is -Xmx4g and -Xms2g.
When an XML element has a large number of sub-elements and the system tries to process all the sub-elements, the system can become unstable due to a memory overhead. This is a security risk.
To avoid this issue, you can define a maximum level of entity substitutions that the XML parser allows in the system. You do this using the entity expansion limit attribute in the <SP_HOME>/bin/editor.bat file (for Windows) or the <SP_HOME>/bin/editor.sh file (for Linux/Solaris). The default entity expansion limit is 64000.
-DentityExpansionLimit=100000
JDBC Pool Configuration
Within the WSO2 platform, we use Tomcat JDBC pooling as the default pooling framework due to its production-ready stability and high performance. The table below gives some recommendations on how to configure the JDBC pool using the <PRODUCT_HOME>/repository/conf/datasources/master-datasources.xml file.
Property | Description | Recommendation |
---|---|---|
maxActive | The maximum number of active connections that can be allocated from the connection pool at the same time. The default value is | This value should match the maximum number of requests that can be expected at a time in your production environment. This is to ensure that, whenever there is a sudden increase in the number of requests to the server, all of them can be connected successfully without causing any delays. Note that this value should not exceed the maximum number of requests allowed for your database. |
testOnBorrow | The indication of whether connection objects will be validated before they are borrowed from the pool. If the object validation fails, it will be dropped from the pool, and we will attempt to borrow another connection. | Setting this property to 'true' is recommended as it will avoid connection requests from failing. The |
validationInterval | To avoid excess validation, run validation at most at this frequency (time in milliseconds). If a connection is due for validation, but has been validated previously within this interval, it will not be validated again. The default value is | This timeout can be as high as the time it takes for your DBMS to declare a connection as stale. For example, MySQL will keep a connection open for as long as 8 hours, which requires the validation interval to be within that range. However, note that having a low value for the validation interval will not incur a big performance penalty, especially when database requests have a high throughput. For example, a single extra validation query run every 30 seconds is usually negligible. |
validationQuery | The SQL query used to validate connections from this pool before returning them to the caller. If specified, this query does not have to return any data; it just must not throw an SQLException. The default value is null. Example values are SELECT 1 (MySQL), select 1 from dual (Oracle), and SELECT 1 (MS SQL Server). | Specify an SQL query that validates the availability of a connection in the pool. This query is necessary when the testOnBorrow property is true. |
When it comes to web applications, users are free to experiment and package their own pooling framework, such as BoneCP.
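The interaction between testOnBorrow and validationInterval described in the table above can be illustrated with a simplified model. This is a sketch of the idea only, not Tomcat JDBC's implementation: the PooledConnection class, the borrow function, and the fake validator are all hypothetical.

```python
import time


class PooledConnection:
    """Minimal stand-in for a pooled connection (hypothetical, not Tomcat's class)."""

    def __init__(self):
        self.last_validated = None  # monotonic timestamp in seconds, or None


def borrow(conn, validate, validation_interval_ms, now=None):
    """Validate on borrow, skipping the query if one ran within the interval.

    Returns True if the connection may be handed to the caller.
    """
    now = time.monotonic() if now is None else now
    recently_checked = (
        conn.last_validated is not None
        and (now - conn.last_validated) * 1000 < validation_interval_ms
    )
    if recently_checked:
        return True  # validated recently enough, so the query is skipped
    if validate(conn):
        conn.last_validated = now
        return True
    return False  # the caller should drop this connection and try another


calls = []
conn = PooledConnection()


def fake_validate(c):
    calls.append(c)  # stands in for running the validationQuery
    return True


borrow(conn, fake_validate, 30000, now=0.0)   # first borrow: the query runs
borrow(conn, fake_validate, 30000, now=10.0)  # within 30 s: the query is skipped
borrow(conn, fake_validate, 30000, now=45.0)  # interval elapsed: the query runs again
print(len(calls))  # 2
```

In this model, three borrows within 45 seconds trigger only two validation queries, which is why a short interval adds little overhead under high request throughput.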
SP-Level settings
Performance tuning can be tried out in the following areas at the SP level. The performance is considered in terms of throughput per second (TPS) and latency.
Receiving events
The following parameters, which affect the performance of databridge communication, are configured in the <SP_HOME>/conf/editor/deployment.yaml file under the data-bridge-config property. These configurations are common to both the Thrift and binary protocols.
Property | Description | Default Value | Recommendation |
---|---|---|---|
workerThreads | The number of threads reserved to handle the load of events received. | 10 | This value should be increased if you want to increase the throughput by receiving a higher number of events at a given time. The number of available CPU cores should be considered when specifying this value. If the value specified exceeds the number of CPU cores, higher latency would occur as a result of context switching taking place more often. |
maxEventBufferCapacity | The maximum size allowed for the event receiving buffer in bytes. The event receiving buffer temporarily stores the events received before they are forwarded to an event stream. | 10000000 | This value should be increased when there is an increase in the receiving throughput. When you increase this value, the heap memory size also needs to be increased accordingly. |
eventBufferSize | The number of messages that is allowed in the receiving queue at a given time. | 2000 | This value should be increased when there is an increase in the receiving throughput. |
clientTimeoutMin | Session timeout value in minutes. | 30 | The cache that contains all the agent sessions expires after this value is reached. This value should be increased when there is an |
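
The workerThreads recommendation in the table above (avoid exceeding the number of CPU cores) can be sketched as a simple cap. The function name is hypothetical, and this is one possible reading of the guidance rather than a rule enforced by the product.

```python
import os


def suggest_worker_threads(requested, cpu_cores=None):
    """Cap a requested workerThreads value at the number of CPU cores."""
    if cpu_cores is None:
        cpu_cores = os.cpu_count() or 1  # os.cpu_count() may return None
    if requested < 1:
        raise ValueError("requested must be at least 1")
    return min(requested, cpu_cores)


print(suggest_worker_threads(10, cpu_cores=8))   # 8
print(suggest_worker_threads(10, cpu_cores=16))  # 10
```

Load tests on the target hardware remain the deciding factor; the cap only provides a starting point.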
Publishing events
The following parameters, which affect the performance of the data agents that publish events through databridge, are configured in the <SP_HOME>/conf/editor/deployment.yaml file under the data.agent.config property. These configurations are common to both the Thrift and binary protocols.
Property | Description | Default Value | Recommendation |
---|---|---|---|
queueSize | The size of the queue event disruptor which handles events before they are published to an application/data store. | 32768 | The value specified should always be a power of 2 (e.g., 32768 is 2^15). A higher value should be specified when a higher throughput needs to be handled. However, the increase in the load handled at a given time can reduce the speed at which the events are processed. Therefore, a lower value should be specified if you want to reduce the latency. |
batchSize | The maximum number of events in a batch sent to the queue event disruptor at a given time. | 200 | This value should be assigned proportionally to the throughput of events handled. The greater the batch size, the higher the number of events sent to the queue event disruptor at a given time. |
corePoolSize | The number of threads that will be reserved to handle events at the time you start the CEP server. This value will increase as the throughput of events handled increases, but it will not exceed the value specified for the maxPoolSize parameter. | 1 | The number of available CPU cores should be taken into account when specifying this value. Increasing the core pool size may improve the throughput, but latency will also increase due to context switching. |
maxPoolSize | The maximum number of threads that should be reserved at any given time to handle events. | 1 | The number of available CPU cores should be taken into account when specifying this value. Increasing the maximum pool size may improve the throughput since more threads can be spawned to handle an increased number of events. However, latency will also increase since a higher number of threads would cause context switching to take place more frequently. |
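
Because queueSize must be a power of 2, it can help to validate a candidate value before adding it to the configuration. The helpers below are a minimal sketch with hypothetical names and are not part of the product.

```python
def is_power_of_two(n):
    """Return True if n is a positive power of 2 (1, 2, 4, 8, ...)."""
    return n > 0 and (n & (n - 1)) == 0


def next_power_of_two(n):
    """Return the smallest power of 2 that is greater than or equal to n."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return 1 << (n - 1).bit_length()


print(is_power_of_two(32768))    # True
print(is_power_of_two(20000))    # False
print(next_power_of_two(20000))  # 32768
```

A value such as 20000 would therefore be rounded up to 32768, while 256 and 32768 pass the check unchanged.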
For better throughput, you can configure the parameters as follows.
    queueSize: 32768
    batchSize: 200
    corePoolSize: 1
    socketTimeoutMS: 30000
    maxPoolSize: 1
    keepAliveTimeInPool: 20
    reconnectionInterval: 30
    maxTransportPoolSize: 250
    maxIdleConnections: 250
    evictionTimePeriod: 5500
    minIdleTimeInPool: 5000
    secureMaxTransportPoolSize: 250
    secureMaxIdleConnections: 250
    secureEvictionTimePeriod: 5500
    secureMinIdleTimeInPool: 5000
For reduced latency, you can configure the parameters as follows.
    <QueueSize>256</QueueSize>
    <BatchSize>200</BatchSize>
    <CorePoolSize>1</CorePoolSize>
    <MaxPoolSize>1</MaxPoolSize>