This section describes recommended configurations for tuning the performance of WSO2 DAS. It assumes that you have set up WSO2 DAS on a server running Unix/Linux, which is recommended for a production deployment.


...

| Property | Description | Recommendation |
|---|---|---|
| maxActive | The maximum number of active connections that can be allocated from the connection pool at the same time. The default value is 100. | This value should match the maximum number of requests that can be expected at a time in your production environment. This ensures that, whenever there is a sudden increase in the number of requests to the server, all of them can be connected successfully without delays. Note that this value should not exceed the maximum number of requests allowed for your database. |
| minIdle | The minimum number of connections that can remain idle in the pool without extra ones being created. The connection pool can shrink below this number if validation queries fail. The default value is 0. | This value should be close to the average number of requests that the server receives at the same time. With this setting, you can avoid having to open and close new connections every time the server receives a request. |
| testOnBorrow | Indicates whether connection objects are validated before they are borrowed from the pool. If the validation fails, the object is dropped from the pool and an attempt is made to borrow another connection. | Setting this property to true is recommended because it prevents connection requests from failing. The validationQuery property should be used if testOnBorrow is set to true. To increase the efficiency of connection validation and to improve performance, the validationInterval property should also be used. |
| validationInterval | To avoid excess validation, validation is run at most at this frequency (time in milliseconds). If a connection is due for validation but has been validated previously within this interval, it is not validated again. The default value is 30000 (30 seconds). | This timeout can be as high as the time it takes for your DBMS to declare a connection stale. For example, MySQL keeps a connection open for as long as 8 hours, which requires the validation interval to be within that range. However, note that a low value for the validation interval does not incur a big performance penalty, especially when database requests have a high throughput. For example, a single extra validation query run every 30 seconds is usually negligible. |
| validationQuery | The SQL query used to validate connections from this pool before returning them to the caller. If specified, this query does not have to return any data; it just must not throw an SQLException. The default value is null. Example values are SELECT 1 (MySQL), SELECT 1 FROM DUAL (Oracle), and SELECT 1 (MS SQL Server). | Specify an SQL query that validates the availability of a connection in the pool. This query is necessary when the testOnBorrow property is true. |
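
The way testOnBorrow and validationInterval work together can be sketched as follows. This is only an illustration of the rule described above, not the connection pool's actual implementation; the function name `needs_validation` and its parameters are hypothetical.

```python
def needs_validation(now_ms, last_validated_ms,
                     validation_interval_ms=30000, test_on_borrow=True):
    """Decide whether a connection should be validated before it is borrowed.

    Hypothetical helper: all times are in milliseconds, and the default
    interval mirrors the documented default of 30000 (30 seconds).
    """
    if not test_on_borrow:
        # Validation on borrow is switched off entirely.
        return False
    # Skip validation if the connection was validated within the interval.
    return (now_ms - last_validated_ms) >= validation_interval_ms


# A connection validated 10 seconds ago is handed out without a new query.
print(needs_validation(now_ms=10_000, last_validated_ms=0))
# A connection last validated 40 seconds ago is validated again.
print(needs_validation(now_ms=40_000, last_validated_ms=0))
```

With a high request throughput, most borrows fall inside the interval, so the validation query runs far less often than once per request.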

...

| Property | Description | Recommendation |
|---|---|---|
| workerThreads | The number of threads in the consumer that handle the load. The default value is 10. | Use a higher value when the receiving throughput is high. The number of CPU cores should also be considered. |
| eventBufferCapacity | The size of the receiving event buffer. The default value is 10000. | Use a higher value when the receiving throughput is high. When you increase this value, the heap memory size also needs to be increased accordingly. |

Publishing events

The following parameters, which affect the performance of publishing events, are configured in the <DAS_HOME>/repository/conf/data-bridge/data-agent-config.xml file. These configurations are common to both the Thrift and binary protocols.

...

Parameters to be configured are as follows.

Cores 

| Parameter | Default Value | Description |
|---|---|---|
| spark.executor.cores | All the available cores on the worker. | The number of cores to use on each executor. Setting this parameter allows an application to run multiple executors on the same worker, provided that there are enough cores on that worker. Otherwise, only one executor per application is run on each worker. |
| spark.cores.max | Int.MAX_VALUE | The maximum number of CPU cores to request for the application from across the cluster (not from each machine). |
| spark.worker.cores | 1 | The number of cores assigned to a worker. |
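
As a rough illustration of how spark.executor.cores determines the number of executors that fit on one worker, consider the sketch below. It uses a simplified model in which an executor fits only if the worker has enough free cores and enough free memory for it; the function name and the sample figures are hypothetical and are not taken from a DAS configuration.

```python
def executors_per_worker(worker_cores, worker_memory_gb,
                         executor_cores, executor_memory_gb):
    """Estimate how many executors of one application fit on a worker.

    Simplified model (an assumption): the count is limited by whichever
    resource, cores or memory, runs out first.
    """
    by_cores = worker_cores // executor_cores
    by_memory = int(worker_memory_gb // executor_memory_gb)
    return min(by_cores, by_memory)


# Hypothetical worker with 4 cores and 12 GB of memory.
print(executors_per_worker(4, 12, executor_cores=1, executor_memory_gb=3))
# If each executor takes all 4 cores, only one executor fits.
print(executors_per_worker(4, 12, executor_cores=4, executor_memory_gb=3))
```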

...

Here, there are resources for 16 executors with 16 cores and 48 GB of memory. With spark.cores.max = 12 (i.e. 3 x 4), 12 executors are assigned to the Carbon application, and the rest of the cores and memory can be assigned to another Spark application (i.e. 4 cores and 12 GB are available in the cluster and can be used by that application, depending on its preference).
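
The arithmetic behind this example can be checked with a short sketch. It assumes that every executor is the same size (16 cores and 48 GB shared by 16 executors gives 1 core and 3 GB each); the function name `split_cluster` is hypothetical.

```python
def split_cluster(total_cores, total_memory_gb, total_executors, cores_max):
    """Split cluster resources between one application and the remainder.

    Assumption: all executors are equal in size, so the per-executor share
    is the cluster total divided by the number of executors.
    """
    cores_per_executor = total_cores // total_executors
    memory_per_executor_gb = total_memory_gb / total_executors
    # spark.cores.max caps the cores, and hence the executors, of the application.
    assigned_executors = cores_max // cores_per_executor
    free_cores = total_cores - assigned_executors * cores_per_executor
    free_memory_gb = total_memory_gb - assigned_executors * memory_per_executor_gb
    return assigned_executors, free_cores, free_memory_gb


assigned, free_cores, free_memory_gb = split_cluster(16, 48, 16, cores_max=12)
print(assigned, free_cores, free_memory_gb)  # 12 4 12.0
```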