This section summarizes the results of performance tests carried out on the minimum fully distributed DAS deployment, with RDBMS (MySQL) and HBase event stores tested separately.


...

This test involved setting up a 10-node HBase cluster with HDFS as the underlying file system.

Versions

  • WSO2 DAS 3.1.0.
  • Apache Hadoop 2.7.2.
  • Apache HBase 1.2.1.
  • Oracle Java Development Kit (JDK) v1.7 update 51 (1.7.0_51-b13).

Infrastructure used

  • 3 DAS nodes (variable roles: publisher, receiver, analyzer and indexer): c4.2xlarge
  • 1 HBase master and Hadoop Namenode: c3.2xlarge
  • 9 HBase Regionservers and Hadoop Datanodes: c3.2xlarge

...

This test was designed to exercise the data layer under sustained event publication. During testing, the TPS hovered around the 150K mark; memstore flushes in the HBase cluster (which suspend all writes) and minor compaction operations brought it down in bursts. Overall, a mean of 96K TPS was achieved, although a steady rate of around 100-150K TPS is achievable with flow control, as opposed to the current no-flow-control situation.
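As a rough sketch of the kind of client-side flow control alluded to above, a rate limiter can cap the publish rate at a steady level instead of letting uncontrolled bursts trigger memstore flushes. The snippet below uses Guava's RateLimiter; the publishEvent() call and the 100K TPS target are illustrative assumptions, not the actual test harness.

import com.google.common.util.concurrent.RateLimiter;

public class ThrottledPublisher {
    public static void main(String[] args) {
        // Cap the publish rate at a steady 100,000 events/second (assumed target).
        RateLimiter limiter = RateLimiter.create(100_000);
        while (true) {
            limiter.acquire();  // blocks until a permit is available
            publishEvent();     // hypothetical: hand one event to the data publisher
        }
    }

    private static void publishEvent() {
        // Placeholder for the actual Thrift/databridge publish call.
    }
}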

...

Events: 16753779
Time (s): 1862.901
Mean TPS: 8993.381291

...


Microsoft SQL Server Event Store

Infrastructure used

  • c4.2xlarge Amazon EC2 instance as the DAS node
    • Linux kernel 4.44, java version "1.8.0_131", JVM flags: -Xmx4g -Xms2g
  • db.m4.2xlarge Amazon RDS instance with MS SQL Server Enterprise Edition 2016 as the database node
  • Customized Thrift client as the data publisher (Thrift producer found in samples)

...

  • c4.2xlarge Amazon EC2 instance as the DAS node
    • Linux kernel 4.44, java version "1.8.0_131", JVM flags: -Xmx4g -Xms2g
  • db.m4.2xlarge Amazon RDS instance with MySQL Community Edition version 5.7 as the database node
  • Customized Thrift client as the data publisher (Thrift producer found in samples)

Scenario: Persisting 12 million Process Monitoring events on

...

MySQL

This test involved persisting process monitoring events of approximately 180 bytes each. The test injected 12 million events into DAS at an input rate of 10,000 events per second.
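For reference, a minimal event publisher along the lines of the customized Thrift client might look like the sketch below. It uses the WSO2 databridge agent API; the stream name/version, endpoint URLs, credentials, trust store path, and payload fields are assumptions for illustration and must match the stream definition actually deployed in DAS.

import org.wso2.carbon.databridge.agent.DataPublisher;
import org.wso2.carbon.databridge.commons.utils.DataBridgeCommonsUtils;

public class ProcessMonitoringPublisher {
    public static void main(String[] args) throws Exception {
        // Trust store for the SSL auth endpoint (paths and password are assumptions).
        System.setProperty("javax.net.ssl.trustStore", "client-truststore.jks");
        System.setProperty("javax.net.ssl.trustStorePassword", "wso2carbon");

        // Thrift receiver (tcp) and secure auth (ssl) endpoints of the DAS node.
        DataPublisher publisher = new DataPublisher(
                "tcp://das-node:7611", "ssl://das-node:7711", "admin", "admin");

        // Hypothetical stream; must match a stream definition deployed in DAS.
        String streamId = DataBridgeCommonsUtils.generateStreamId(
                "org.wso2.sample.process.monitoring", "1.0.0");

        for (int i = 0; i < 12_000_000; i++) {
            // meta, correlation, and payload attribute arrays (payload fields assumed).
            publisher.publish(streamId, null, null,
                    new Object[]{"process-" + i, System.currentTimeMillis(), 180L});
        }
        publisher.shutdown();
    }
}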

...

MySQL Upper Limit

After around 12 million events are published, a sudden drop can be observed in receiver performance; this can be considered the upper limit of the MySQL event store. To continue receiving events without a major performance degradation, data has to be purged periodically before it reaches this upper limit. See https://docs.wso2.com/display/DAS310/Purging+Data for more information on configuring data purging.

If data purging is not possible, another type of event store, such as HBase, should be used.



Batch Analytics

The following topics describe the analyzer performance of WSO2 DAS. 

Scenario: Running Spark queries on 1 billion published smart home events

...

  • Over 1 million TPS on Spark for 2 analyzers
  • About 1.3 million TPS for 3 analyzers

The DAS GET operations on HBase make use of HBase data locality, which allows them to perform faster than random access.
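As an illustration of such a keyed GET (as opposed to a full scan), the sketch below uses the plain HBase 1.2 client API; the table, column family, and qualifier names are assumptions and do not reflect the actual DAS HBase schema.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class KeyedGetExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create(); // reads hbase-site.xml
        try (Connection connection = ConnectionFactory.createConnection(conf);
             Table table = connection.getTable(TableName.valueOf("events"))) {
            // A GET by row key is routed directly to the region server hosting
            // that key, exploiting data locality instead of scanning regions.
            Get get = new Get(Bytes.toBytes("record-id-42"));
            Result result = table.get(get);
            byte[] value = result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("data"));
            System.out.println(value == null ? "not found" : Bytes.toStringBinary(value));
        }
    }
}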

...

Scenario: Running Spark queries on 10 million smart home and Wikipedia events on MySQL Event Store

Spark analyzing performance (time to complete execution) was measured using a 2-node DAS analyzer cluster with a MySQL database.

...

Data set | Event Count | Query Type | Time Taken
Smart Home | 10000000 | INSERT OVERWRITE TABLE cityUsage SELECT metro_area, avg(power_reading) AS avg_usage, min(power_reading) AS min_usage, max(power_reading) AS max_usage FROM smartHomeData GROUP BY metro_area | 26 sec
Smart Home | 10000000 | INSERT OVERWRITE TABLE peakDeviceUsageRange SELECT house_id, (max(power_reading) - min(power_reading)) AS usage_range FROM smartHomeData WHERE is_peak = true AND metro_area = "Seattle" GROUP BY house_id | 22 sec
Smart Home | 10000000 | INSERT OVERWRITE TABLE stateAvgUsage SELECT state, avg(power_reading) AS state_avg_usage FROM smartHomeData GROUP BY state | 21 sec
Smart Home | 10000000 | INSERT OVERWRITE TABLE stateUsageDifference SELECT a2.state, (a2.state_avg_usage-a1.overall_avg) AS avg_usage_difference FROM (select avg(state_avg_usage) as overall_avg from stateAvgUsage) as a1 join stateAvgUsage as a2 | 1 sec
Wikipedia | 10000000 | INSERT INTO TABLE wikiAvgArticleLength SELECT AVG(length) as avg_article_length FROM wiki | 48 min
Wikipedia | 10000000 | INSERT INTO TABLE wikiContributorSummary SELECT contributor_username, COUNT(*) as page_count FROM wiki GROUP BY contributor_username | 1 hour 45 min
Wikipedia | 10000000 | INSERT INTO TABLE wikiTotalArticleLength SELECT SUM(length) as total_article_chars FROM wiki | 44 min
Wikipedia | 10000000 | INSERT INTO TABLE wikiTotalArticlePages SELECT COUNT(*) as total_pages FROM wiki | 1 hour 17 min

Single Node Local Clustered Setup Statistics

A fully distributed setup was tested locally with multiple JVMs, and with the following hardware infrastructure specifications.

Machine type: Laptop

RAM: 8GB

Processor: Intel(R) Core(TM) i7-3520M

Storage: Samsung SSD 850

The setup is a 2-node analyzer cluster with a MySQL database as the event store. Analyzer statistics (i.e., the time taken for each execution of the query) are given below.

Dataset | Event Count | Query Type | Time Taken
Wikipedia | 15901127 | INSERT INTO TABLE wikiContributorSummary SELECT contributor_username, COUNT(*) as page_count FROM wiki GROUP BY contributor_username | 25 min
Wikipedia | 15901127 | INSERT INTO TABLE wikiTotalArticleLength SELECT SUM(length) as total_article_chars FROM wiki | 25 min
Wikipedia | 15901127 | INSERT INTO TABLE wikiTotalArticlePages SELECT COUNT(*) as total_pages FROM wiki | 25 min
Wikipedia | 15901127 | INSERT INTO TABLE wikiAvgArticleLength SELECT AVG(length) as avg_article_length FROM wiki | 25 min

It was observed that performance here is comparatively high, taking into account that the setup consists of a single machine. This is mainly because the DAS server and MySQL run on the same host, so there are no physical network I/O delays, which allows the queries to be executed in an optimal manner.

Indexing Performance

In the following table, the shardIndexRecordBatchSize parameter indicates the amount of index data (in bytes) to be processed at a time by a shard index worker.

...


Retrieving Results

Scenario: Retrieving Process Monitoring Data via REST API


Infrastructure used

  • c4.xlarge (4 vCPUs, 7.5 GB, EBS-only, 750 Mbps network) Amazon EC2 instances as the JMeter and DAS nodes
    • Linux kernel 4.44, java version "1.8.0_131", JVM flags: -Xmx4g -Xms2g
  • c4.xlarge Amazon EC2 instance with MySQL Community Edition version 5.7 as the database node

This test was conducted on a test setup as shown in the following figure.

[Figure: test setup with JMeter, WSO2 DAS 3.1.0, and MySQL]


Using JMeter, the DAS search REST API was invoked. Eighty JMeter users sent requests in a tight loop, each request querying a single record from an event table via the DAS search API. In this experiment, the MySQL server's event table contained data loaded during previous experiments. The experiment was run for 45 minutes. An average throughput of 2,839 requests per second and an average latency of 29 ms were measured at JMeter.

API invocations per second: 2839
Average Latency (ms): 29
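A single search request of the kind JMeter issued can be reproduced with plain Java, as sketched below. The /analytics/search endpoint and JSON payload follow the DAS REST API; the host, table name, Lucene query, and credentials are assumptions, and the DAS server certificate is assumed to be trusted by the client JVM.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

public class SearchApiClient {
    public static void main(String[] args) throws Exception {
        URL url = new URL("https://das-node:9443/analytics/search");
        HttpURLConnection con = (HttpURLConnection) url.openConnection();
        con.setRequestMethod("POST");
        String credentials = Base64.getEncoder()
                .encodeToString("admin:admin".getBytes(StandardCharsets.UTF_8));
        con.setRequestProperty("Authorization", "Basic " + credentials);
        con.setRequestProperty("Content-Type", "application/json");
        con.setDoOutput(true);

        // Query a single record from a (hypothetical) event table.
        String body = "{\"tableName\": \"PROCESS_MONITORING\","
                + " \"query\": \"processId:42\", \"start\": 0, \"count\": 1}";
        try (OutputStream out = con.getOutputStream()) {
            out.write(body.getBytes(StandardCharsets.UTF_8));
        }

        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}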