This section summarizes the results of performance tests carried out on the minimum fully distributed DAS deployment, with RDBMS (MySQL) and HBase event stores tested separately.

DAS Performance Test Round 1: HBase

...

HBase Cluster

This test involved setting up a 10-node HBase cluster with HDFS as the underlying file system.

Infrastructure used

  • 3 DAS nodes (variable roles: publisher, receiver, analyzer, and indexer): c4.2xlarge
  • 1 HBase master and Hadoop NameNode: c3.2xlarge
  • 9 HBase RegionServers and Hadoop DataNodes: c3.2xlarge

...


Persisting 1 billion events from the Smart Home DAS Sample

...

The published data occupied around 950 GB on the Hadoop file system, taking HDFS-level replication into account.


 


  • Events: 1,000,000,000
  • Time taken: 10,391.768 seconds
  • Mean TPS: 96,230.01591

 


Persisting the entire Wikipedia corpus

This test involved publishing the entire Wikipedia dataset, where a single event comprises one Wikipedia article (16.8 million articles in total). Events vary greatly in size, with a mean of approximately 3.5 KB. Here, a mean throughput of around 9,000 TPS was observed.

 


 


  • Events: 16,753,779
  • Time taken: 1,862.901 seconds
  • Mean TPS: 8,993.381291

...


Running Spark queries on the 1 billion published events

...

The DAS GET operations on HBase take advantage of HBase data locality, which can make them considerably faster than random access.
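For context, each table referenced in the queries below is exposed to Spark through a mapping onto the underlying event store, using the CarbonAnalytics relation provider of the DAS Spark query language. The following is a minimal sketch of such a mapping for smartHomeData; the physical table name and the exact schema are assumptions inferred from the queries in this document, not taken from the test configuration.

-- Minimal sketch: map the Spark-side table onto the DAS event store.
-- The tableName and schema values here are assumptions inferred from the
-- queries in this document; the actual Smart Home sample may differ.
CREATE TEMPORARY TABLE smartHomeData
USING CarbonAnalytics
OPTIONS (tableName "SMART_HOME_DATA",
         schema "house_id INT, metro_area STRING, state STRING, power_reading FLOAT, is_peak BOOLEAN");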

The time taken and mean TPS for each query, with two and three analyzer nodes:

  • INSERT OVERWRITE TABLE cityUsage SELECT metro_area, avg(power_reading) AS avg_usage, min(power_reading) AS min_usage, max(power_reading) AS max_usage FROM smartHomeData GROUP BY metro_area
    2 analyzers: 958.80 seconds, mean TPS 1,042,968.20
    3 analyzers: 741.15 seconds, mean TPS 1,349,250.90
  • INSERT OVERWRITE TABLE ct SELECT count(*) FROM smartHomeData
    2 analyzers: 953.46 seconds, mean TPS 1,048,806.20
    3 analyzers: 734.99 seconds, mean TPS 1,360,570.13
  • INSERT OVERWRITE TABLE peakDeviceUsageRange SELECT house_id, (max(power_reading) - min(power_reading)) AS usage_range FROM smartHomeData WHERE is_peak = true AND metro_area = "Seattle" GROUP BY house_id
    2 analyzers: 975.06 seconds, mean TPS 1,025,581.77
    3 analyzers: 751.27 seconds, mean TPS 1,331,073.47
  • INSERT OVERWRITE TABLE stateAvgUsage SELECT state, avg(power_reading) AS state_avg_usage FROM smartHomeData GROUP BY state
    2 analyzers: 991.08 seconds, mean TPS 1,009,003.34
    3 analyzers: 783.54 seconds, mean TPS 1,276,265.545


Running Spark queries on the Wikipedia corpus

The time taken and mean TPS for each query, with two and three analyzer nodes:

  • INSERT INTO TABLE wikiAvgArticleLength SELECT AVG(length) as avg_article_length FROM wiki
    2 analyzers: 222.70 seconds, mean TPS 75,234.03
    3 analyzers: 167.27 seconds, mean TPS 100,164.18
  • INSERT INTO TABLE wikiTotalArticleLength SELECT SUM(length) as total_article_chars FROM wiki
    2 analyzers: 221.74 seconds, mean TPS 75,554.76
    3 analyzers: 166.92 seconds, mean TPS 100,373.80
  • INSERT INTO TABLE wikiTotalArticlePages SELECT COUNT(*) as total_pages FROM wiki
    2 analyzers: 221.80 seconds, mean TPS 75,536.05
    3 analyzers: 166.14 seconds, mean TPS 100,842.18
  • INSERT INTO TABLE wikiContributorSummary SELECT contributor_username, COUNT(*) as page_count FROM wiki GROUP BY contributor_username
    2 analyzers: 236.11 seconds, mean TPS 70,958.52
    3 analyzers: 181.42 seconds, mean TPS 92,350.26
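The wiki table referenced above is mapped in the same way as smartHomeData. The sketch below is illustrative only: the physical table name and schema are assumptions inferred from the fields the queries reference (length, contributor_username), and the actual sample may define additional fields.

-- Hypothetical mapping for the Wikipedia event table; the tableName and
-- schema are assumptions inferred from the queries above.
CREATE TEMPORARY TABLE wiki
USING CarbonAnalytics
OPTIONS (tableName "WIKI",
         schema "title STRING, contributor_username STRING, length INT");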


DAS Performance Test Round 2: RDBMS

Infrastructure used

  • c4.2xlarge Amazon EC2 instances as the DAS nodes
  • One DAS node as the publisher
  • A c3.2xlarge Amazon instance as the database node

Receiver node Data Persistence Performance

A reduction in throughput is observed in both DAS receiver nodes after 1,200,000 events, as shown below. This reduction is caused by limitations of MySQL. The graph below shows the receiver performance variation of the second node in the two-node receiver cluster. Because the initial filling of the receiver queue buffers yields a very high receive rate at the start of event publishing, only the event rate after the first 1,200,000 events was considered for the graph.

[Graph: event receive rate of the second receiver node, after the first 1,200,000 events]


Info: MySQL Breakpoint

After around 30 million events are published, a sudden drop can be observed in receiver performance; this can be considered the breaking point of the MySQL event store. Use another type of event store, such as the HBase event store, when receiver performance must remain unchanged.


Testing with large events

The following results were obtained by testing the two-node HA (High Availability) DAS cluster with 10 million events published via the Analyzing Wikipedia Data sample. Each event in this sample is several kilobytes in size, representing large events.

[Graph: total events published per second (TPS)]

In the above graph, TPS represents the total number of events published per second. This stabilizes at about 8,500 events per second.

[Graph: amount of data published per second (data rate, MB/s)]

The above graph shows the amount of data published per second (referred to as the data rate). The data rate is significantly lower in the initial stages due to the flow control mechanisms of the receiver, and it stabilizes at around 25 MB per second.


With MySQL RDBMS event store

DAS data persistence was measured by publishing to two load-balanced receiver nodes with a MySQL database.

  • Smart Home sample: 100,000,000 events at a mean rate of 5,741 events per second
  • Wikipedia sample: 15,901,127 events at a mean rate of 4,438 events per second


Analyzer Performance

The following topics describe the analyzer performance of WSO2 DAS.

With MySQL RDBMS event store

Spark analyzing performance (time to complete execution) was measured using a two-node DAS analyzer cluster with a MySQL database.

The time taken for each type of Spark query is given below.

  • Smart Home, 10,000,000 events: 26 sec
    INSERT OVERWRITE TABLE cityUsage SELECT metro_area, avg(power_reading) AS avg_usage, min(power_reading) AS min_usage, max(power_reading) AS max_usage FROM smartHomeData GROUP BY metro_area
  • Smart Home, 10,000,000 events: 22 sec
    INSERT OVERWRITE TABLE peakDeviceUsageRange SELECT house_id, (max(power_reading) - min(power_reading)) AS usage_range FROM smartHomeData WHERE is_peak = true AND metro_area = "Seattle" GROUP BY house_id
  • Smart Home, 10,000,000 events: 21 sec
    INSERT OVERWRITE TABLE stateAvgUsage SELECT state, avg(power_reading) AS state_avg_usage FROM smartHomeData GROUP BY state
  • Smart Home, 10,000,000 events: 1 sec
    INSERT OVERWRITE TABLE stateUsageDifference SELECT a2.state, (a2.state_avg_usage-a1.overall_avg) AS avg_usage_difference FROM (select avg(state_avg_usage) as overall_avg from stateAvgUsage) as a1 join stateAvgUsage as a2
  • Wikipedia, 10,000,000 events: 48 min
    INSERT INTO TABLE wikiAvgArticleLength SELECT AVG(length) as avg_article_length FROM wiki
  • Wikipedia, 10,000,000 events: 1 hour 45 min
    INSERT INTO TABLE wikiContributorSummary SELECT contributor_username, COUNT(*) as page_count FROM wiki GROUP BY contributor_username
  • Wikipedia, 10,000,000 events: 44 min
    INSERT INTO TABLE wikiTotalArticleLength SELECT SUM(length) as total_article_chars FROM wiki
  • Wikipedia, 10,000,000 events: 1 hour 17 min
    INSERT INTO TABLE wikiTotalArticlePages SELECT COUNT(*) as total_pages FROM wiki
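Each INSERT OVERWRITE / INSERT INTO target above must itself be mapped in the Spark environment before the query runs. The following is an illustrative sketch for the cityUsage summary table; the physical table name and schema are assumptions inferred from the SELECT list of the corresponding query, not taken from the test configuration.

-- Hypothetical mapping for the cityUsage summary table; the tableName and
-- schema are assumptions inferred from the corresponding query's SELECT list.
CREATE TEMPORARY TABLE cityUsage
USING CarbonAnalytics
OPTIONS (tableName "CITY_USAGE",
         schema "metro_area STRING, avg_usage FLOAT, min_usage FLOAT, max_usage FLOAT");

-- The summarization query can then write its results into the mapped table:
INSERT OVERWRITE TABLE cityUsage
SELECT metro_area, avg(power_reading) AS avg_usage,
       min(power_reading) AS min_usage, max(power_reading) AS max_usage
FROM smartHomeData GROUP BY metro_area;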



Single Node Local Clustered Setup Statistics

...

The setup is a two-node analyzer cluster with a MySQL database as the event store. Analyzer statistics (i.e., the duration of each query execution) are given below.

  • Wikipedia, 15,901,127 events: 25 min
    INSERT INTO TABLE wikiContributorSummary SELECT contributor_username, COUNT(*) as page_count FROM wiki GROUP BY contributor_username
  • Wikipedia, 15,901,127 events: 25 min
    INSERT INTO TABLE wikiTotalArticleLength SELECT SUM(length) as total_article_chars FROM wiki
  • Wikipedia, 15,901,127 events: 25 min
    INSERT INTO TABLE wikiTotalArticlePages SELECT COUNT(*) as total_pages FROM wiki
  • Wikipedia, 15,901,127 events: 25 min
    INSERT INTO TABLE wikiAvgArticleLength SELECT AVG(length) as avg_article_length FROM wiki

Performance here was observed to be comparatively high, considering that the setup consists of a single machine. This is mainly because the DAS server and MySQL run on the same host, so query execution incurs no physical network I/O delays and can proceed optimally.

...

In the following results, shardIndexRecordBatchSize indicates the amount of index data processed at a time by a shard index worker; it is configured in bytes (e.g., 20 MB = 20,971,520 bytes).

  • Standalone, Wikipedia, shardIndexRecordBatchSize 10 MB, replication factor N/A: 15,901,127 events in 7,975 seconds (average TPS 1,993.871724)
  • Standalone, Wikipedia, shardIndexRecordBatchSize 20 MB, replication factor N/A: 15,901,127 events in 6,765 seconds (average TPS 2,350.499187)
  • Standalone, Smart Home, shardIndexRecordBatchSize 20 MB, replication factor N/A: 20,000,000 events in 1,385 seconds (average TPS 14,440.43321)
  • Minimum fully distributed, Wikipedia, shardIndexRecordBatchSize 20 MB, replication factor 1: 15,901,127 events in 6,870 seconds (average TPS 2,314.574527)
  • Minimum fully distributed, Wikipedia, shardIndexRecordBatchSize 20 MB, replication factor 0: 15,901,127 events in 7,280 seconds (average TPS 2,184.220742)