...

Event Ingestion with Persistence

HBase Event Store

This test involved setting up a 10-node HBase cluster with HDFS as the underlying file system.

...

Scenario: Persisting 1 billion events from the Smart Home DAS Sample


This test was designed to exercise the data layer during sustained event publication. During testing, throughput hovered around the 150K TPS mark, and the memstore flushes of the HBase cluster (which suspend all writes) and minor compaction operations brought it down in bursts. Overall, a mean of 96K TPS was achieved, but a steady rate of around 100-150K TPS is achievable with flow control, as opposed to the current no-flow-control situation.
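
As an illustration of the flow control suggested above, publication can be throttled on the client side so that writes arrive at a steady rate instead of in bursts that collide with memstore flushes. The following is a minimal sketch, not the mechanism used in this test; it assumes Guava's RateLimiter, and publishEvent() is a hypothetical stand-in for the actual Thrift publish call.

    import com.google.common.util.concurrent.RateLimiter;

    public class ThrottledPublisher {

        // Hypothetical target: hold a steady ~100K events/s instead of
        // publishing in uncontrolled bursts.
        private static final double TARGET_TPS = 100_000.0;

        public static void main(String[] args) {
            RateLimiter limiter = RateLimiter.create(TARGET_TPS);
            for (long i = 0; i < 1_000_000_000L; i++) {
                limiter.acquire();   // blocks until a permit is available
                publishEvent(i);     // placeholder for the real Thrift publish call
            }
        }

        private static void publishEvent(long seq) {
            // In the actual test, a customized Thrift client published the event here.
        }
    }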

...

Events: 16753779
Time (s): 1862.901
Mean TPS: 8993.381291


MSSQL Event Store

Infrastructure used

  • c4.2xlarge Amazon EC2 instances as the DAS nodes
  • db.m4.2xlarge Amazon RDS instance as the database node
  • Customized Thrift client as the data publisher (Thrift producer found in samples; a publishing sketch follows this list)
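
For reference, event publication in such tests typically goes through the databridge agent that ships with the DAS samples. The following is a minimal sketch under that assumption; the endpoint host, credentials, stream name, and payload values are hypothetical and must match the stream definition actually deployed in DAS.

    import org.wso2.carbon.databridge.agent.DataPublisher;
    import org.wso2.carbon.databridge.commons.utils.DataBridgeCommonsUtils;

    public class SampleThriftProducer {

        public static void main(String[] args) throws Exception {
            // Thrift endpoint of a DAS receiver node (hypothetical host, default port and credentials).
            DataPublisher publisher = new DataPublisher("tcp://10.0.0.1:7611", "admin", "admin");

            // Hypothetical stream; name and version must match the deployed stream definition.
            String streamId = DataBridgeCommonsUtils.generateStreamId(
                    "org.wso2.das.sample.process.monitoring", "1.0.0");

            // Meta, correlation, and payload attribute arrays must mirror the stream definition.
            publisher.publish(streamId, null, null,
                    new Object[]{"process-1", "RUNNING", System.currentTimeMillis()});

            publisher.shutdown();
        }
    }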

Scenario: Persisting 30 million Process Monitoring events on MSSQL

Receiver node Data Persistence Performance

This test involved persisting 30 million Process Monitoring events on the MSSQL event store through the receiver nodes.


MySQL Event Store

Infrastructure used

  • c4.2xlarge Amazon EC2 instances as the DAS nodes
  • One DAS node as the publisher
  • A db.m4.2xlarge Amazon RDS instance as the database node

Receiver node Data Persistence Performance


...

  • Customized Thrift client as the data publisher (Thrift producer found in samples)

Scenario: Persisting 12 million Process Monitoring events on MySQL


Batch Analytics

Scenario: Running Spark queries on the 1 billion published events

Spark queries from the Smart Home DAS sample were executed against the published data, with the analyzer node count kept at 2 and 3 for two separate tests. The Spark JVMs were provided with the following settings during the test. In the results below, mean TPS corresponds approximately to the event count (1 billion) divided by the query execution time.

...

Query: INSERT OVERWRITE TABLE cityUsage SELECT metro_area, avg(power_reading) AS avg_usage, min(power_reading) AS min_usage, max(power_reading) AS max_usage FROM smartHomeData GROUP BY metro_area
  • 2 Analyzers: 958.80 s, mean TPS 1042968.20
  • 3 Analyzers: 741.15 s, mean TPS 1349250.90

Query: INSERT OVERWRITE TABLE ct SELECT count(*) FROM smartHomeData
  • 2 Analyzers: 953.46 s, mean TPS 1048806.20
  • 3 Analyzers: 734.99 s, mean TPS 1360570.13

Query: INSERT OVERWRITE TABLE peakDeviceUsageRange SELECT house_id, (max(power_reading) - min(power_reading)) AS usage_range FROM smartHomeData WHERE is_peak = true AND metro_area = "Seattle" GROUP BY house_id
  • 2 Analyzers: 975.06 s, mean TPS 1025581.77
  • 3 Analyzers: 751.27 s, mean TPS 1331073.47

Query: INSERT OVERWRITE TABLE stateAvgUsage SELECT state, avg(power_reading) AS state_avg_usage FROM smartHomeData GROUP BY state
  • 2 Analyzers: 991.08 s, mean TPS 1009003.34
  • 3 Analyzers: 783.54 s, mean TPS 1276265.545

Scenario: Running Spark queries on the Wikipedia corpus

Query: INSERT INTO TABLE wikiAvgArticleLength SELECT AVG(length) as avg_article_length FROM wiki
  • 2 Analyzers: 222.70 s, mean TPS 75234.03
  • 3 Analyzers: 167.27 s, mean TPS 100164.18

Query: INSERT INTO TABLE wikiTotalArticleLength SELECT SUM(length) as total_article_chars FROM wiki
  • 2 Analyzers: 221.74 s, mean TPS 75554.76
  • 3 Analyzers: 166.92 s, mean TPS 100373.80

Query: INSERT INTO TABLE wikiTotalArticlePages SELECT COUNT(*) as total_pages FROM wiki
  • 2 Analyzers: 221.80 s, mean TPS 75536.05
  • 3 Analyzers: 166.14 s, mean TPS 100842.18

Query: INSERT INTO TABLE wikiContributorSummary SELECT contributor_username, COUNT(*) as page_count FROM wiki GROUP BY contributor_username
  • 2 Analyzers: 236.11 s, mean TPS 70958.52
  • 3 Analyzers: 181.42 s, mean TPS 92350.26

...

DAS Performance Test Round 3: RDBMS (MySQL)

Infrastructure used

...

Receiver node Data Persistence Performance

...