Page Comparison

...

Over 1 million TPS on Spark with 2 analyzers.
About 1.3 million TPS on Spark with 3 analyzers.

DAS read (GET) operations against the HBase cluster take advantage of HBase data locality, which can make them faster than random-access reads.

Query	2 Analyzers Time (s)	2 Analyzers Mean TPS	3 Analyzers Time (s)	3 Analyzers Mean TPS
`INSERT OVERWRITE TABLE cityUsage SELECT metro_area, avg(power_reading) AS avg_usage, min(power_reading) AS min_usage, max(power_reading) AS max_usage FROM smartHomeData GROUP BY metro_area`	958.80	1042968.20	741.15	1349250.90
`INSERT OVERWRITE TABLE ct SELECT count(*) FROM smartHomeData`	953.46	1048806.20	734.99	1360570.13
`INSERT OVERWRITE TABLE peakDeviceUsageRange SELECT house_id, (max(power_reading) - min(power_reading)) AS usage_range FROM smartHomeData WHERE is_peak = true AND metro_area = "Seattle" GROUP BY house_id`	975.06	1025581.77	751.27	1331073.47
`INSERT OVERWRITE TABLE stateAvgUsage SELECT state, avg(power_reading) AS state_avg_usage FROM smartHomeData GROUP BY state`	991.08	1009003.34	783.54	1276265.545
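
The table's figures can be cross-checked with a little arithmetic. The sketch below is illustrative only: it copies the times and mean TPS values from the table, labels each row by its target table name, and derives the speedup from 2 to 3 analyzers. The efficiency figure assumes that ideal scaling from 2 to 3 analyzers would be 1.5x, and the "implied records" value assumes that mean TPS multiplied by elapsed time approximates the number of records processed. Neither assumption is stated in the source. Under that reading, every row works out to roughly 10^9 records. The names `ROWS` and `summarize` are hypothetical.

```python
# Benchmark rows copied from the table above:
# (target table, time_2 [s], mean_tps_2, time_3 [s], mean_tps_3)
ROWS = [
    ("cityUsage", 958.80, 1042968.20, 741.15, 1349250.90),
    ("ct", 953.46, 1048806.20, 734.99, 1360570.13),
    ("peakDeviceUsageRange", 975.06, 1025581.77, 751.27, 1331073.47),
    ("stateAvgUsage", 991.08, 1009003.34, 783.54, 1276265.545),
]

# Assumption: perfect linear scaling from 2 to 3 analyzers would be 3/2.
IDEAL_SPEEDUP = 3 / 2


def summarize(rows):
    """Return per-query speedup, scaling efficiency, and implied record count."""
    summary = {}
    for name, time_2, tps_2, time_3, tps_3 in rows:
        speedup = time_2 / time_3
        summary[name] = {
            "speedup": speedup,
            "efficiency": speedup / IDEAL_SPEEDUP,
            # Assumption: records processed ~= mean TPS * elapsed seconds.
            "implied_records_2": tps_2 * time_2,
            "implied_records_3": tps_3 * time_3,
        }
    return summary


if __name__ == "__main__":
    for name, stats in summarize(ROWS).items():
        print(
            f"{name}: speedup {stats['speedup']:.2f}x, "
            f"efficiency {stats['efficiency']:.0%}, "
            f"implied records {stats['implied_records_2']:.3e}"
        )
```

On these numbers the third analyzer gives a speedup of about 1.26x to 1.30x per query, which is below the assumed 1.5x ideal.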

...

Versions Compared

Old Version 7

New Version 8
