...

  • Over 1 million TPS on Spark with 2 analyzers
  • About 1.3 million TPS with 3 analyzers

 

DAS read (GET) operations against the HBase cluster take advantage of HBase data locality, which has the potential to make these reads faster than random access.

| Query | 2 Analyzers: Time (s) | 2 Analyzers: Mean TPS | 3 Analyzers: Time (s) | 3 Analyzers: Mean TPS |
|---|---|---|---|---|
| INSERT OVERWRITE TABLE cityUsage SELECT metro_area, avg(power_reading) AS avg_usage, min(power_reading) AS min_usage, max(power_reading) AS max_usage FROM smartHomeData GROUP BY metro_area | 958.80 | 1042968.20 | 741.15 | 1349250.90 |
| INSERT OVERWRITE TABLE ct SELECT count(*) FROM smartHomeData | 953.46 | 1048806.20 | 734.99 | 1360570.13 |
| INSERT OVERWRITE TABLE peakDeviceUsageRange SELECT house_id, (max(power_reading) - min(power_reading)) AS usage_range FROM smartHomeData WHERE is_peak = true AND metro_area = "Seattle" GROUP BY house_id | 975.06 | 1025581.77 | 751.27 | 1331073.47 |
| INSERT OVERWRITE TABLE stateAvgUsage SELECT state, avg(power_reading) AS state_avg_usage FROM smartHomeData GROUP BY state | 991.08 | 1009003.34 | 783.54 | 1276265.545 |
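As a rough cross-check of the figures above, the following Python sketch recomputes two derived quantities per query: the record count implied by time × mean TPS, and the speedup from 2 to 3 analyzers. It assumes that Mean TPS is simply records processed divided by elapsed seconds; the page does not state the dataset size, so the implied record count is an inference, not a reported figure. The names ROWS and summarize are hypothetical and used only for this illustration.

```python
# Benchmark figures copied from the results table:
# (target table, time_2 analyzers, tps_2 analyzers, time_3 analyzers, tps_3 analyzers)
ROWS = [
    ("cityUsage", 958.80, 1042968.20, 741.15, 1349250.90),
    ("ct", 953.46, 1048806.20, 734.99, 1360570.13),
    ("peakDeviceUsageRange", 975.06, 1025581.77, 751.27, 1331073.47),
    ("stateAvgUsage", 991.08, 1009003.34, 783.54, 1276265.545),
]


def summarize(rows):
    """Return, per query, the implied record counts and the 2 -> 3 analyzer speedup.

    Assumption (not stated in the source): mean TPS = records / elapsed seconds,
    so records = time * TPS.
    """
    summary = {}
    for name, t2, tps2, t3, tps3 in rows:
        summary[name] = {
            "implied_records_2": t2 * tps2,  # records implied by the 2-analyzer run
            "implied_records_3": t3 * tps3,  # records implied by the 3-analyzer run
            "speedup": t2 / t3,              # how much faster 3 analyzers finished
        }
    return summary


if __name__ == "__main__":
    for name, s in summarize(ROWS).items():
        print(
            f"{name}: ~{s['implied_records_2']:.3e} records (2 analyzers), "
            f"~{s['implied_records_3']:.3e} records (3 analyzers), "
            f"speedup x{s['speedup']:.2f}"
        )
```

Under that assumption every product comes out close to 10^9, which suggests each query scanned about one billion records, and the 2-to-3 analyzer speedup falls between roughly 1.26 and 1.30 for all four queries.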

...