Apache Spark is a powerful open-source processing engine built around speed, ease of use, and sophisticated analytics. WSO2 DAS employs Apache Spark as its analytics engine. WSO2 DAS 3.0.0 extends the latest Spark API (version 1.2.1) to provide its data analytics processor, which replaces Apache Hadoop. The ecosystem of Apache Spark is as follows.
For more information on Apache Spark, see the Apache Spark documentation.
When you set up your production environment, it is recommended to restrict access to the Spark UIs for security purposes. To do this, add ports 4040 and 8081 to the block list when configuring your firewalls.
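Once the firewall rules are in place, it can be useful to confirm that the Spark UI ports are no longer reachable from outside. The following is a minimal sketch in Python, not part of WSO2 DAS, that attempts a TCP connection to each port. The helper names `is_port_open` and `exposed_ports` and the host value `127.0.0.1` are assumptions made for illustration; replace the host with the address of your own DAS node.

```python
import socket

# Spark UI ports named in the recommendation above.
SPARK_UI_PORTS = (4040, 8081)


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def exposed_ports(host, ports=SPARK_UI_PORTS, timeout=2.0):
    """Return the subset of ports on host that still accept TCP connections."""
    return [port for port in ports if is_port_open(host, port, timeout)]


if __name__ == "__main__":
    # Placeholder host (assumption): use the address of your DAS node instead.
    host = "127.0.0.1"
    for port in SPARK_UI_PORTS:
        state = "reachable" if is_port_open(host, port) else "blocked or closed"
        print(f"{host}:{port} is {state}")
```

Run the check from a machine outside the firewall. A port reported as "blocked or closed" only shows that the connection attempt failed from that machine; it does not distinguish a firewall rule from a service that is simply not running.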
The following sections describe how you can perform batch analytics using Apache Spark SQL in WSO2 DAS.