This guide provides a quick introduction to using WSO2 Data Analytics Server (DAS).
About WSO2 DAS
WSO2 Data Analytics Server 3.0.0 introduces a single solution for building systems and applications that collect and analyze both realtime and persisted data, and communicate the results. It combines realtime, batch, interactive, and predictive (via machine learning) analysis of data into one integrated platform to support the multiple demands of Internet of Things (IoT) solutions, as well as mobile and Web apps. It is designed to analyze millions of events per second, and is therefore capable of handling large volumes of data in Big Data and Internet of Things projects. The WSO2 DAS workflow consists of three main phases as illustrated in the diagram below.
About this guide
This introductory guide demonstrates the steps required to get a simple scenario working based on the WSO2 DAS workflow. In this guide, a collection of events (a CSV file) containing records collected from smart plug sensors in households is used to publish (simulate) events to WSO2 DAS for data collection. The guide then demonstrates calculating plug usage per household on the collected data for batch analytics, calculating the average, minimum, and maximum values of the data inflow for realtime analytics, and performing an ad hoc Apache Lucene query on the data for interactive analytics. Finally, it communicates the results through dashboards.
Getting started
Set up the following prerequisites before you begin.
- Set up the appropriate general prerequisite applications before you start. For information on the general prerequisites, see Installation Prerequisites.
- Download WSO2 Data Analytics Server. For instructions, see Downloading the Product.
- Install the product by setting the JAVA_HOME environment variable and other system properties. For instructions, see Installing the Product.
- Start the DAS by navigating to <DAS_HOME>/bin/ on the command line, and executing wso2server.bat (for Windows) or wso2server.sh (for Linux). For instructions, see Running the Product.
Deploying the sample C-App
You can deploy artifacts (i.e. event streams, event receivers, Spark scripts, event publishers, dashboards, etc.) as composite Carbon Applications (C-Apps) in WSO2 DAS. This guide uses the SMART_HOME.car file as the toolbox which contains all the artifacts required for this guide in a single package. For more information on C-Apps, see Carbon Application Deployment for DAS. Follow the steps below to deploy and use a sample C-App in WSO2 DAS.
- Log in to the DAS management console using the following URL: https://<DAS_HOST>:<DAS_PORT>/carbon/
- Click Main, and then click Add in the Carbon Applications menu.
- Click Choose File, and upload the <DAS_HOME>/samples/capps/Smart_Home.car file as shown below.
- Click Main, then click Carbon Applications, and then click List view, to see the uploaded Carbon application as shown below.
You can use the Event Flow feature of WSO2 DAS to visualize how the components that you created above are connected with each other. You can also use it for verification purposes, i.e. to validate the flow of events within DAS, as shown below.
Publishing events
Once you develop the complete event flow, you can test it by publishing events to DAS. There are several methods of publishing to DAS; this guide uses event simulation via the Event Simulator.
Event Simulator is a tool that can be used for monitoring and debugging event streams. You can use this tool to simulate events by creating event(s) with values assigned to event stream attributes. Follow the steps below to perform event simulation to publish data to WSO2 DAS.
- Log in to the DAS Management Console, if you are not already logged in.
Click Tools, and then click Event Simulator.
Download the sample.csv file which contains a set of event records collected from household ‘smart plug’ sensors.
In the Send multiple events option, click Choose File.
Select the sample.csv file which you downloaded, and click Upload.
Click OK in the pop-up message which indicates successful uploading of the CSV file, and refresh the page to view the uploaded file, which is displayed as shown below.
Click Configure, and enter the details as shown below.
Select SMARTHOME_DATA:1.0.0 for Select the target event stream, and type a comma in the provided text field for Field delimiter.
- File Name: sample.csv
- Select the target event stream: SMARTHOME_DATA:1.0.0
- Field delimiter: , (a comma)
- Delay between events in milliseconds: 1000

Click Configure, and then click OK in the message which pops up. As a result of this configuration, the Play button appears for the sample.csv file in the Event Stream Simulator page as shown below.
Click Play to start simulating the events in the uploaded file to publish events as shown below.
Viewing the output
Follow the steps below to view the presentation of the output in the Analytics Dashboard.
Log in to the Management console, if you are not already logged in.
Click Main, and then click Analytics Dashboard in the Dashboard menu.
Log in to the Analytics Dashboard using admin/admin credentials.
Click the DASHBOARDS button in the top menu. You view the dashboard deployed by the C-App as shown below.
Click the View button of the corresponding Dashboard.
Click the Home button in the top menu of the Smart Home dashboard. You view the usage statistics output of the processed events in a line chart as shown below.
Click Plug_Usage in the top menu of the Power_Dashboard. You view the plug usage data output of the processed events in a bar chart as shown below.
Follow the steps below to undeploy the C-App, which you uploaded earlier in this section, before proceeding to the next sections.
- Log in to the DAS Management Console using admin/admin credentials, if you are not already logged in.
- Click Main, then click Carbon Applications, and then click List view, to see the uploaded Carbon application.
- Click the Delete option to delete the Carbon application as shown below.
- Refresh the Web browser screen, and check that the SMART_HOME.car file has been removed from the list of all available C-Apps.
WSO2 DAS basics
The following sections provide detailed instructions on the main functionalities of WSO2 DAS which you performed above using the C-App.
You can perform data collection in WSO2 DAS as described in the next Collecting data section below.
Collecting data
The first step of the WSO2 DAS workflow is to collect data. In the data collection process, you first need to create the event stream definition. An event is a unit of data collection, and an event stream is a sequence of events of a particular type which consists of a set of unique attributes.
WSO2 DAS exposes a single API for external data sources to publish events to it, and provides configurable options to process the event stream inflow in memory for realtime analytics, persist it in data storage for batch analytics, and/or index it for interactive analytics.
Creating the event stream
The first step in collecting data is defining an event stream by creating the event stream definition. The defined stream provides the structure required to process the events. For more information on event streams, see Event Streams. Follow the steps below to create the event stream.
- Log in to the DAS Management Console using the following URL and admin/admin credentials: https://<DAS_HOST>:<DAS_PORT>/carbon/
- Click Main, and then click Streams.
Click Add Event Stream, and enter the details as shown below.
Event Stream Details
- Event Stream Name: SMARTHOME_DATA
- Event Stream Version: 1.0.0

Payload Data Attributes (click Add after entering each attribute name and attribute type):
- id: string
- value: float
- property: bool
- plug_id: int
- household_id: int
- house_id: int

- Click Add Event Stream. The new event stream is added to the list of all available event streams as shown below.
Persisting the event stream
Events received by DAS can be processed in realtime and/or in batch mode. To process events in batch mode, you need to persist the event information; if you process events only in realtime, you do not need to persist them. For persisted events, configurable options are provided to index the data.
WSO2 DAS introduces a pluggable architecture which allows you to persist events in any relational data storage (Oracle, MSSQL, MySQL, etc.) or NoSQL storage (Apache HBase and Apache Cassandra). Using multiple event storages together is also possible. For example, incoming events can be stored in a NoSQL storage while the processed events are stored in a relational data storage.
Follow the steps below to persist received events.
For persisted events, configurable options are provided to index the data which is required for Interactive Analytics later in the example.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Streams.
- Click the Edit option of the corresponding event stream as shown below.
- Click Next[Persist Event].
- Select the Persist Event Stream check box. As a result, the Persist Attribute check boxes are selected for all the attributes. Then select the Index Column check box for the house_id attribute as shown below.
- Click Save Event Stream.
- Click Yes in the pop-up message as shown below.
- You view the persisted event stream added to the list of all available event streams as shown below.
Creating the event receiver
Once you define the event stream and configure how it should be used, you need to create event receivers to connect WSO2 DAS with different data sources.
WSO2 DAS supports event retrieval over many transport protocols and in different formats. For information on supported transport protocols and type formats, see Configuring Event Receivers. Follow the steps below to create an event receiver of the WSO2Event type for this guide.
A WSO2Event event receiver is used to receive events in the WSO2Event format via Thrift or binary protocols. For more information, see WSO2Event Event Receiver.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Receivers.
Click Add Event Receiver, and enter the details as shown below.
- Event Receiver Name: DATA_RECEIVER
- Input Event Adapter Type: wso2event
- Event Stream: SMARTHOME_DATA:1.0.0
- Message Format: wso2event
- Click Add Event Receiver. You view the new event receiver added to the list of all available event receivers as shown below.
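With the receiver in place, an external Java client can push events to it over Thrift using the DataBridge agent library. The following is a minimal sketch, not part of this guide's steps: the receiver URL, port, credentials, and trust-store path are assumptions that depend on your setup.

    import org.wso2.carbon.databridge.agent.DataPublisher;
    import org.wso2.carbon.databridge.commons.Event;

    public class SmartHomeClient {
        public static void main(String[] args) throws Exception {
            // Trust store for the TLS handshake with DAS (path is an assumption)
            System.setProperty("javax.net.ssl.trustStore",
                    "/path/to/DAS_HOME/repository/resources/security/client-truststore.jks");
            System.setProperty("javax.net.ssl.trustStorePassword", "wso2carbon");

            // 7611 is the default Thrift data receiver port; adjust to your setup
            DataPublisher publisher = new DataPublisher("tcp://localhost:7611", "admin", "admin");

            // Payload values must follow the stream's attribute order:
            // id, value, property, plug_id, household_id, house_id
            Event event = new Event("SMARTHOME_DATA:1.0.0", System.currentTimeMillis(),
                    null, null, new Object[]{"sensor-1", 26.5f, false, 1, 1, 39});

            publisher.publish(event);
            publisher.shutdown();
        }
    }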
Creating another event stream
The SMARTHOME_DATA event stream you have already created serves as the input stream in this scenario. Events of this stream need to be forwarded to another stream once they are processed, in order to be published. Follow the steps below to add the output event stream to which the processed data is forwarded.
- Log in to the DAS Management Console if you are not already logged in.
- Click Main, and then click Streams.
- Click Add Event Stream, and enter the details as shown below.
Event Stream Details
- Event Stream Name: usageStream
- Event Stream Version: 1.0.0

Payload Data Attributes (click Add after entering each attribute name and attribute type):
- house_id: int
- maxVal: float
- minVal: float
- avgVal: double
- currentTime: string

Click Next[Persist Event].
Select the Persist Event Stream check box. As a result, the Persist Attribute check boxes are selected for all the attributes as shown below.
- Click Add Event Stream. The new event stream is added to the list of all available event streams as shown below.
Creating an event publisher
Once the events are processed, event publishers are used to publish the results to external systems for taking further action. Event publishers provide the capability to send event notifications and alerts from WSO2 DAS to external systems. Follow the steps below to create a new event publisher.
- Log in to the DAS Management Console, if you are not already logged in.
Click Main, and then click Publishers in the Event menu.
Click Add Event Publisher, and enter the following details as shown below.
- Event Publisher Name: DATA_PUBLISHER
- Event Source: usageStream:1.0.0
- Output Event Adapter Type: logger
- Message Format: text

Click Add Event Publisher. You view the new event publisher added to the list of all event publishers as shown below.
Since you created a logger type event publisher, the output is written to the CLI.
Publishing events
Event Simulator is a tool which you can use to publish events to event streams. You simulate events by creating them with values assigned to the event stream attributes. For more information, see Publishing Data Using Event Simulation. Follow the steps below to perform event simulation.
- Log in to the DAS Management Console, if you are not already logged in.
Click Tools, and then click Event Simulator.
Download the sample.csv file which contains a set of event records collected from household ‘smart plug’ sensors.
If you have already uploaded the sample.csv file, you can skip steps 4 to 6 below.
In the Send multiple events option, click Choose File.
Select the sample.csv file which you downloaded, and click Upload.
Click OK in the pop-up message which indicates successful uploading of the CSV file, and refresh the page to view the uploaded file, which is displayed as shown below.
Click Configure, and enter the details as shown below.
Select SMARTHOME_DATA:1.0.0 for Select the target event stream, and type a comma in the provided text field for Field delimiter.
- File Name: sample.csv
- Select the target event stream: SMARTHOME_DATA:1.0.0
- Field delimiter: , (a comma)
- Delay between events in milliseconds: 1000

Click Configure, and then click OK in the message which pops up.
Click Play to start simulating the events in the uploaded file to publish events as shown below.
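Each row in the CSV file maps to one event, with comma-separated values in the order of the stream's payload attributes (id, value, property, plug_id, household_id, house_id). The rows below are illustrative only, not taken from the actual sample.csv file:

    sensor-1,26.5,false,1,1,39
    sensor-2,31.2,false,2,1,39
    sensor-1,15.8,false,1,2,40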
You can analyze the data received by WSO2 DAS as described in the next Analyzing data section below.
Analyzing data
You can configure any event stream received by WSO2 DAS to perform batch, realtime, and/or interactive analytics as described in the sections below. The first section demonstrates how to perform batch analytics.
Batch analytics
You can perform batch analytics when event streams are configured to be persisted for later batch processing scenarios such as data aggregation, summarization, etc. The WSO2 DAS batch analytics engine is powered by Apache Spark, which accesses the underlying data storage and executes programs to process the event data. DAS provides an SQL-like query language to create the jobs through scripts that are then executed. For more information, see Data Analysis. You can perform batch analytics either using Spark analytics scripts or using the Spark Console as described below.
You need to follow instructions in the Collecting data section (i.e. create the event stream, persist it, create the event receiver, and publish events to WSO2 DAS), before performing the following batch analytics operations.
Using the analytics script
Follow the steps below to create the analytics script.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Scripts in the Batch Analytics menu.
- Click Add New Analytics Script.
Enter BATCH_ANALYTICS_SCRIPT in the Script Name parameter. Enter the following Spark SQL script in the Spark SQL Queries parameter as shown below.

    CREATE TEMPORARY TABLE homeData
    USING CarbonAnalytics
    OPTIONS (tableName "SMARTHOME_DATA",
             schema "id STRING, value FLOAT, property BOOLEAN, plug_id INT, household_id INT, house_id INT");

    CREATE TEMPORARY TABLE plugUsage
    USING CarbonAnalytics
    OPTIONS (tableName "plug_usage",
             schema "house_id INT, household_id INT, plug_id INT, usage FLOAT -sp");

    INSERT OVERWRITE TABLE plugUsage
    SELECT house_id, household_id, plug_id, max(value) - min(value) AS usage
    FROM homeData
    WHERE property = false
    GROUP BY house_id, household_id, plug_id;

    SELECT * FROM plugUsage WHERE usage > 300;
The above script does the following:
- Loads data from the DAS Data Access Layer (DAL), and registers temporary tables in the Spark environment.
- Performs batch processing by calculating the usage value (as the difference between the max and min values), grouped by house_id, household_id, and plug_id, from the data of the temporary homeData table.
- Writes the results back to a new DAL table named plug_usage.
- Executes the following query: SELECT * FROM plugUsage WHERE usage > 300
- Click Execute Script to check the validity of the script. You view the results as shown below.
- Click Add. You view the new script added to the list of all available scripts as shown below.
Using the Spark Console
You can also execute ad hoc Spark SQL queries against the registered tables through an interactive Web console named the Spark Console. Follow the steps below to perform a batch analytics operation using the Spark Console.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Console in the Batch Analytics menu.
Enter the following Spark SQL query in the console, and press the Enter key.

    SELECT * FROM plugUsage WHERE usage > 100
You view the output as shown below.
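The console accepts any Spark SQL query against the registered temporary tables. For example, an illustrative aggregation of the total usage per house could look like this:

    SELECT house_id, SUM(usage) AS totalUsage
    FROM plugUsage
    GROUP BY house_id;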
You can perform realtime analytics in WSO2 DAS as described in the next Realtime analytics section below.
Realtime analytics
The realtime analytics engine uses a set of queries or rules, written in the SQL-like Siddhi Query Language and defined in an execution plan, to process multiple event streams in realtime. An execution plan consists of a set of queries and import and export streams. It stores the event processing logic that is bound to an instance of the server runtime, and acts as the editor for defining that logic. For more information, see Working with Execution Plans. You can perform realtime analytics using the same event stream which you used for batch analytics, by processing the event stream inflow through the WSO2 realtime analytics engine as described below.
You need to follow instructions in the Collecting data section (i.e. create the event stream, persist it, create the event receiver, create the event publisher, and publish events to WSO2 DAS), before performing the following realtime analytics operations.
Creating the execution plan
Follow the steps below to create an execution plan.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Execution Plans in the Streaming Analytics menu.
Click Add Execution Plan, and enter the following details as shown below.
In the below execution plan, the realtime engine (Siddhi) collects the incoming events within a one-minute time window from the filtered input stream, calculates the average, maximum, and minimum values in realtime grouped by house_id, and sends the processed event to another newly defined stream.
- Select SMARTHOME_DATA:1.0.0 for Import Stream, enter inputStream for As, and click Import.
- Add the following query at the end of the provided space.

    from inputStream[value > 0]#window.time(1 min)
    select house_id, max(value) as maxVal, min(value) as minVal,
           avg(value) as avgVal, time:currentTime() as currentTime
    group by house_id
    insert current events into usageStream;
- Select usageStream:1.0.0 for StreamID of Export Stream, and click Export as shown below.
- Click Validate Query Expressions to validate the execution plan.
- Click Add Execution Plan. You view the new execution plan added to the list of all available execution plans as shown below.
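Put together, the saved execution plan should look roughly like the sketch below. The @Import/@Export annotations and stream definitions are generated by the UI when you import and export the streams, so the exact formatting may differ:

    @Import('SMARTHOME_DATA:1.0.0')
    define stream inputStream (id string, value float, property bool, plug_id int, household_id int, house_id int);

    @Export('usageStream:1.0.0')
    define stream usageStream (house_id int, maxVal float, minVal float, avgVal double, currentTime string);

    from inputStream[value > 0]#window.time(1 min)
    select house_id, max(value) as maxVal, min(value) as minVal,
           avg(value) as avgVal, time:currentTime() as currentTime
    group by house_id
    insert current events into usageStream;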
Publish events to WSO2 DAS by simulating events. For instructions, see Publishing events.
View the output logged with the published events in the CLI on which you ran WSO2 DAS as shown below.
You can perform interactive analytics in WSO2 DAS as described in the next Interactive analytics section below.
Interactive analytics
Interactive analytics is used to retrieve fast results through ad hoc querying of received and processed data. This is possible in DAS when you select to index event stream attributes. You can obtain fast results by executing ad hoc queries on the indexed attributes through the provided Data Explorer.
You need to follow the instructions in the Collecting data section (i.e. create the event stream, persist it, create the event receiver, and publish events to WSO2 DAS), before performing the following interactive analytics operations. The house_id field, which you indexed in step 5 of the data collection process, is used to search for data that matches the specified query.
Using the Data Explorer
The Data Explorer is the Web console for searching analytical data. Primary key, data range, and facet search options are available for simple analytical record searches. It is also possible to search records by providing Lucene queries for advanced searches. Follow the steps below to perform an interactive analytics operation using the Data Explorer.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Data Explorer in the Interactive Analytics menu.
- Select SMARTHOME_DATA for the Table Name parameter.
- Select the By Query option, and enter house_id:39 in the search field as shown below.
- Click Search. You view the output as shown below.
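The By Query option accepts Apache Lucene query syntax on the indexed attributes. For example, assuming house_id is indexed as configured earlier, both of the following are valid queries (the second is an illustrative range search):

    house_id:39
    house_id:[30 TO 40]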
After performing analytics, you can communicate the results in WSO2 DAS as described in the next Communicating results section below.
Communicating results
The final step in the event flow is to visualize the data. WSO2 DAS uses several presentation mechanisms to present event notifications and processing results. To this end, it provides the Analytics Dashboard to visualize the processed data for decision making.
The Analytics Dashboard is used to create customizable dashboards for analytics data visualization. Dashboard creation is wizard driven: you can use gadgets/widgets such as Line, Bar, and Arc charts to get data from analytics tables, and add them to a structured grid layout to provide an overall view of the analysis. For more information, see Presenting Data.
Using the Analytics Dashboard
WSO2 DAS provides an Analytics Dashboard for creating customizable dashboards for visualization of analytical data. Follow the steps below to present data using the Analytics Dashboard.
- Log in to the DAS Management Console, if you are not already logged in.
- Click Main, and then click Analytics Dashboard in the Dashboard menu.
Creating a Dashboard
You can create a new Dashboard to present the data of the above analytics as shown in the example below. Follow the steps below to create a new Dashboard in the Analytics Dashboard. For more information, see Adding a Dashboard.
Click the CREATE DASHBOARD button in the top navigation bar to create a new dashboard.
Enter a Title and a Description for the new dashboard as shown below, and click Next.
Select a layout to place its components as shown below.
Click Select. You view a layout editor with the chosen layout blocks marked using dashed lines as shown below. At this point, the dashboard is persisted to disk.
Open the top menu, and click Dashboards to view the new dashboard added to the list of all available dashboards as shown below.
Creating a Gadget
Follow the steps below to create a Bar Chart gadget to visualize the analyzed data by selecting the event stream created above (i.e. usageStream) as the data source, and to add it to the Power_Dashboard dashboard you created above. For more information, see Adding Gadgets to a Dashboard.
- Log in to the Analytics Dashboard, if you are not already logged in.
Click the CREATE GADGET icon in the top menu bar.
Select usageStream as the input data source as shown below.
Select the Chart Type, and enter the preferred x and y axes and additional parameters based on the selected chart type as shown below.
Click Add to Gadget Store to generate a gadget with the information you provided.
Click the corresponding Design button of the dashboard to which you want to add a gadget as shown below.
Click the gadget browser icon in the side menu bar.
You view the new gadget listed in the gadget browser. If not, search for it using its name.
Click on the new gadget, drag it out, and place it in the preferred grid of the selected layout in the dashboard editor as shown below.
Where to go next
This guide has given you a first look at WSO2 DAS and its main functionalities.
- For more information on the features and architecture of WSO2 DAS, see About DAS.
- For more information on how to download, install, run and get started with WSO2 DAS, see Getting Started.
- For more information on the main functionalities of WSO2 DAS, see User Guide.
- For more information on various product deployment scenarios and other topics useful for system administrators, see Admin Guide.
- For more information on several business use case samples of WSO2 DAS, see Samples.