Introduction
In the previous tutorials, we looked at the core Siddhi functionalities including event ingestion, publishing and many forms of processing such as preprocessing, correlation, KPI analysis, trend analysis etc.
In this tutorial, let's move on to the Machine Learning capabilities offered by WSO2 Stream Processor, including real-time prediction, online machine learning, and anomaly detection. Let's look at how real-time predictions can be made in WSO2 SP using a static, pre-trained model that is serialized in PMML.
The factory foreman of the Sweet Factory needs to predict the nature of the next shipment of sugar syrup based on the shipments he has received so far. A predictive model has been trained on two inputs: the temperature and the density of the latest shipment. Using these, a prediction is required on whether the shipment meets his requirements before it is dispatched to the factory. Let's assume that this pre-trained model has been exported in PMML serialization and that it is available in the system. To build and train a model, you can use this PMML sample.
Tutorial steps
Let's get started!
Let's add an input event stream definition to capture the events generated by the supplier before shipping the sugar syrup to the sweet factory.
define stream SugarSyrupDataStream (temperature double, density double);
Now let's define the output stream. To include a prediction on whether the shipment will be acceptable or not in the output, this definition must include an attribute for the prediction as shown below.
define stream PredictedSugarSyrupDataStream (nextTemperature double, nextDensity double, decision boolean);
As you have learnt from previous tutorials, a query that reads from an input stream begins as follows.
from SugarSyrupDataStream
For this scenario, you need to update it as follows.
To enable the PMML extension, append the #pmml:predict() stream processor to the input stream as shown below.
from SugarSyrupDataStream#pmml:predict()
To access the pre-trained PMML model via which the predictions are made, specify the path as follows.
from SugarSyrupDataStream#pmml:predict( "/home/user/decision-tree.pmml" )
Let's also add the attributes that are needed by the model for prediction.
from SugarSyrupDataStream#pmml:predict("/home/user/decision-tree.pmml", temperature, density)
Based on the model definition, the output attributes can differ. Here, you have defined the model so that it can return a prediction on whether the shipment can be accepted, based on the given temperature and density.
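To make the shape of the output concrete, the following Python sketch imitates what the paragraph above describes: each incoming event passes through unchanged and the model's output attributes are added to it. It is a conceptual illustration only, not the extension's implementation. The predict_stream helper and the stand_in_model rule (including its thresholds) are hypothetical; a real PMML model defines its own output fields and logic.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Event = Dict[str, object]


def predict_stream(
    events: Iterable[Event],
    model: Callable[[Dict[str, float]], Event],
    inputs: List[str],
) -> Iterator[Event]:
    """Yield each event extended with the attributes returned by the model."""
    for event in events:
        # Pass only the attributes named in the query to the model.
        features = {name: event[name] for name in inputs}
        # The output event carries the input attributes plus the model outputs.
        yield {**event, **model(features)}


def stand_in_model(features: Dict[str, float]) -> Event:
    """Hypothetical stand-in for the pre-trained model; thresholds are invented."""
    acceptable = (
        20.0 <= features["temperature"] <= 40.0 and features["density"] >= 1.2
    )
    return {"decision": acceptable}


readings = [
    {"temperature": 30.0, "density": 1.3},
    {"temperature": 50.0, "density": 1.3},
]
for enriched in predict_stream(readings, stand_in_model, ["temperature", "density"]):
    print(enriched)
```

If the model returned more than one field, every one of them would appear in the enriched event, which is why the attributes available to the select clause depend on the model definition.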
Let's route this output to the output stream as shown below.
from SugarSyrupDataStream#pmml:predict("/home/user/decision-tree.pmml", temperature, density)
select *
insert into PredictedSugarSyrupDataStream;
The completed Siddhi application is as follows.
@App:name('SugerSyrupPredictionApp')

@source(type='http', receiver.url='http://localhost:5006/SugarSyrupEP', @map(type = 'json'))
define stream SugarSyrupDataStream (temperature double, density double);

@sink(type='log', prefix='Predicted next sugar syrup shipment:')
define stream PredictedSugarSyrupDataStream (nextTemperature double, nextDensity double, decision boolean);

from SugarSyrupDataStream#pmml:predict("/home/user/decision-tree.pmml", temperature, density)
select *
insert into PredictedSugarSyrupDataStream;
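To try the application, you can post a reading to the HTTP source. The following Python sketch is one way to do that with the standard library only. It assumes that the Siddhi application is deployed and listening on the receiver.url shown above, and that the JSON mapper uses its default format, in which the attributes are wrapped in an "event" object. The function names and the sample values are illustrative.

```python
import json
import urllib.request

# receiver.url taken from the @source annotation of the Siddhi application.
ENDPOINT = "http://localhost:5006/SugarSyrupEP"


def build_event(temperature: float, density: float) -> bytes:
    """Wrap one reading in the {"event": {...}} envelope (assumed default JSON mapping)."""
    payload = {"event": {"temperature": temperature, "density": density}}
    return json.dumps(payload).encode("utf-8")


def send_event(temperature: float, density: float, url: str = ENDPOINT) -> int:
    """POST one reading to the HTTP source and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=build_event(temperature, density),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status


# Example call (requires the Siddhi application to be running):
#   send_event(30.5, 1.2)
```

If the event is accepted, the log sink should print the prediction with the prefix defined in the application.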