Introduction

This sample demonstrates how to run linear regression using the Timeseries Toolbox. It uses the Event Simulator for inputs and the logger publisher to log the outputs to the CEP console.

The data used for the regression comes from a baseball statistics dataset. The dependent (response) variable is the salary of the baseball player, and the independent (predictor) variables are his performance statistics: rbi, walks, strikeouts and errors.
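Written out, this setup corresponds to a multiple linear regression model of the following general form. The notation is the standard textbook one and is an assumption here, not taken from the sample: beta_0 is the intercept, beta_1 to beta_4 are the coefficients of the four statistics, and epsilon is the error term.

```latex
\[
\text{salary} = \beta_0 + \beta_1\,\text{rbi} + \beta_2\,\text{walks}
              + \beta_3\,\text{strikeouts} + \beta_4\,\text{errors} + \varepsilon
\]
```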

The execution plan used in this sample is as follows:

Code Block
from baseballData#timeseries:regress(2, 10000, 0.95, salary, rbi, walks, strikeouts, errors)
select *
insert into regResults;

 

The inputs to the regression function are as follows.

  • Calculation Interval – 2

  • Batch size – 10,000

  • Confidence Interval – 0.95

  • Y (dependent) variable – salary

  • X (independent) variables – rbi, walks, strikeouts, errors

The output of the query is the set of coefficients of the regression equation for the accumulated dataset, recalculated at every second event. The output attributes include the input variable values, the beta coefficient for each X variable, beta zero (the intercept) and the standard error.
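To make the roles of the calculation interval and the batch size more concrete, the following is a minimal Python sketch of a regression that is recomputed every N events over at most the last M events. It is an illustration only, not the Timeseries Toolbox implementation: the names `StreamingRegression`, `fit_ols` and `solve_linear_system` are hypothetical, the window is assumed to be a sliding one, the standard error is assumed to be the residual standard error, and the confidence interval parameter is not modelled.

```python
from collections import deque
from typing import Deque, List, Optional, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: regressors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(ys: Sequence[float],
            xs: Sequence[Sequence[float]]) -> Tuple[List[float], float]:
    """Ordinary least squares via the normal equations.

    Returns (betas, std_error), where betas[0] is the intercept (beta zero).
    std_error is the residual standard error (an assumed definition).
    """
    rows = [[1.0] + list(x) for x in xs]  # prepend the intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    betas = solve_linear_system(xtx, xty)
    residuals = [y - sum(b * v for b, v in zip(betas, r))
                 for r, y in zip(rows, ys)]
    dof = len(ys) - k  # degrees of freedom
    if dof > 0:
        std_error = (sum(e * e for e in residuals) / dof) ** 0.5
    else:
        std_error = float("nan")
    return betas, std_error


class StreamingRegression:
    """Refit every `calc_interval` events over the last `batch_size` events."""

    def __init__(self, calc_interval: int, batch_size: int) -> None:
        self.calc_interval = calc_interval
        self.window: Deque[Tuple[float, Tuple[float, ...]]] = deque(maxlen=batch_size)
        self.count = 0

    def on_event(self, y: float,
                 xs: Sequence[float]) -> Optional[Tuple[List[float], float]]:
        """Record one event; return (betas, std_error) when a fit is due."""
        self.window.append((y, tuple(xs)))
        self.count += 1
        if self.count % self.calc_interval != 0:
            return None  # not a calculation event
        if len(self.window) < len(xs) + 1:
            return None  # fewer observations than parameters
        try:
            return fit_ols([e[0] for e in self.window],
                           [e[1] for e in self.window])
        except ValueError:
            return None  # collinear data, no unique solution yet
```

With `StreamingRegression(calc_interval=2, batch_size=10000)`, each call to `on_event(salary, (rbi, walks, strikeouts, errors))` would return a result only on every second event, and only once enough non-collinear observations have accumulated.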

For more details on the input and output parameters of the regression function, refer to https://docs.wso2.com/display/CEP400/Regression

Prerequisites

See Prerequisites in CEP Samples Setup page.

Building the sample

Start the WSO2 CEP server with the sample configuration numbered 0116. For instructions, see Starting sample CEP configurations. This sample configuration does the following:

  • Points the default Axis2 repo to <CEP_HOME>/sample/artifacts/0116 (by default, the Axis2 repo is <CEP_HOME>/repository/deployment/server).

Executing the sample

  1. Log in to the CEP management console, which is located at https://localhost:9443/carbon.

  2. Go to Tools -> Event Simulator. Under the 'Multiple Events' section, you can see the 'BaseballData.csv' file listed, which contains the sample data. Click 'play' to start sending sample events from the file.

  3. View the output events in the CEP console. This sample uses the logger adaptor to log output events to the console.

For example, given below is a screenshot of the final regression output for this data.

[Screenshot: final regression output]