Siddhi enables users to perform linear regression on real-time data streams. The regress function takes a dependent event stream (Y) and any number of independent event streams (X1, X2, ..., Xn), and returns all coefficients of the regression equation:

Y = β0 + β1·X1 + β2·X2 + ... + βn·Xn + ε
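To make the idea of fitting Y against X1, ..., Xn concrete, the following Python sketch computes the coefficients of an ordinary least squares fit with an intercept. It is an illustration only and not Siddhi's implementation: the function name `fit_ols`, the sample data, and the choice of solving the normal equations by Gauss-Jordan elimination are all assumptions made for this example.

```python
def fit_ols(xs, ys):
    """Return [beta0, beta1, ..., betan] that minimise the squared error.

    xs: list of rows, each row holding the n independent values of one event.
    ys: list of dependent values, one per event.
    (Hypothetical helper for illustration; not part of Siddhi.)
    """
    # Design matrix with a leading 1 so that beta0 acts as the intercept.
    rows = [[1.0] + [float(v) for v in row] for row in xs]
    k = len(rows[0])

    # Augmented normal equations [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * y for r, y in zip(rows, ys))]
         for i in range(k)]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("independent variables are linearly dependent")
        a[col], a[pivot] = a[pivot], a[col]
        a[col] = [v / a[col][col] for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]

    # The last column now holds the solved coefficients.
    return [a[i][k] for i in range(k)]


# Noise-free sample events generated from Y = 1 + 2*X1 + 3*X2.
xs = [[1, 1], [2, 1], [3, 2], [4, 3], [5, 5]]
ys = [1 + 2 * x1 + 3 * x2 for x1, x2 in xs]
betas = fit_ols(xs, ys)  # approximately [1.0, 2.0, 3.0]
```

With two independent variables the function returns three values, matching the n+1 β coefficients that regress reports.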

Input Parameters

| Parameter | Required / Optional | Description |
|---|---|---|
| Calculation Interval | Optional | The frequency of regression calculation. Default value: 1 (i.e. at every event) |
| Batch Size | Optional | The maximum number of events used for a regression calculation. Default value: 1,000,000,000 events |
| Confidence Interval | Optional | Confidence interval to be used for the regression calculation. Default value: 0.95 |
| Y Stream | Required | Data stream of the dependent variable |
| X Stream(s) | Required | Data stream(s) of the independent variable(s) |
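How Calculation Interval and Batch Size interact can be pictured with a small Python sketch. This is a model of the documented behaviour, not Siddhi code: the class name `RegressionWindow` is hypothetical, and the sketch assumes that once the batch is full the oldest event is discarded, which the parameter description does not state.

```python
from collections import deque


class RegressionWindow:
    """Hypothetical model of the two optional parameters (not Siddhi code)."""

    def __init__(self, calc_interval=1, batch_size=1_000_000_000):
        self.calc_interval = calc_interval
        # ASSUMPTION: when the batch is full, the oldest event is dropped.
        self.events = deque(maxlen=batch_size)
        self.seen = 0

    def add(self, event):
        """Store the event; return True when a recalculation is due."""
        self.events.append(event)
        self.seen += 1
        return self.seen % self.calc_interval == 0


# Recalculate every 10 events, using at most the latest 100 events.
window = RegressionWindow(calc_interval=10, batch_size=100)
triggers = [n for n in range(1, 251) if window.add({"Y": n, "X1": n})]
```

After 250 events, `triggers` holds the event counts at which a regression would have been recalculated (10, 20, ..., 250), and `window.events` holds only the 100 most recent events.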

 

Output Parameters

| Parameter | Name | Description |
|---|---|---|
| Standard Error | stdError | Standard error of the regression equation |
| β coefficients | beta0, beta1, beta2, etc. | n+1 β coefficients, where n is the number of X parameters |
| Input Stream Data | Name given in the input stream | All attributes sent in the input stream |

The regress function nullifies any β coefficient that fails the t-test at the given confidence interval. Any output parameter can be accessed using the 'Name' listed for it above.

Examples

The following query submits a calculation interval (every 10 events), a batch size (100,000 events), a confidence interval (0.95), a dependent input stream (Y) and 3 independent input streams (X1, X2, X3) that will be used to perform linear regression between Y and all X streams.

from StockExchangeStream#transform.timeseries:regress(10, 100000, 0.95, Y, X1, X2, X3)
select *
insert into StockForecaster

When executed, the above query returns the standard error of the regression equation (ε), 4 β coefficients (β0, β1, β2, β3) and all the attributes available in the input stream. Using these results, the user can build a relationship between Y and all Xs (the regression equation) as follows:

Y = β0 + β1·X1 + β2·X2 + β3·X3 + ε
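As an illustration of using the returned coefficients, the Python sketch below evaluates the fitted relationship for one new set of X values. The function name `predict` and the coefficient values are hypothetical and chosen only for this example; they are not output from Siddhi.

```python
def predict(betas, xs):
    """Evaluate beta0 + beta1*x1 + ... + betan*xn for one event.

    betas: [beta0, beta1, ..., betan] as returned by the regression.
    xs:    [x1, ..., xn] values of the independent variables.
    (Hypothetical helper for illustration; not part of Siddhi.)
    """
    if len(betas) != len(xs) + 1:
        raise ValueError("expected one more coefficient than independent values")
    return betas[0] + sum(b * x for b, x in zip(betas[1:], xs))


# Made-up coefficient values for beta0, beta1, beta2, beta3.
betas = [0.5, 2.0, -1.0, 0.25]
forecast = predict(betas, [10.0, 4.0, 8.0])  # 0.5 + 20.0 - 4.0 + 2.0
```

The sketch assumes every coefficient is numeric; a coefficient that has been nullified by the t-test would need to be handled before calling `predict`.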