The following Siddhi extensions can be used to process events with WSO2 DAS.

1. FORMAT FOR FUNCTIONS

functionName

<returnType> functionName(<paramType> p1) (FOR ONE RETURN TYPE & ONE PARAMETER TYPE)

<returnType|returnType> functionName(<paramType|paramType> p1, <paramType|paramType> p2) (FOR ONE RETURN TYPE & MANY PARAMETER TYPES)

[<returnType|returnType> r1, <returnType|returnType> r2] functionName(<paramType|paramType> p1) (FOR MANY RETURN TYPES & ONE PARAMETER TYPE)

  • Extension Type: Function | Aggregate Function | Stream Function | Stream Processor | Aggregate Stream Function | Window
  • Description
  • Parameter: p1
  • Return Parameter: r1
  • Example

...

Syntax: <double> abs(<float|double> p1)
Extension Type: Function
Description: Returns the absolute value of p1. This function wraps the java.lang.Math.abs() function.
Examples:

Both the following queries return 3 since the absolute value of both 3 and -3 is 3.

  • abs(3)
  • abs(-3)

acos

Syntax: <double> acos(<float|double> p1)
Extension Type: Function
Description: If -1 <= p1 <= 1, this function returns the arc-cosine (inverse cosine) of p1. If not, it returns NULL. The return value is in radians. This function wraps the java.lang.Math.acos() function.
Example: acos(0.5) returns 1.0471975511965979.
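
Because acos is described as a wrapper around java.lang.Math.acos(), its documented behaviour can be reproduced in plain Java. The following is a minimal sketch, not the extension's source code: the class and method names are hypothetical, and returning null for out-of-range input is an assumption taken from the description above (Math.acos() itself returns NaN in that case).

```java
// Hypothetical helper that mirrors the documented acos behaviour.
final class AcosSketch {
    // Returns the arc-cosine in radians, or null when p1 is outside [-1, 1].
    static Double acos(double p1) {
        if (p1 < -1.0 || p1 > 1.0) {
            return null; // assumption: NULL is reported instead of NaN
        }
        return Math.acos(p1);
    }

    public static void main(String[] args) {
        System.out.println(acos(0.5)); // approximately pi / 3
        System.out.println(acos(2.0)); // null
    }
}
```

The same range check would apply to asin, which has the same domain.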

...

Syntax: <double> asin(<float|double> p1)
Extension Type: Function
Description: If -1 <= p1 <= 1, this function returns the arc-sine (inverse sine) of p1. If not, it returns NULL. The return value is in radians. This function wraps the java.lang.Math.asin() function.
Example: asin(0.5) returns 0.5235987755982989.

atan

Syntax: <double> atan(<int|long|float|double> p1)
Extension Type: Function
Description: Returns the arc-tangent (inverse tangent) of p1. The return value is in radians. This function wraps the java.lang.Math.atan() function.
Example: atan(6d) returns 1.4056476493802699.

...

Syntax: <double> cbrt(<int|long|float|double> p1)
Extension Type: Function
Description: Returns the cube root of p1. This function wraps the java.lang.Math.cbrt() function.
Example: cbrt(17d) returns 2.5712815906582356.

e

Syntax: <double> e()
Extension Type: Function
Description: Returns the java.lang.Math.E constant, which is the closest double value to e (the base of the natural logarithms).
Example: e() returns 2.718281828459045.

...

Syntax: <boolean> isInfinite(<float|double> p1)
Extension Type: Function
Description: Returns true if p1 is infinitely large in magnitude, and false otherwise. This function wraps the java.lang.Float.isInfinite() and java.lang.Double.isInfinite() functions.
Example: isInfinite(java.lang.Double.POSITIVE_INFINITY) returns true.

isNan

Syntax: <boolean> isNan(<float|double> p1)
Extension Type: Function
Description: Returns true if p1 is a NaN (Not-a-Number) value, and false otherwise. This function wraps the java.lang.Float.isNaN() and java.lang.Double.isNaN() functions.
Example: isNan(java.lang.Math.log(-12d)) returns true.
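
The two checks above can be tried directly against the Java methods they are said to wrap. This is only a sketch: the class and its static methods are hypothetical names chosen to match the Siddhi function names.

```java
// Hypothetical wrappers around the java.lang.Double checks named in the descriptions above.
final class FloatingPointChecks {
    static boolean isNan(double p1) {
        return Double.isNaN(p1);
    }

    static boolean isInfinite(double p1) {
        return Double.isInfinite(p1);
    }

    public static void main(String[] args) {
        double logOfNegative = Math.log(-12d); // the logarithm of a negative number is NaN
        System.out.println(isNan(logOfNegative));                 // true
        System.out.println(isInfinite(Double.POSITIVE_INFINITY)); // true
        System.out.println(isInfinite(logOfNegative));            // false: NaN is not infinite
    }
}
```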

...

Syntax: <double> pi()
Extension Type: Function
Description: Returns the java.lang.Math.PI constant, which is the closest double value to pi (i.e. the ratio of the circumference of a circle to its diameter).
Example: pi() always returns 3.141592653589793.

power

Syntax: <double> power(<int|long|float|double> value, <int|long|float|double> toPower)
Extension Type: Function
Description: Returns value raised to the power of toPower.
Example: power(5.6d, 3.0d) returns 175.61599999999996.

...

Syntax: <double> sin(<int|long|float|double> p1)
Extension Type: Function
Description: Returns the sine of p1 (p1 is in radians). This function wraps the java.lang.Math.sin() function.
Example: sin(6d) returns -0.27941549819892586.

sinh

Syntax: <double> sinh(<int|long|float|double> p1)
Extension Type: Function
Description: Returns the hyperbolic sine of p1 (p1 is in radians). This function wraps the java.lang.Math.sinh() function.
Example: sinh(6d) returns 201.71315737027922.

sqrt

Syntax: <double> sqrt(<int|long|float|double> p1)
Extension Type: Function
Description: Returns the square root of p1. This function wraps the java.lang.Math.sqrt() function.
Example: sqrt(4d) returns 2.0.

tan

Syntax: <double> tan(<int|long|float|double> p1)
Extension Type: Function
Description: Returns the tangent of p1 (p1 is in radians). This function wraps the java.lang.Math.tan() function.
Example: tan(6d) returns -0.29100619138474915.
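
The trigonometric functions above take their argument in radians, so an angle measured in degrees has to be converted first. The sketch below illustrates this with the underlying java.lang.Math calls; the class and the sinOfDegrees helper are hypothetical and are not part of the extension.

```java
// Illustrative only: shows the radians convention of java.lang.Math.
final class RadiansSketch {
    // Sine of an angle given in degrees, converted to radians first.
    static double sinOfDegrees(double degrees) {
        return Math.sin(Math.toRadians(degrees));
    }

    public static void main(String[] args) {
        System.out.println(Math.sin(6d));      // 6 is interpreted as 6 radians
        System.out.println(sinOfDegrees(30d)); // approximately 0.5
    }
}
```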

...

Syntax: <string> charAt(<string> str, <int> index)
Extension Type: Function
Description: Returns the char value at the specified index as a string value.
Example: charAt("WSO2", 1) returns "S".

coalesce

Syntax: <int|long|float|double|string|boolean> coalesce(<int|long|float|double|string|boolean> arg1, <int|long|float|double|string|boolean> arg2, ..., <int|long|float|double|string|boolean> argN)
Extension Type: Function
Description: Returns the value of the first of its input parameters that is not null.
Parameters: This function accepts any number of parameters. The parameters can be of different types.
Return Type: This is the same as the type of the first input parameter that is not null.
Examples:
  • coalesce("123", null, "789") returns "123".
  • coalesce(null, "BBB", "CCC") returns "BBB".
  • coalesce(null, null, null) returns null.
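
The coalesce semantics described above amount to a left-to-right scan for the first non-null argument. A minimal Java sketch of that logic follows; the class is hypothetical, and Object is used as the parameter type only to reflect that the arguments can be of different types.

```java
// Hypothetical illustration of the coalesce semantics; not the extension's implementation.
final class CoalesceSketch {
    // Returns the first argument that is not null, or null if every argument is null.
    static Object coalesce(Object... args) {
        for (Object arg : args) {
            if (arg != null) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(coalesce("123", null, "789")); // 123
        System.out.println(coalesce(null, "BBB", "CCC")); // BBB
        System.out.println(coalesce(null, null, null));   // null
    }
}
```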

...

Syntax: <string> lower(<string> str)
Extension Type: Function
Description: Converts the uppercase letters in the str input string to the equivalent lowercase letters.
Example: lower("WSO2 cep ") returns "wso2 cep ".

regexp

Syntax: <boolean> regexp(<string> str, <string> regex)
Extension Type: Function
Description: Returns true if the given string (i.e. str) matches the given regular expression (i.e. regex). Returns false if the string does not match the regular expression.
Example: regexp("WSO2 abcdh", "WSO(.*h)") returns true.

repeat

Syntax: <string> repeat(<string> str, <int> times)
Extension Type: Function
Description: Repeats the specified string (i.e. str) for the specified number of times (i.e. times).
Example: repeat("StRing 1", 3) returns "StRing 1StRing 1StRing 1".

...

Syntax: <string> reverse(<string> str)
Extension Type: Function
Description: Returns the characters of str in reverse order.
Example: reverse("Hello World") returns "dlroW olleH".

strcmp

Syntax: <int> strcmp(<string> str, <string> compareTo)
Extension Type: Function
Description: Compares str with compareTo lexicographically. The result is 0 if the two strings are equal, negative if str precedes compareTo, and positive if str follows it.
Examples:
  • strcmp("Hello", "Hello") returns 0.
  • strcmp("AbCDefghiJ KLMN", "Hello") returns -7.
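
The values in the strcmp examples match what Java's String.compareTo() produces, which is the difference between the first pair of characters that differ. The sketch below assumes that strcmp behaves like String.compareTo(); this page does not state which method the extension actually uses, and the class name is hypothetical.

```java
// Hypothetical sketch: assumes strcmp follows String.compareTo() semantics.
final class StrcmpSketch {
    static int strcmp(String str, String compareTo) {
        return str.compareTo(compareTo);
    }

    public static void main(String[] args) {
        System.out.println(strcmp("Hello", "Hello"));           // 0
        System.out.println(strcmp("AbCDefghiJ KLMN", "Hello")); // -7, i.e. 'A' (65) minus 'H' (72)
    }
}
```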

...

Syntax: <double, double, string> geocode(<string> location)
Extension Type: Stream Processor
Description: Transforms a location to its geo-coordinates (latitude and longitude) and formatted address.
Example: geocode("duplication rd") returns the following data with the attribute names latitude, longitude, and formattedAddress respectively.
6.8995244d, 79.8556202d, "R A De Mel Mawatha, Colombo, Sri Lanka"

...

regex

This extension provides basic RegEx execution capabilities to Siddhi. Following are the functions of the RegEx extension.

find

Syntax: <bool> find(<string> regex, <string> inputSequence)
Extension Type: Function
Description: This method attempts to find the next sub-sequence of the inputSequence that matches the regex pattern. It returns true if such a sub-sequence exists, or returns false otherwise.
Examples:
  • find("\d\d(.*)WSO2", "21 products are produced by WSO2 currently") returns true.
  • find("\d\d(.*)WSO2", "21 products are produced currently") returns false.

Syntax: <bool> find(<string> regex, <string> inputSequence, <int> startingIndex)
Extension Type: Function
Description: This method attempts to find the next sub-sequence of the inputSequence that matches the regex pattern, starting from the given index (i.e. startingIndex). It returns true if such a sub-sequence exists, or returns false otherwise.
Examples:
  • find("\d\d(.*)WSO2", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 30) returns true.
  • find("\d\d(.*)WSO2", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 35) returns false.
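
The effect of startingIndex can be reproduced with java.util.regex, whose Matcher.find(int) starts the search at a given offset. The sketch below assumes the extension follows java.util.regex semantics; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of find with and without a starting index.
final class RegexFindSketch {
    static boolean find(String regex, String inputSequence) {
        return Pattern.compile(regex).matcher(inputSequence).find();
    }

    // Searches only from startingIndex (0-based) onwards.
    static boolean find(String regex, String inputSequence, int startingIndex) {
        return Pattern.compile(regex).matcher(inputSequence).find(startingIndex);
    }

    public static void main(String[] args) {
        String input = "21 products are produced within 10 years by WSO2 currently by WSO2 employees";
        // "10" starts at index 32, so a search from 30 still sees two digits followed by WSO2.
        System.out.println(find("\\d\\d(.*)WSO2", input, 30)); // true
        // From index 35 onwards there are no two consecutive digits left.
        System.out.println(find("\\d\\d(.*)WSO2", input, 35)); // false
    }
}
```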

group

Syntax: <string> group(<string> regex, <string> inputSequence, <int> groupId)
Extension Type: Function
Description: Returns the input sub-sequence captured by the given group during the previous match operation. Returns null if no sub-sequence was found during the previous match operation. For more information about the match operation, see matches.
Example: group("(\d\d)(.*)(WSO2.*)", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 3) returns "WSO2 employees".
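
The group example returns "WSO2 employees" rather than a longer string because the middle group (.*) is greedy and consumes as much as possible before the last occurrence of WSO2. The sketch below reproduces this with java.util.regex; the helper is hypothetical, and using Matcher.find() as the preceding match operation is an assumption.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of capturing-group extraction.
final class RegexGroupSketch {
    // Returns the text captured by groupId, or null when the pattern is not found.
    static String group(String regex, String inputSequence, int groupId) {
        Matcher matcher = Pattern.compile(regex).matcher(inputSequence);
        return matcher.find() ? matcher.group(groupId) : null;
    }

    public static void main(String[] args) {
        String input = "21 products are produced within 10 years by WSO2 currently by WSO2 employees";
        System.out.println(group("(\\d\\d)(.*)(WSO2.*)", input, 1)); // 21
        System.out.println(group("(\\d\\d)(.*)(WSO2.*)", input, 3)); // WSO2 employees
    }
}
```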

lookingAt

Syntax: <bool> lookingAt(<string> regex, <string> inputSequence)
Extension Type: Function
Description: This method attempts to match the inputSequence against the regex pattern, starting at the beginning of the inputSequence. It returns true if the match succeeds, or returns false otherwise.
Examples:
  • lookingAt("\d\d(.*)WSO2", "21 products are produced by WSO2 currently in Sri Lanka") returns true.
  • lookingAt("WSO2(.*)middleware(.*)", "sample test string and WSO2 is situated in trace and its a middleware company") returns false.

matches

Syntax: <bool> matches(<string> regex, <string> inputSequence)
Extension Type: Function
Description: This method attempts to match the entire inputSequence against the regex pattern. It returns true if the entire inputSequence matches, or returns false otherwise.
Examples:
  • matches("WSO2(.*)middleware(.*)", "WSO2 is situated in trace and its a middleware company") returns true.
  • matches("WSO2(.*)middleware", "WSO2 is situated in trace and its a middleware company") returns false.
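
The difference between lookingAt and matches is easiest to see side by side: both anchor the pattern at the start of the input, but only matches also requires the pattern to consume the input up to its end. The sketch below uses the java.util.regex methods of the same names and assumes the extension follows their semantics; the class is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch contrasting prefix matching with whole-input matching.
final class RegexAnchoringSketch {
    // True when a prefix of inputSequence matches regex.
    static boolean lookingAt(String regex, String inputSequence) {
        return Pattern.compile(regex).matcher(inputSequence).lookingAt();
    }

    // True only when the whole of inputSequence matches regex.
    static boolean matches(String regex, String inputSequence) {
        return Pattern.compile(regex).matcher(inputSequence).matches();
    }

    public static void main(String[] args) {
        String input = "WSO2 is situated in trace and its a middleware company";
        // The pattern stops at "middleware", so the trailing " company" is left unmatched.
        System.out.println(lookingAt("WSO2(.*)middleware", input)); // true
        System.out.println(matches("WSO2(.*)middleware", input));   // false
    }
}
```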

time

This extension provides time related functionality to Siddhi such as getting current time, current date, manipulating/formatting dates, etc. Following are the functions of the time extension.

currentDate

Syntax: <string> currentDate()
Extension Type: Function
Description: Returns the current system date in the yyyy-MM-dd format.
Example: currentDate() returns 2015-08-20.

currentTime

Syntax: <string> currentTime()
Extension Type: Function
Description: Returns the current system time in the HH:mm:ss format.
Example: currentTime() returns 13:15:10.

currentTimestamp

Syntax: <string> currentTimestamp()
Extension Type: Function
Description: Returns the current system timestamp in the yyyy-MM-dd HH:mm:ss format.
Example: currentTimestamp() returns 2015-08-20 13:15:10.

dateAdd

The common parameters of this function are described below.

  • dateValue : A value of date. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • expr : The amount by which the selected part of the date format should be incremented. e.g., 2, 5, 10, etc.
  • unit : The part of the date format that needs to be manipulated. e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
  • dateFormat : The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds : The date value in milliseconds (from the epoch). e.g., 1415712224000L

Syntax: <string> dateAdd(<string> dateValue, <long> expr, <string> unit, <string> dateFormat)
Extension Type: Function
Description: Returns the specified date and time with the selected unit of the specified dateValue incremented by the given amount (i.e. expr).
Example: dateAdd("2014-11-11 13:23:44", 2, 'year', "yyyy-MM-dd HH:mm:ss") returns "2016-11-11 13:23:44".

Syntax: <string> dateAdd(<string> dateValue, <long> expr, <string> unit)
Extension Type: Function
Description: Returns the specified date and time with the selected unit of the specified dateValue incremented by the given amount (i.e. expr).
Example: dateAdd("2014-11-11 13:23:44", 2, 'year') returns "2016-11-11 13:23:44".

Syntax: <string> dateAdd(<long> timestampInMilliseconds, <long> expr, <string> unit)
Extension Type: Function
Description: Returns the specified time stamp with the selected unit of the specified timestampInMilliseconds incremented by the given amount (i.e. expr).
Example: dateAdd(1415692424000L, 2, 'year') returns "2016-11-11 13:23:44".
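
The four-parameter form of dateAdd can be approximated with java.time: parse dateValue with dateFormat, shift the selected unit, and format the result with the same pattern. This is a minimal sketch under stated assumptions rather than the extension's code. The class is hypothetical, the unit names are treated as case-insensitive, and only date-time patterns are handled (a date-only value would need LocalDate instead).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical sketch of dateAdd(dateValue, expr, unit, dateFormat).
final class DateAddSketch {
    static String dateAdd(String dateValue, long expr, String unit, String dateFormat) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(dateFormat);
        LocalDateTime date = LocalDateTime.parse(dateValue, formatter);
        LocalDateTime result;
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "YEAR":    result = date.plusYears(expr);      break;
            case "QUARTER": result = date.plusMonths(3 * expr); break;
            case "MONTH":   result = date.plusMonths(expr);     break;
            case "WEEK":    result = date.plusWeeks(expr);      break;
            case "DAY":     result = date.plusDays(expr);       break;
            case "HOUR":    result = date.plusHours(expr);      break;
            case "MINUTE":  result = date.plusMinutes(expr);    break;
            case "SECOND":  result = date.plusSeconds(expr);    break;
            default: throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
        return result.format(formatter);
    }

    public static void main(String[] args) {
        // Prints 2016-11-11 13:23:44
        System.out.println(dateAdd("2014-11-11 13:23:44", 2, "year", "yyyy-MM-dd HH:mm:ss"));
    }
}
```

Passing a negative expr to this sketch moves the date backwards, which corresponds to what dateSub does with a positive amount.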

dateSub

The common parameters of this function are described below.

  • dateValue : A value of date. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • expr : The amount by which the selected part of the date format should be reduced. e.g., 2, 5, 10, etc.
  • unit : The part of the date format that needs to be manipulated. e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
  • dateFormat : The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds : The date value in milliseconds (from the epoch). e.g., 1415712224000L

Syntax: <string> dateSub(<string> dateValue, <long> expr, <string> unit, <string> dateFormat)
Extension Type: Function
Description: Returns the specified date and time with the selected unit of the specified dateValue reduced by the given amount (i.e. expr).
Example: dateSub("2014-11-11 13:23:44", 2, 'year', "yyyy-MM-dd HH:mm:ss") returns "2012-11-11 13:23:44".

Syntax: <string> dateSub(<string> dateValue, <long> expr, <string> unit)
Extension Type: Function
Description: Returns the specified date and time with the selected unit of the specified dateValue reduced by the given amount (i.e. expr).
Example: dateSub("2014-11-11 13:23:44", 2, 'year') returns "2012-11-11 13:23:44".

Syntax: <string> dateSub(<long> timestampInMilliseconds, <long> expr, <string> unit)
Extension Type: Function
Description: Returns the specified time stamp with the selected unit of the specified timestampInMilliseconds reduced by the given amount (i.e. expr).
Example: dateSub(1415692424000L, 2, 'year') returns 1352620424000.

dateDiff

The common parameters of this function are described below.

  • dateValue1 : A value of date. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • dateValue2 : A value of date. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • dateFormat1 : The date format of dateValue1. e.g., yyyy-MM-dd HH:mm:ss.SSS
  • dateFormat2 : The date format of dateValue2. e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds1 : A date value in milliseconds (from the epoch). e.g., 1415712224000L
  • timestampInMilliseconds2 : A date value in milliseconds (from the epoch). e.g., 1415712224000L

Syntax: <int> dateDiff(<string> dateValue1, <string> dateValue2, <string> dateFormat1, <string> dateFormat2)
Extension Type: Function
Description: Returns the number of days between the two dates specified (i.e. dateValue1 and dateValue2).
Example: dateDiff('2014-11-11 13:23:44', '2014-11-9 13:23:44', 'yyyy-MM-dd HH:mm:ss', 'yyyy-MM-dd HH:mm:ss') returns 2.

Syntax: <int> dateDiff(<string> dateValue1, <string> dateValue2)
Extension Type: Function
Description: Returns the number of days between the two dates specified (i.e. dateValue1 and dateValue2).
Example: dateDiff('2014-11-11 13:23:44.000', '2014-11-9 13:23:44.000') returns 2.

Syntax: <int> dateDiff(<long> timestampInMilliseconds1, <long> timestampInMilliseconds2)
Extension Type: Function
Description: Returns the number of days between the two time stamps specified (i.e. timestampInMilliseconds1 and timestampInMilliseconds2).
Example: dateDiff(1415692424000, 1415519624000) returns 2.
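
For the time stamp form, the result in the example follows from simple arithmetic: the two values are 172,800,000 milliseconds apart, which is exactly two days. The sketch below shows that calculation. It assumes that the difference is taken as the first value minus the second and that partial days are truncated, neither of which is stated on this page; the class name is hypothetical.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of dateDiff for two epoch time stamps.
final class DateDiffSketch {
    // Whole days from timestampInMilliseconds2 to timestampInMilliseconds1 (assumed order).
    static int dateDiff(long timestampInMilliseconds1, long timestampInMilliseconds2) {
        long differenceInMillis = timestampInMilliseconds1 - timestampInMilliseconds2;
        return (int) TimeUnit.MILLISECONDS.toDays(differenceInMillis);
    }

    public static void main(String[] args) {
        System.out.println(dateDiff(1415692424000L, 1415519624000L)); // 2
    }
}
```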

dateFormat

The common parameters of this function are described below.

  • dateValue : A value of date. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • dateTargetFormat : The date format to which the specified date value needs to be converted. e.g., yyyy/MM/dd HH:mm:ss
  • dateSourceFormat : The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds : A date value in milliseconds (from the epoch). e.g., 1415712224000L

Syntax: <string> dateFormat(<string> dateValue, <string> dateTargetFormat, <string> dateSourceFormat)
Extension Type: Function
Description: Returns a formatted date string.
Example: dateFormat('2014-11-11 13:23:55', 'ss', 'yyyy-MM-dd HH:mm:ss') returns 55.

Syntax: <string> dateFormat(<string> dateValue, <string> dateTargetFormat)
Extension Type: Function
Description: Returns a formatted date string.
Example: dateFormat('2014-11-11 13:23:55.657', 'ss') returns 55.

Syntax: <string> dateFormat(<long> timestampInMilliseconds, <string> dateTargetFormat)
Extension Type: Function
Description: Returns a formatted date string.
Example: dateFormat(1415692424000, 'yyyy-MM-dd') returns 2014-11-11.
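
When a time stamp in milliseconds is formatted, the resulting string depends on the time zone used for the conversion. The value 1415692424000 corresponds to 2014-11-11 07:53:44 in UTC and to 2014-11-11 13:23:44 at UTC+05:30, which is the time used in the examples on this page. The sketch below makes the zone explicit. Note that the zoneId parameter is hypothetical and exists only to make the output reproducible; the documented function has no such parameter, and which zone it applies is not stated here.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of dateFormat(timestampInMilliseconds, dateTargetFormat) with an explicit zone.
final class DateFormatSketch {
    static String dateFormat(long timestampInMilliseconds, String dateTargetFormat, String zoneId) {
        return Instant.ofEpochMilli(timestampInMilliseconds)
                .atZone(ZoneId.of(zoneId))
                .format(DateTimeFormatter.ofPattern(dateTargetFormat));
    }

    public static void main(String[] args) {
        long timestamp = 1415692424000L;
        System.out.println(dateFormat(timestamp, "yyyy-MM-dd HH:mm:ss", "UTC"));          // 2014-11-11 07:53:44
        System.out.println(dateFormat(timestamp, "yyyy-MM-dd HH:mm:ss", "Asia/Colombo")); // 2014-11-11 13:23:44
        System.out.println(dateFormat(timestamp, "yyyy-MM-dd", "Asia/Colombo"));          // 2014-11-11
    }
}
```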

extract

The common parameters of this function are described below.

  • dateValue : A value of date. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
  • unit : The part of the date format that needs to be extracted. e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
  • dateFormat : The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
  • timestampInMilliseconds : A date value in milliseconds (from the epoch). e.g., 1415712224000L

Syntax: <int> extract(<string> unit, <string> dateValue, <string> dateFormat)
Extension Type: Function
Description: Returns the specified unit extracted from the specified dateValue.
Example: extract('year', '2014-3-11 02:23:44', 'yyyy-MM-dd hh:mm:ss') returns 2014.

Syntax: <int> extract(<string> unit, <string> dateValue)
Extension Type: Function
Description: Returns the specified unit extracted from the specified dateValue.
Example: extract('year', '2014-3-11 02:23:44.234') returns 2014.

Syntax: <int> extract(<long> timestampInMilliseconds, <string> unit)
Extension Type: Function
Description: Returns the specified unit extracted from the specified timestampInMilliseconds.
Example: extract(1394484824000, 'year') returns 2014.
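
Extracting a unit from a time stamp can be sketched with java.time by converting the milliseconds to a zoned date-time and reading one field. As with any conversion from epoch milliseconds, units such as the day and hour depend on the time zone: 1394484824000 is 2014-03-10 20:53:44 in UTC but 2014-03-11 02:23:44 at UTC+05:30. The class, the zoneId parameter, and the subset of units handled are all assumptions made for illustration.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.temporal.ChronoField;
import java.util.Locale;

// Hypothetical sketch of extract(timestampInMilliseconds, unit) with an explicit zone.
final class ExtractSketch {
    static int extract(long timestampInMilliseconds, String unit, String zoneId) {
        ZonedDateTime dateTime = Instant.ofEpochMilli(timestampInMilliseconds).atZone(ZoneId.of(zoneId));
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "YEAR":   return dateTime.get(ChronoField.YEAR);
            case "MONTH":  return dateTime.get(ChronoField.MONTH_OF_YEAR);
            case "DAY":    return dateTime.get(ChronoField.DAY_OF_MONTH);
            case "HOUR":   return dateTime.get(ChronoField.HOUR_OF_DAY);
            case "MINUTE": return dateTime.get(ChronoField.MINUTE_OF_HOUR);
            case "SECOND": return dateTime.get(ChronoField.SECOND_OF_MINUTE);
            default: throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(extract(1394484824000L, "year", "Asia/Colombo")); // 2014
        System.out.println(extract(1394484824000L, "day", "Asia/Colombo"));  // 11
        System.out.println(extract(1394484824000L, "day", "UTC"));           // 10
    }
}
```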

date

Syntax: <string> date(<string> dateValue, <string> dateFormat)
Extension Type: Function
Description: Returns the date component of the dateValue.
Example: date('2014-11-11 13:23:44', 'yyyy-MM-dd HH:mm:ss') returns 2014-11-11.

timestampInMilliseconds

Syntax: <long> timestampInMilliseconds()
Extension Type: Function
Description: Returns the current time stamp in milliseconds.
Example: timestampInMilliseconds() returns 1440160328693.

Syntax: <long> timestampInMilliseconds(<string> dateValue)
Extension Type: Function
Description: Returns the time stamp of the specified dateValue in milliseconds. In order to use this function, the date format of the specified dateValue should be yyyy-MM-dd HH:mm:ss.SSS.
Example: timestampInMilliseconds('2007-11-30 10:30:19.000') returns 1196398819000.

Syntax: <long> timestampInMilliseconds(<string> dateValue, <string> dateFormat)
Extension Type: Function
Description: Returns the time stamp of the specified dateValue in milliseconds. The date format can be specified in the dateFormat parameter.
Example: timestampInMilliseconds('2007-11-30 10:30:19', 'yyyy-MM-dd HH:mm:ss') returns 1196398819000.
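
A date string without a zone does not identify a single instant, so converting it to milliseconds requires choosing a time zone. The value 1196398819000 in the examples is what results when '2007-11-30 10:30:19' is interpreted at UTC+05:30. The sketch below reproduces this with java.time. The class and the zoneId parameter are hypothetical; the documented function takes no zone argument, and the zone it uses is an assumption here.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;

// Hypothetical sketch of timestampInMilliseconds(dateValue, dateFormat) with an explicit zone.
final class TimestampSketch {
    static long timestampInMilliseconds(String dateValue, String dateFormat, String zoneId) {
        return LocalDateTime.parse(dateValue, DateTimeFormatter.ofPattern(dateFormat))
                .atZone(ZoneId.of(zoneId))
                .toInstant()
                .toEpochMilli();
    }

    public static void main(String[] args) {
        // Prints 1196398819000
        System.out.println(timestampInMilliseconds("2007-11-30 10:30:19", "yyyy-MM-dd HH:mm:ss", "Asia/Colombo"));
    }
}
```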

utcTimestamp

Syntax: <string> utcTimestamp()
Extension Type: Function
Description: Returns the system time in UTC, in the yyyy-MM-dd HH:mm:ss date format.
Example: utcTimestamp() returns 2015-08-21 12:16:13.

nlp 

This extension provides Natural Language Processing capabilities to Siddhi. Functions of the NLP extension are as follows.

findNameEntityType

Syntax: <string> findNameEntityType(<string> entityType, <bool> groupSuccessiveMatch, <string> string-variable)
Extension Type: Function
Description:

This function uses the following input parameters.

  • entityType : This is a user-specified string constant. e.g., PERSON, LOCATION, ORGANIZATION, MONEY, PERCENT, DATE or TIME
  • groupSuccessiveMatch : This is a user-specified boolean constant that determines whether successive matches of the specified entityType in the text stream are grouped.
  • streamAttribute : A string or the stream attribute in which the text stream is included.

This function returns the entities in the text. If you set groupSuccessiveMatch to true, the result aggregates successive words of the same entity type.

Example:

findNameEntityType("PERSON",true,text)

In the above example, if the text attribute contains "Bill Gates donates £31million to fight Ebola", the result is Bill Gates. If groupSuccessiveMatch is set to false, two events are generated: Bill and Gates.

findNameEntityTypeViaDictionary

Syntax: <string> findNameEntityTypeViaDictionary(<string> entityType, <string> dictionaryFilePath, <string> string-variable)
Extension Type: Function
Description:

This function uses the following input parameters.

  • entityType : This is a user-specified string constant. e.g., PERSON, LOCATION, ORGANIZATION, MONEY, PERCENT, DATE or TIME
  • dictionaryFilePath : The path to the dictionary in which the function searches for the specified entries. The relevant entries for the entity types should be available in the dictionary as shown in the example below.
  • streamAttribute : A string or the stream attribute in which the text stream is included.

This function returns the entities in the text that match the dictionary entries for the specified entity type.

Example:

findNameEntityTypeViaDictionary("PERSON","dictionary.xml",text)

In the above example, if the text attribute contains "Bill Gates donates £31million to fight Ebola", and the dictionary consists of the above entries (i.e. entries of the example in the Description), the result is "Bill".

findRelationshipByVerb

Syntax: [<string> text, <string> subject, <string> object, <string> verb] findRelationshipByVerb(<string> verb, <string> string-variable)
Extension Type: Function
Description:

findRelationshipByVerb takes in a user-specified string constant as a verb and a text stream, and returns the whole text, subject, object and the verb based on the specified verb. This information can be extracted only if the specified verb exists in the text stream. However, the tense of the verb does not have to match.

The input parameters used are as follows.

  • verb : This is a user-specified string constant.
  • string-variable : A string or the stream attribute which includes the text stream.

Example:

findRelationshipByVerb("say", "Information just reaching us says another Liberian With Ebola Arrested At Lagos Airport") returns the following.

  • The whole text
  • "Information" as the subject
  • "Liberian" as the object
  • "says" as the verb

findRelationshipByRegex

Syntax: [<string> text, <string> subject, <string> object, <string> verb] findRelationshipByRegex(<string> regex, <string> string-variable)
Extension Type: Function
Description: This function returns the whole text, subject, object and verb from the text stream that matches the named nodes of the Semgrex pattern.
Example:

findRelationshipByRegex('{}=verb >/nsubj|agent/ {}=subject >/dobj/ {}=object', "gates foundation donates $50M in support of #Ebola relief") returns the following.

  • The whole text
  • "foundation" as the subject
  • "$" as the object
  • "donates" as the verb

findSemgrexPattern

Syntax: [<string> text, <string> match, <string> object, <string> verb] findSemgrexPattern(<string> regex, <string> string-variable)
Extension Type: Function
Description:

The findSemgrexPattern function returns the whole text, subject, object and verb from the text stream that matches the named nodes of the Semgrex pattern.

This function uses the following input parameters.

  • regex : A user-specified regular expression that matches the Semgrex pattern syntax.
  • string-variable : A string or the stream attribute which includes the text stream.

Example:

findSemgrexPattern('{lemma:die} >/.*subj|num.*/=reln {}=diedsubject', "Sierra Leone doctor dies of Ebola after failed evacuation.")

In this example, the function searches for words with the lemmatization die that are governors on any subject or numeric relation. The dependent is marked as the diedsubject, and the relationship is marked as reln. Thus, the query returns an output stream that has the full match of this expression, i.e. the governing word with lemmatization for die. It also returns the name of the corresponding node for each match it finds.

The following is the list of elements in the output stream.

  • The whole text
  • "dies" as the match
  • "nsubj" as reln
  • "doctor" as diedsubject

findTokensRegexPattern

Syntax: [<string> text, <string> match, <string> group_1, etc.] findTokensRegexPattern(<string> regex, <string> string-variable)
Extension Type: Function
Description:

findTokensRegexPattern returns the whole text, subject, object and verb from the text stream that matches the named nodes of the Semgrex pattern. The return also includes the corresponding node in the Semgrex pattern and the corresponding named relation defined in the regular expression for each word/phrase.

This function uses the following input parameters.

  • regex : A user-specified regular expression that matches the Semgrex pattern syntax.
  • string-variable : A string or the stream attribute which includes the text stream.

Example:

findTokensRegexPattern('([ner:/PERSON|ORGANIZATION|LOCATION/]+) (?:[]* [lemma:donate]) ([ner:MONEY]+)', text) defines three groups:

  • The first group looks for one or more successive words that are entities of type PERSON, ORGANIZATION or LOCATION.
  • The middle group is defined as a non-capturing group.
  • The third group looks for one or more successive entities of type MONEY.

This function returns the following.

  • The whole text
  • "Paul Allen donates $ 9million" as the match
  • "Paul Allen" as group_1
  • "$ 9million" as group_2

pmml 

This extension adds PMML based predictive analytic model compliance to Siddhi. It allows you to make predictions based on a predictive analytic model. Supported functions of PMML extension are as follows.

predict

Syntax: <double|float|long|int|string|boolean> predict(<string> pathToPmmlFile)
Extension Type: Stream Processor
Description:

Processes the input stream attributes according to the defined PMML standard model and outputs the processed results together with the input stream attributes.

This function uses the following input parameter.

  • pathToPmmlFile : The path to the PMML model file.

The function returns the outputs defined in the output fields. The number of outputs can vary.

Example:

predict('<DAS_HOME>/samples/artifacts/0301/decision-tree.pmml')

This model is implemented to detect network intruders. The input event stream is processed by the execution plan, which uses the PMML predictive model to detect whether a particular user is an intruder to the network. The output stream contains the processed query results, which include the predicted responses together with the feature values extracted from the input event stream.

Syntax: <double|float|long|int|string|boolean> predict(<string> pathToPmmlFile, <double|float|long|int|string|boolean> input)
Extension Type: Stream Processor
Description:

Processes the input stream attributes according to the defined PMML standard model and outputs the processed results.

This function uses the following input parameters.

  • pathToPmmlFile : The path to the PMML model file.
  • input : An attribute of the input stream that is sent to the PMML standard model as a value based on which the prediction is made. The predict function does not accept any constant values as input parameters. You can have multiple input parameters according to the input stream definition.

This function returns the processed outputs defined in the query. The number of outputs can vary depending on the query definition.

Example:

predict('<DAS_HOME>/samples/artifacts/0301/decision-tree.pmml', root_shell double, su_attempted double, num_root double, num_file_creations double, num_shells double, num_access_files double, num_outbound_cmds double, is_host_login double, is_guest_login double, count double, srv_count double, serror_rate double, srv_serror_rate double)

This model is implemented to detect network intruders. The input event stream is processed by the execution plan, which uses the PMML predictive model to detect whether a particular user is an intruder to the network. The output stream contains the processed query results, which include the predicted responses.

 

...