The following Siddhi extensions can be used to process events in WSO2 CEP/DAS.
Excerpt |
---|
|
Format for functions:
<returnType> functionName(<paramType> p1) (for one return type and one parameter type)
<returnType|returnType> functionName(<paramType|paramType> p1, <paramType|paramType> p2) (for one return type and many parameter types)
[<returnType|returnType> r1, <returnType|returnType> r2] functionName(<paramType|paramType> p1) (for many return types and one parameter type)
- Extension Type: Function | Aggregate Function | Stream Function | Stream Processor | Window
- Description:
- Parameter: p1:
- Return Parameter: r1:
- Example:
|
...
Syntax | <double> abs(<float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the absolute value of p1. This function wraps the java.lang.Math.abs() function. |
---|
Example | abs(3) and abs(-3) both return 3, since the absolute value of both 3 and -3 is 3. |
---|
acos
Syntax | <double> acos(<float|double> p1) |
---|
Extension Type | Function |
---|
Description | If -1 <= p1 <= 1, this function returns the arc-cosine (inverse cosine) of p1. Otherwise, it returns NULL. The return value is in radians. This function wraps the java.lang.Math.acos() function. |
---|
Example | acos(0.5) returns 1.0471975511965979. |
---|
...
Syntax | <double> asin(<float|double> p1) |
---|
Extension Type | Function |
---|
Description | If -1 <= p1 <= 1, this function returns the arc-sine (inverse sine) of p1. Otherwise, it returns NULL. The return value is in radians. This function wraps the java.lang.Math.asin() function. |
---|
Example | asin(0.5) returns 0.5235987755982989. |
---|
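The range check described for acos and asin can be sketched in plain Java as follows. This is only an illustration of the documented contract, not the extension's source: the class and method names are hypothetical, and a boxed `Double` set to `null` stands in for Siddhi's NULL.

```java
// Illustrative sketch only; names are hypothetical, not part of the extension.
final class InverseTrigSketch {

    // Returns the arc-cosine of p1 in radians, or null when p1 is outside [-1, 1].
    static Double acosOrNull(double p1) {
        if (p1 < -1.0 || p1 > 1.0) {
            return null;
        }
        return Math.acos(p1);
    }

    // Returns the arc-sine of p1 in radians, or null when p1 is outside [-1, 1].
    static Double asinOrNull(double p1) {
        if (p1 < -1.0 || p1 > 1.0) {
            return null;
        }
        return Math.asin(p1);
    }

    public static void main(String[] args) {
        System.out.println(acosOrNull(0.5)); // in-range input
        System.out.println(asinOrNull(0.5)); // in-range input
        System.out.println(acosOrNull(1.5)); // out-of-range input, prints null
    }
}
```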
atan
Syntax | <double> atan(<int|long|float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the arc-tangent (inverse tangent) of p1. The return value is in radians. This function wraps the java.lang.Math.atan() function. |
---|
Example | atan(6d) returns 1.4056476493802699. |
---|
...
Syntax | <double> cbrt(<int|long|float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the cube root of p1. This function wraps the java.lang.Math.cbrt() function. |
---|
Example | cbrt(17d) returns 2.5712815906582356. |
---|
e
Syntax | <double> e() |
---|
Extension Type | Function |
---|
Description | Returns the java.lang.Math.E constant, which is the double value closest to e, the base of the natural logarithms. |
---|
Example | e() returns 2.718281828459045. |
---|
...
Syntax | <boolean> isInfinite(<float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns true if p1 is infinitely large in magnitude, and false otherwise. This function wraps the java.lang.Float.isInfinite() and java.lang.Double.isInfinite() functions. |
---|
Example | isInfinite(java.lang.Double.POSITIVE_INFINITY) returns true. |
---|
isNan
Syntax | <boolean> isNan(<float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns true if p1 is a NaN (Not-a-Number) value, and false otherwise. This function wraps the java.lang.Float.isNaN() and java.lang.Double.isNaN() functions. |
---|
Example | isNan(java.lang.Math.log(-12d)) returns true. |
---|
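Because isInfinite and isNan are described as wrappers around the standard Java checks, their behaviour can be reproduced directly with `java.lang.Double`. The sketch below is illustrative: the class name and the two static helpers are hypothetical names chosen to mirror the function names.

```java
// Illustrative sketch; the helper names mirror the documented functions.
final class FloatChecksSketch {

    // True when p1 is positive or negative infinity.
    static boolean isInfinite(double p1) {
        return Double.isInfinite(p1);
    }

    // True when p1 is NaN, for example the logarithm of a negative number.
    static boolean isNan(double p1) {
        return Double.isNaN(p1);
    }

    public static void main(String[] args) {
        System.out.println(isInfinite(Double.POSITIVE_INFINITY)); // true
        System.out.println(isNan(Math.log(-12d)));                // true
        System.out.println(isNan(1.0));                           // false
    }
}
```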
...
Syntax | <double> pi() |
---|
Extension Type | Function |
---|
Description | Returns the java.lang.Math.PI constant, which is the double value closest to pi (i.e. the ratio of the circumference of a circle to its diameter). |
---|
Example | pi() always returns 3.141592653589793. |
---|
power
Syntax | <double> power(<int|long|float|double> value, <int|long|float|double> toPower) |
---|
Extension Type | Function |
---|
Description | Returns value raised to the power of toPower. |
---|
Example | power(5.6d, 3.0d) returns 175.61599999999996. |
---|
...
Syntax | <double> sin(<int|long|float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the sine of p1 (p1 is in radians). This function wraps the java.lang.Math.sin() function. |
---|
Example | sin(6d) returns -0.27941549819892586. |
---|
sinh
Syntax | <double> sinh(<int|long|float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the hyperbolic sine of p1 (p1 is in radians). This function wraps the java.lang.Math.sinh() function. |
---|
Example | sinh(6d) returns 201.71315737027922. |
---|
sqrt
Syntax | <double> sqrt(<int|long|float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the square root of p1. This function wraps the java.lang.Math.sqrt() function. |
---|
Example | sqrt(4d) returns 2.0. |
---|
tan
Syntax | <double> tan(<int|long|float|double> p1) |
---|
Extension Type | Function |
---|
Description | Returns the tangent of p1 (p1 is in radians). This function wraps the java.lang.Math.tan() function. |
---|
Example | tan(6d) returns -0.29100619138474915. |
---|
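Since sin and tan take their argument in radians, an angle held in degrees has to be converted first. A minimal Java sketch of that conversion, using the same java.lang.Math functions the extension is described as wrapping (the class and helper names are hypothetical):

```java
// Illustrative sketch; the helper names are hypothetical.
final class RadiansSketch {

    // Converts degrees to radians before calling Math.sin.
    static double sinOfDegrees(double degrees) {
        return Math.sin(Math.toRadians(degrees));
    }

    // Converts degrees to radians before calling Math.tan.
    static double tanOfDegrees(double degrees) {
        return Math.tan(Math.toRadians(degrees));
    }

    public static void main(String[] args) {
        System.out.println(Math.sin(6d));      // 6 is read as 6 radians
        System.out.println(sinOfDegrees(30d)); // approximately 0.5
        System.out.println(tanOfDegrees(45d)); // approximately 1.0
    }
}
```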
...
Syntax | <string> charAt(<string> str, <int> index) |
---|
Extension Type | Function |
---|
Description | Returns the character at the specified index of str as a string value. |
---|
Example | charAt("WSO2", 1) returns 'S'. |
---|
coalesce
Syntax | <int|long|float|double|string|boolean> coalesce(<int|long|float|double|string|boolean> arg1, <int|long|float|double|string|boolean> arg2, ..., <int|long|float|double|string|boolean> argN) |
---|
Extension Type | Function |
---|
Description | Returns the value of the first of its input parameters that is not null. |
---|
Parameters | This function accepts any number of parameters. The parameters can be of different types. |
---|
Return Type | This is the same as the type of the first input parameter that is not null. |
---|
Examples | coalesce("123", null, "789") returns "123".
coalesce(null, "BBB", "CCC") returns "BBB".
coalesce(null, null, null) returns null.
|
---|
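The first-non-null rule of coalesce can be illustrated with a short Java sketch. It is not the extension's implementation: the class name is hypothetical, and `Object` varargs stand in for Siddhi's mixed parameter types.

```java
// Illustrative sketch of the first-non-null rule; not the extension's code.
final class CoalesceSketch {

    // Returns the first argument that is not null, or null if all are null.
    static Object coalesce(Object... args) {
        for (Object arg : args) {
            if (arg != null) {
                return arg;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(coalesce("123", null, "789")); // prints 123
        System.out.println(coalesce(null, "BBB", "CCC")); // prints BBB
        System.out.println(coalesce(null, null, null));   // prints null
    }
}
```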
...
Syntax | <string> lower(<string> str) |
---|
Extension Type | Function |
---|
Description | Converts the uppercase letters in the str input string to the equivalent lowercase letters. |
---|
Example | lower("WSO2 cep ") returns "wso2 cep ". |
---|
regexp
Syntax | <boolean> regexp(<string> str, <string> regex) |
---|
Extension Type | Function |
---|
Description | Returns true if the given string (i.e. str) matches the given regular expression (i.e. regex). Returns false if the string does not match the regular expression. |
---|
Example | regexp("WSO2 abcdh", "WSO(.*h)") returns true. |
---|
repeat
Syntax | <string> repeat(<string> str, <int> times) |
---|
Extension Type | Function |
---|
Description | Repeats the specified string (i.e. str) the specified number of times (i.e. times). |
---|
Example | repeat("StRing 1", 3) returns "StRing 1StRing 1StRing 1". |
---|
...
Syntax | <string> reverse(<string> str) |
---|
Extension Type | Function |
---|
Description | Returns str with its characters in reverse order. |
---|
Example | reverse("Hello World") returns "dlroW olleH". |
---|
strcmp
Syntax | <int> strcmp(<string> str, <string> compareTo) |
---|
Extension Type | Function |
---|
Description | Compares str with compareTo lexicographically. Returns 0 if the two strings are equal, a negative value if str precedes compareTo, and a positive value if str follows compareTo. |
---|
Examples | strcmp("Hello", 'Hello') returns 0.
strcmp("AbCDefghiJ KLMN", 'Hello') returns -7.
|
---|
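The values in the strcmp examples are the ones that Java's `String.compareTo` produces for the same inputs. The sketch below assumes the function behaves like `compareTo`, which the source does not state explicitly; the class name is hypothetical.

```java
// Illustrative sketch, assuming strcmp behaves like String.compareTo.
final class StrcmpSketch {

    // 0 when equal; otherwise the difference at the first differing
    // character, or the length difference when one string is a prefix.
    static int strcmp(String str, String compareTo) {
        return str.compareTo(compareTo);
    }

    public static void main(String[] args) {
        System.out.println(strcmp("Hello", "Hello"));           // prints 0
        System.out.println(strcmp("AbCDefghiJ KLMN", "Hello")); // prints -7 ('A' - 'H')
    }
}
```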
...
Syntax | <double, double, string> geocode(<string> location) |
---|
Extension Type | Stream Processor |
---|
Description | Transforms a location to its geo-coordinates (latitude and longitude) and formatted address. |
---|
Example | geocode("duplication rd") returns the following data under the latitude, longitude, and formattedAddress attribute names respectively: 6.8995244d, 79.8556202d, "R A De Mel Mawatha, Colombo, Sri Lanka" |
---|
...
regex
This extension provides basic RegEx execution capabilities to Siddhi. Following are the functions of the RegEx extension.
find
Syntax | <bool> find(<string> regex, <string> inputSequence) |
---|
Extension Type | Function |
---|
Description | This method attempts to find the next sub-sequence of the inputSequence that matches the regex pattern. It returns true if such a sub-sequence exists, or returns false otherwise. |
---|
Examples | find("\d\d(.*)WSO2", "21 products are produced by WSO2 currently") returns true.
find("\d\d(.*)WSO2", "21 products are produced currently") returns false.
|
---|
Syntax | <bool> find(<string> regex, <string> inputSequence, <int> startingIndex) |
---|
Extension Type | Function |
---|
Description | This method attempts to find the next sub-sequence of the inputSequence that matches the regex pattern, starting from the given index (i.e. startingIndex). It returns true if such a sub-sequence exists, or returns false otherwise. |
---|
Examples | find("\d\d(.*)WSO2", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 30) returns true.
find("\d\d(.*)WSO2", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 35) returns false.
|
---|
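Both forms of find can be approximated with `java.util.regex`. The following sketch assumes the function behaves like `Matcher.find()` and `Matcher.find(int)`; the class name is hypothetical and the code is not taken from the extension.

```java
import java.util.regex.Pattern;

// Illustrative sketch, assuming find maps onto java.util.regex.Matcher.
final class RegexFindSketch {

    // True if any sub-sequence of inputSequence matches regex.
    static boolean find(String regex, String inputSequence) {
        return Pattern.compile(regex).matcher(inputSequence).find();
    }

    // Same as above, but the search begins at startingIndex.
    static boolean find(String regex, String inputSequence, int startingIndex) {
        return Pattern.compile(regex).matcher(inputSequence).find(startingIndex);
    }

    public static void main(String[] args) {
        String text = "21 products are produced within 10 years by WSO2 currently by WSO2 employees";
        System.out.println(find("\\d\\d(.*)WSO2", text, 30)); // true: "10" lies after index 30
        System.out.println(find("\\d\\d(.*)WSO2", text, 35)); // false: no two-digit run remains
    }
}
```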
group
Syntax | <string> group(<string> regex, <string> inputSequence, <int> groupId) |
---|
Extension Type | Function |
---|
Description | Returns the input sub-sequence captured by the given group during the previous match operation. Returns null if no sub-sequence was found during the previous match operation. For more information about the match operation, see matches. |
---|
Example | group("(\d\d)(.*)(WSO2.*)", "21 products are produced within 10 years by WSO2 currently by WSO2 employees", 3) returns "WSO2 employees". |
---|
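The group example can be reproduced with a capturing group in `java.util.regex`. The sketch below assumes the match is performed with `Matcher.find()`; the source does not say which match operation the extension uses internally, and the class name is hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative sketch; the use of Matcher.find() is an assumption.
final class RegexGroupSketch {

    // Returns the text captured by groupId, or null when nothing matches.
    static String group(String regex, String inputSequence, int groupId) {
        Matcher matcher = Pattern.compile(regex).matcher(inputSequence);
        return matcher.find() ? matcher.group(groupId) : null;
    }

    public static void main(String[] args) {
        String text = "21 products are produced within 10 years by WSO2 currently by WSO2 employees";
        // The greedy (.*) consumes as much as possible, so group 3 starts
        // at the last occurrence of WSO2.
        System.out.println(group("(\\d\\d)(.*)(WSO2.*)", text, 3)); // prints WSO2 employees
    }
}
```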
lookingAt
Syntax | <bool> lookingAt(<string> regex, <string> inputSequence) |
---|
Extension Type | Function |
---|
Description | This method attempts to match the inputSequence against the regex pattern, starting at the beginning. Unlike matches, it does not require the entire inputSequence to match. |
---|
Examples | lookingAt("\d\d(.*)WSO2", "21 products are produced by WSO2 currently in Sri Lanka") returns true.
lookingAt("WSO2(.*)middleware(.*)", "sample test string and WSO2 is situated in trace and its a middleware company") returns false.
|
---|
matches
Syntax | <bool> matches(<string> regex, <string> inputSequence) |
---|
Extension Type | Function |
---|
Description | This method attempts to match the entire inputSequence against the regex pattern. |
---|
Examples | matches("WSO2(.*)middleware(.*)", "WSO2 is situated in trace and its a middleware company") returns true.
matches("WSO2(.*)middleware", "WSO2 is situated in trace and its a middleware company") returns false.
|
---|
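The difference between lookingAt (match from the start, remainder ignored) and matches (whole input must match) can be seen side by side in Java. This sketch assumes the two functions behave like the `Matcher` methods of the same name; the class name is hypothetical.

```java
import java.util.regex.Pattern;

// Illustrative sketch, assuming the functions mirror java.util.regex.Matcher.
final class RegexAnchoringSketch {

    // True if a prefix of inputSequence matches regex.
    static boolean lookingAt(String regex, String inputSequence) {
        return Pattern.compile(regex).matcher(inputSequence).lookingAt();
    }

    // True only if the entire inputSequence matches regex.
    static boolean matches(String regex, String inputSequence) {
        return Pattern.compile(regex).matcher(inputSequence).matches();
    }

    public static void main(String[] args) {
        String text = "WSO2 is situated in trace and its a middleware company";
        System.out.println(lookingAt("WSO2(.*)middleware", text)); // true: prefix matches
        System.out.println(matches("WSO2(.*)middleware", text));   // false: " company" is left over
    }
}
```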
time
This extension provides time related functionality to Siddhi such as getting current time, current date, manipulating/formatting dates, etc. Following are the functions of the time extension.
currentDate
Syntax | <string> currentDate() |
---|
Extension Type | Function |
---|
Description | Returns the current system date in the yyyy-MM-dd format. |
---|
Example | currentDate() returns 2015-08-20. |
---|
currentTime
Syntax | <string> currentTime() |
---|
Extension Type | Function |
---|
Description | Returns the current system time in the HH:mm:ss format. |
---|
Example | currentTime() returns 13:15:10. |
---|
currentTimestamp
Syntax | <string> currentTimestamp() |
---|
Extension Type | Function |
---|
Description | Returns the current system timestamp in the yyyy-MM-dd HH:mm:ss format. |
---|
Example | currentTimestamp() returns 2015-08-20 13:15:10. |
---|
dateAdd
The common parameters of this function are described below.
- dateValue: A date value. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
- expr: The amount by which the selected part of the date should be incremented. e.g., 2, 5, 10
- unit: The part of the date that needs to be manipulated. e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
- dateFormat: The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
- timestampInMilliseconds: The date value in milliseconds (from the epoch). e.g., 1415712224000L
Syntax | <string> dateAdd(<string> dateValue, <long> expr, <string> unit, <string> dateFormat) |
---|
Extension Type | Function |
---|
Description | Returns the specified date and time with the selected unit of the specified dateValue incremented by the given amount (i.e. expr). |
---|
Example | dateAdd("2014-11-11 13:23:44", 2, 'year', "yyyy-MM-dd HH:mm:ss") returns "2016-11-11 13:23:44". |
---|
Syntax | <string> dateAdd(<string> dateValue, <long> expr, <string> unit) |
---|
Extension Type | Function |
---|
Description | Returns the specified date and time with the selected unit of the specified dateValue incremented by the given amount (i.e. expr). |
---|
Example | dateAdd("2014-11-11 13:23:44", 2, 'year') returns "2016-11-11 13:23:44". |
---|
Syntax | <string> dateAdd(<long> timestampInMilliseconds, <long> expr, <string> unit) |
---|
Extension Type | Function |
---|
Description | Returns the specified time stamp with the selected unit of the specified timestampInMilliseconds incremented by the given amount (i.e. expr). |
---|
Example | dateAdd(1415692424000L, 2, 'year') returns "2016-11-11 13:23:44". |
---|
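The four-parameter form of dateAdd can be approximated with `java.time`. The sketch below is a stand-in rather than the extension's implementation: the class name is hypothetical, it only handles values that contain both a date and a time, and passing a negative expr gives the behaviour described for dateSub.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Illustrative sketch using java.time; not the extension's implementation.
final class DateAddSketch {

    // Parses dateValue with dateFormat, shifts the chosen unit by expr,
    // and formats the result with the same pattern.
    static String dateAdd(String dateValue, long expr, String unit, String dateFormat) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(dateFormat);
        LocalDateTime date = LocalDateTime.parse(dateValue, formatter);
        LocalDateTime shifted;
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "YEAR":    shifted = date.plusYears(expr); break;
            case "QUARTER": shifted = date.plusMonths(3 * expr); break;
            case "MONTH":   shifted = date.plusMonths(expr); break;
            case "WEEK":    shifted = date.plusWeeks(expr); break;
            case "DAY":     shifted = date.plusDays(expr); break;
            case "HOUR":    shifted = date.plusHours(expr); break;
            case "MINUTE":  shifted = date.plusMinutes(expr); break;
            case "SECOND":  shifted = date.plusSeconds(expr); break;
            default: throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
        return shifted.format(formatter);
    }

    public static void main(String[] args) {
        String pattern = "yyyy-MM-dd HH:mm:ss";
        System.out.println(dateAdd("2014-11-11 13:23:44", 2, "year", pattern));  // 2016-11-11 13:23:44
        System.out.println(dateAdd("2014-11-11 13:23:44", -2, "year", pattern)); // 2012-11-11 13:23:44
    }
}
```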
dateSub
The common parameters of this function are described below.
- dateValue: A date value. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
- expr: The amount by which the selected part of the date should be reduced. e.g., 2, 5, 10
- unit: The part of the date that needs to be manipulated. e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
- dateFormat: The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
- timestampInMilliseconds: The date value in milliseconds (from the epoch). e.g., 1415712224000L
Syntax | <string> dateSub(<string> dateValue, <long> expr, <string> unit, <string> dateFormat) |
---|
Extension Type | Function |
---|
Description | Returns the specified date and time with the selected unit of the specified dateValue reduced by the given amount (i.e. expr). |
---|
Example | dateSub("2014-11-11 13:23:44", 2, 'year', "yyyy-MM-dd HH:mm:ss") returns "2012-11-11 13:23:44". |
---|
Syntax | <string> dateSub(<string> dateValue, <long> expr, <string> unit) |
---|
Extension Type | Function |
---|
Description | Returns the specified date and time with the selected unit of the specified dateValue reduced by the given amount (i.e. expr). |
---|
Example | dateSub("2014-11-11 13:23:44", 2, 'year') returns "2012-11-11 13:23:44". |
---|
Syntax | <string> dateSub(<long> timestampInMilliseconds, <long> expr, <string> unit) |
---|
Extension Type | Function |
---|
Description | Returns the specified time stamp with the selected unit of the specified timestampInMilliseconds reduced by the given amount (i.e. expr). |
---|
Example | dateSub(1415692424000L, 2, 'year') returns 1352620424000. |
---|
dateDiff
The common parameters of this function are described below.
- dateValue1: A date value. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
- dateValue2: A date value. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
- dateFormat1: The date format of dateValue1. e.g., yyyy-MM-dd HH:mm:ss.SSS
- dateFormat2: The date format of dateValue2. e.g., yyyy-MM-dd HH:mm:ss.SSS
- timestampInMilliseconds1: A date value in milliseconds (from the epoch). e.g., 1415712224000L
- timestampInMilliseconds2: A date value in milliseconds (from the epoch). e.g., 1415712224000L
Syntax | <int> dateDiff(<string> dateValue1, <string> dateValue2, <string> dateFormat1, <string> dateFormat2) |
---|
Extension Type | Function |
---|
Description | Returns the number of days between the two dates specified (i.e. dateValue1 and dateValue2). |
---|
Example | dateDiff('2014-11-11 13:23:44', '2014-11-9 13:23:44', 'yyyy-MM-dd HH:mm:ss', 'yyyy-MM-dd HH:mm:ss') returns 2. |
---|
Syntax | <int> dateDiff(<string> dateValue1, <string> dateValue2) |
---|
Extension Type | Function |
---|
Description | Returns the number of days between the two dates specified (i.e. dateValue1 and dateValue2). |
---|
Example | dateDiff('2014-11-11 13:23:44.000', '2014-11-9 13:23:44.000') returns 2. |
---|
Syntax | <int> dateDiff(<long> timestampInMilliseconds1, <long> timestampInMilliseconds2) |
---|
Extension Type | Function |
---|
Description | Returns the number of days between the two time stamps specified (i.e. timestampInMilliseconds1 and timestampInMilliseconds2). |
---|
Example | dateDiff(1415692424000, 1415519624000) returns 2. |
---|
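For the time stamp form of dateDiff, the arithmetic behind the example is a millisecond difference converted to whole days. A minimal Java sketch of that calculation follows; the class name is hypothetical, and how the extension treats a negative difference is not stated in the source, so the sign handling here is an assumption.

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch of the day-difference arithmetic.
final class DateDiffSketch {

    // Whole days between two epoch-millisecond values (first minus second).
    static int dateDiff(long timestampInMilliseconds1, long timestampInMilliseconds2) {
        long deltaMillis = timestampInMilliseconds1 - timestampInMilliseconds2;
        return (int) TimeUnit.MILLISECONDS.toDays(deltaMillis);
    }

    public static void main(String[] args) {
        // 1415692424000 - 1415519624000 = 172800000 ms = 2 days
        System.out.println(dateDiff(1415692424000L, 1415519624000L)); // prints 2
    }
}
```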
dateFormat
The common parameters of this function are described below.
- dateValue: A date value. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
- dateTargetFormat: The date format to which the specified date value needs to be converted. e.g., yyyy/MM/dd HH:mm:ss
- dateSourceFormat: The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
- timestampInMilliseconds: A date value in milliseconds (from the epoch). e.g., 1415712224000L
Syntax | <string> dateFormat(<string> dateValue, <string> dateTargetFormat, <string> dateSourceFormat) |
---|
Extension Type | Function |
---|
Description | Returns a formatted date string. |
---|
Example | dateFormat('2014-11-11 13:23:55', 'ss', 'yyyy-MM-dd HH:mm:ss') returns 55. |
---|
Syntax | <string> dateFormat(<string> dateValue, <string> dateTargetFormat) |
---|
Extension Type | Function |
---|
Description | Returns a formatted date string. |
---|
Example | dateFormat('2014-11-11 13:23:55.657', 'ss') returns 55. |
---|
Syntax | <string> dateFormat(<long> timestampInMilliseconds, <string> dateTargetFormat) |
---|
Extension Type | Function |
---|
Description | Returns a formatted date string. |
---|
Example | dateFormat(1415692424000, 'yyyy-MM-dd') returns 2014-11-11. |
---|
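The source-to-target conversion performed by the three-parameter dateFormat can be sketched with `java.time`: parse with the source pattern, then print with the target pattern. The class name is hypothetical and this is not the extension's code; it only handles values that contain both a date and a time.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Illustrative sketch of parse-then-reformat; not the extension's code.
final class DateFormatSketch {

    // Reads dateValue using dateSourceFormat and writes it using dateTargetFormat.
    static String dateFormat(String dateValue, String dateTargetFormat, String dateSourceFormat) {
        LocalDateTime parsed =
                LocalDateTime.parse(dateValue, DateTimeFormatter.ofPattern(dateSourceFormat));
        return parsed.format(DateTimeFormatter.ofPattern(dateTargetFormat));
    }

    public static void main(String[] args) {
        String source = "yyyy-MM-dd HH:mm:ss";
        System.out.println(dateFormat("2014-11-11 13:23:55", "ss", source));                  // 55
        System.out.println(dateFormat("2014-11-11 13:23:55", "yyyy/MM/dd HH:mm:ss", source)); // 2014/11/11 13:23:55
    }
}
```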
extract
The common parameters of this function are described below.
- dateValue: A date value. e.g., "2014-11-11 13:23:44.657", "2014-11-11", "13:23:44.657"
- unit: The part of the date that needs to be extracted. e.g., "MINUTE", "HOUR", "MONTH", "YEAR", "QUARTER", "WEEK", "DAY", "SECOND"
- dateFormat: The date format of the date value provided. e.g., yyyy-MM-dd HH:mm:ss.SSS
- timestampInMilliseconds: A date value in milliseconds (from the epoch). e.g., 1415712224000L
Syntax | <int> extract(<string> unit, <string> dateValue, <string> dateFormat) |
---|
Extension Type | Function |
---|
Description | Returns the specified unit extracted from the specified dateValue. |
---|
Example | extract('year', '2014-3-11 02:23:44', 'yyyy-MM-dd hh:mm:ss') returns 2014. |
---|
Syntax | <int> extract(<string> unit, <string> dateValue) |
---|
Extension Type | Function |
---|
Description | Returns the specified unit extracted from the specified dateValue. |
---|
Example | extract('year', '2014-3-11 02:23:44.234') returns 2014. |
---|
Syntax | <int> extract(<long> timestampInMilliseconds, <string> unit) |
---|
Extension Type | Function |
---|
Description | Returns the specified unit extracted from the specified timestampInMilliseconds. |
---|
Example | extract(1394484824000, 'year') returns 2014. |
---|
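A rough Java equivalent of the three-parameter extract is sketched below. It uses `SimpleDateFormat` because that class parses leniently and therefore accepts the single-digit month in the documented example; whether the extension parses the same way is an assumption, the class name is hypothetical, and only a subset of the units is covered.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Locale;

// Illustrative sketch; lenient parsing via SimpleDateFormat is an assumption.
final class ExtractSketch {

    // Parses dateValue with dateFormat and returns the requested calendar field.
    static int extract(String unit, String dateValue, String dateFormat) {
        Calendar calendar = Calendar.getInstance(Locale.ROOT);
        try {
            calendar.setTime(new SimpleDateFormat(dateFormat, Locale.ROOT).parse(dateValue));
        } catch (ParseException e) {
            throw new IllegalArgumentException("Cannot parse: " + dateValue, e);
        }
        switch (unit.toUpperCase(Locale.ROOT)) {
            case "YEAR":   return calendar.get(Calendar.YEAR);
            case "MONTH":  return calendar.get(Calendar.MONTH) + 1; // Calendar months are 0-based
            case "DAY":    return calendar.get(Calendar.DAY_OF_MONTH);
            case "HOUR":   return calendar.get(Calendar.HOUR_OF_DAY);
            case "MINUTE": return calendar.get(Calendar.MINUTE);
            case "SECOND": return calendar.get(Calendar.SECOND);
            default: throw new IllegalArgumentException("Unsupported unit: " + unit);
        }
    }

    public static void main(String[] args) {
        System.out.println(extract("year", "2014-3-11 02:23:44", "yyyy-MM-dd hh:mm:ss")); // prints 2014
    }
}
```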
date
Syntax | <string> date(<string> dateValue, <string> dateFormat) |
---|
Extension Type | Function |
---|
Description | Returns the date component of the dateValue. |
---|
Example | date('2014-11-11 13:23:44', 'yyyy-MM-dd HH:mm:ss') returns 2014-11-11. |
---|
timestampInMilliseconds
Syntax | <long> timestampInMilliseconds() |
---|
Extension Type | Function |
---|
Description | Returns the current time stamp in milliseconds. |
---|
Example | timestampInMilliseconds() returns 1440160328693. |
---|
Syntax | <long> timestampInMilliseconds(<string> dateValue) |
---|
Extension Type | Function |
---|
Description | Returns the time stamp of the specified dateValue in milliseconds. To use this function, the specified dateValue must be in the yyyy-MM-dd HH:mm:ss.SSS format. |
---|
Example | timestampInMilliseconds('2007-11-30 10:30:19.000') returns 1196398819000. |
---|
Syntax | <long> timestampInMilliseconds(<string> dateValue, <string> dateFormat) |
---|
Extension Type | Function |
---|
Description | Returns the time stamp of the specified dateValue in milliseconds. The date format can be specified in the dateFormat parameter. |
---|
Example | timestampInMilliseconds('2007-11-30 10:30:19', 'yyyy-MM-dd HH:mm:ss') returns 1196398819000. |
---|
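Converting a date string to epoch milliseconds depends on the time zone in which the string is interpreted. The value 1196398819000 in the examples above corresponds to reading 2007-11-30 10:30:19 on a UTC+05:30 clock; that offset is inferred from the numbers and is not stated in the source. The sketch below makes the offset an explicit, hypothetical parameter so that the result does not depend on the machine's settings.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Illustrative sketch; the explicit offset parameter is not part of the
// documented function signature.
final class TimestampSketch {

    // Interprets dateValue at the given UTC offset and returns epoch milliseconds.
    static long timestampInMilliseconds(String dateValue, String dateFormat, ZoneOffset offset) {
        LocalDateTime parsed =
                LocalDateTime.parse(dateValue, DateTimeFormatter.ofPattern(dateFormat));
        return parsed.toInstant(offset).toEpochMilli();
    }

    public static void main(String[] args) {
        ZoneOffset plus0530 = ZoneOffset.ofHoursMinutes(5, 30);
        System.out.println(timestampInMilliseconds(
                "2007-11-30 10:30:19", "yyyy-MM-dd HH:mm:ss", plus0530)); // prints 1196398819000
    }
}
```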
utcTimestamp
Syntax | <string> utcTimestamp() |
---|
Extension Type | Function |
---|
Description | Returns the current system time in UTC, in the yyyy-MM-dd HH:mm:ss date format. |
---|
Example | utcTimestamp() returns 2015-08-21 12:16:13. |
---|
nlp
This extension provides Natural Language Processing capabilities to Siddhi. Functions of the NLP extension are as follows.
findNameEntityType
Syntax | <string> findNameEntityType(<string> entityType, <bool> groupSuccessiveMatch, <string> string-variable) |
---|
Extension Type | Function |
---|
Description | This function uses the following input parameters.
- entityType: A user-specified string constant. e.g., PERSON, LOCATION, ORGANIZATION, MONEY, PERCENT, DATE or TIME
- groupSuccessiveMatch: A user-specified boolean constant that specifies whether successive matches of the specified entityType in the text stream are grouped.
- string-variable: A string or the stream attribute in which the text stream is included.
This function returns the entities in the text. If you set groupSuccessiveMatch to true, the result aggregates successive words of the same entity type. |
---|
Example | findNameEntityType("PERSON",true,text)
In the above example, if the text attribute contains "Bill Gates donates £31million to fight Ebola", the result is Bill Gates. If groupSuccessiveMatch is set to false, two events are generated as Bill and Gates. |
---|
findNameEntityTypeViaDictionary
Syntax | <string> findNameEntityTypeViaDictionary(<string> entityType, <string> dictionaryFilePath, <string> string-variable) |
---|
Extension Type | Function |
---|
Description | This function uses the following input parameters.
- entityType: A user-specified string constant. e.g., PERSON, LOCATION, ORGANIZATION, MONEY, PERCENT, DATE or TIME
- dictionaryFilePath: The path to the dictionary in which the function searches for the specified entries. The relevant entries for the entity types should be available in the dictionary as shown in the example below.
- string-variable: A string or the stream attribute in which the text stream is included.
This function returns the entities in the text. If you specify group successive matches as true, the result aggregates successive words of the same entity type. |
---|
Example | findNameEntityTypeViaDictionary("PERSON","dictionary.xml",text)
In the above example, if the text attribute contains "Bill Gates donates £31million to fight Ebola", and the dictionary consists of the above entries (i.e. entries of the example in the Description), the result is "Bill". |
---|
findRelationshipByVerb
Syntax | <string> text, <string> subject, <string> object, <string> verb findRelationshipByVerb(<string> verb, <string> string-variable) |
---|
Extension Type | Function |
---|
Description | findRelationshipByVerb takes in a user-specified string constant as a verb and a text stream, and returns the whole text, subject, object and the verb based on the specified verb. This information can be extracted only if the specified verb exists in the text stream. However, the tense of the verb does not have to match.
The input parameters used are as follows.
- verb: A user-specified string constant.
- string-variable: A string or the stream attribute which includes the text stream.
|
---|
Examples | findRelationshipByVerb("say", "Information just reaching us says another Liberian With Ebola Arrested At Lagos Airport") returns the following.
- The whole text
- Information as the subject
- Liberian as the object
- says as the verb
|
---|
findRelationshipByRegex
Syntax | <string> text, <string> subject, <string> object, <string> verb findRelationshipByRegex(<string> regex, <string> string-variable) |
---|
Extension Type | Function |
---|
Description | This function returns the whole text, subject, object and verb from the text stream that matches the named nodes of the Semgrex pattern. |
---|
Example | findRelationshipByRegex('{}=verb >/nsubj|agent/ {}=subject >/dobj/ {}=object', "gates foundation donates $50M in support of #Ebola relief") returns the following.
- The whole text
- "foundation" as the subject
- "$" as the object
- "donates" as the verb
|
---|
findSemgrexPattern
Syntax | <string> text, <string> match, <string> object, <string> verb findSemgrexPattern(<string> regex, <string> string-variable) |
---|
Extension Type | Function |
---|
Description | The findSemgrexPattern function returns the whole text, subject, object and verb from the text stream that matches the named nodes of the Semgrex pattern. This function uses the following input parameters.
- regex: A user-specified regular expression that matches the Semgrex pattern syntax.
- string-variable: A string or the stream attribute which includes the text stream.
|
---|
Example | findSemgrexPattern('{lemma:die} >/.*subj|num.*/=reln {}=diedsubject', "Sierra Leone doctor dies of Ebola after failed evacuation.")
In this example, the function searches for words with the lemmatization die that are governors on any subject or numeric relation. The dependent is marked as diedsubject, and the relationship is marked as reln. Thus, the query returns an output stream that has the full match of this expression, i.e. the governing word with lemmatization for die. It also returns the name of the corresponding node for each match it finds. The following is the list of elements in the output stream.
- The whole text
- dies as the match
- "nsubj" as reln
- doctor as diedsubject
|
---|
findTokensRegexPattern
Syntax | <string> text, <string> match, <string> group_1, etc. findTokensRegexPattern(<string> regex, <string> string-variable) |
---|
Extension Type | Function |
---|
Description | findTokensRegexPattern returns the whole text, subject, object and verb from the text stream that matches the named nodes of the Semgrex pattern. The return value also includes the corresponding node in the Semgrex pattern and the corresponding named relation defined in the regular expression for each word/phrase.
This function uses the following input parameters.
- regex: A user-specified regular expression that matches the Semgrex pattern syntax.
- string-variable: A string or the stream attribute which includes the text stream.
|
---|
Example | findTokensRegexPattern('([ner:/PERSON|ORGANIZATION|LOCATION/]+) (?:[]* [lemma:donate]) ([ner:MONEY]+)', text) defines three groups:
- The first group looks for words that are entities of either PERSON, ORGANIZATION or LOCATION, with one or more successive words matching the same.
- The middle group is defined as a non-capturing group.
- The third group looks for one or more successive entities of type MONEY.
This function returns the following.
- The whole text
- "Paul Allen donates $ 9million" as the match
- "Paul Allen" as group_1
- "$ 9million" as group_2
|
---|
pmml
This extension adds PMML based predictive analytic model compliance to Siddhi. It allows you to make predictions based on a predictive analytic model. Supported functions of PMML extension are as follows.
predict
Syntax | <double|float|long|int|string|boolean> predict(<string> pathToPmmlFile) |
---|
Extension Type | Stream Processor |
---|
Description | Processes the input stream attributes according to the defined PMML standard model and outputs the processed results together with the input stream attributes. This function uses the following input parameter.
- pathToPmmlFile: The path to the PMML model file.
The function returns the outputs defined in the output fields. The number of outputs can vary. |
---|
Example | predict('<CEP <DAS HOME>/samples/artifacts/0301/decision-tree.pmml')
This model is implemented to detect network intruders. The input event stream is processed by the execution plan that uses the PMML predictive model to detect whether a particular user is an intruder to the network or not. The output stream contains the processed query results that include the predicted responses together with the feature values extracted from the input event stream. |
---|
Syntax | <double|float|long|int|string|boolean> predict(<string> pathToPmmlFile, <double|float|long|int|string|boolean> input) |
---|
Extension Type | Stream Processor |
---|
Description | Processes the input stream attributes according to the defined PMML standard model and outputs the processed results. This function uses the following input parameters.
- pathToPmmlFile: The path to the PMML model file.
- input: An attribute of the input stream that is sent to the PMML standard model as a value based on which the prediction is made. The predict function does not accept any constant values as input parameters. You can have multiple input parameters according to the input stream definition.
This function returns the processed outputs defined in the query. The number of outputs can vary depending on the query definition. |
---|
Example | predict('<CEP <DAS_HOME>/samples/artifacts/0301/decision-tree.pmml', root_shell double, su_attempted double, num_root double, num_file_creations double, num_shells double, num_access_files double, num_outbound_cmds double, is_host_login double, is_guest_login double, count double, srv_count double, serror_rate double, srv_serror_rate double)
This model is implemented to detect network intruders. The input event stream is processed by the execution plan that uses the PMML predictive model to detect whether a particular user is an intruder to the network or not. The output stream contains the processed query results that include the predicted responses. |
---|
...