Facets and score parameters are used to categorize data based on attributes of an event stream in WSO2 DAS.
Facets
A facet is an attribute of indexed records that is used to classify the records by the attribute value. Facet attributes allow you to carry out a faceted search within the defined categories. There is no data type called FACET in event streams. Instead, any STRING type field whose value can be defined as a JSON array can be indexed as a facet. Facets are defined when the table schema is created while persisting the event stream.
Facets are used in the implementations of the following REST APIs.
- Drilling Down Through Categories via REST API
- Retrieving Specific Records through a Drill Down Search via REST API
- Retrieving the Number of Records Matching the Drill Down Criteria via REST API
- Retrieving the Event Count of Range Facets
Facet usage types
Facets are used in the Analytics REST API and Data Explorer in WSO2 DAS. Different usage types of facets are described below.
Searching by an attribute
This denotes implementing an attribute of a set of records as a facet. For example, in a record that represents a book, you can define the AUTHOR field as a facet when you persist the event stream, as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For the attribute that you defined as a facet, send its values as a JSON string array, as shown in the example below.
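As a rough illustration of what "a JSON string array" means here, the following Python sketch serialises the facet value before placing it in a STRING field of an event. The event layout and the variable names are assumptions made for this example only; they are not the format of the DAS event simulator.

```python
import json

# The facet value is a list of strings, serialised to a JSON array.
# The resulting value is still a plain string, which matches the
# statement that facets are STRING fields holding a JSON array.
author_facet = json.dumps(["C.Dickens"])

# Hypothetical event for the book example; only AUTHOR is a facet.
event = {
    "TITLE": "Oliver Twist",
    "AUTHOR": author_facet,
}

print(event["AUTHOR"])
```

The point of the sketch is only that the field carries text such as `["C.Dickens"]`, not a native list.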
You can use this facet to retrieve the records of the books which are written by a particular author using the Data Explorer as shown in the example below.
You can perform the above search in the Analytics REST API using the following request. For more information, see Drilling Down Through Categories via REST API.
POST https://localhost:9443/analytics/drillDown
{
  "tableName": "BOOK_STORE",
  "categories": [
    { "fieldName": "AUTHOR", "path": ["C.Dickens"] }
  ],
  "query": "timestamp : [1243214324532 TO 4654365223]",
  "recordStart": 0,
  "recordCount": 100
}
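The request body above can also be assembled programmatically. The following is a minimal Python sketch that only builds the JSON payload and does not contact a server. The helper name `build_drilldown_request` and its default arguments are hypothetical, and the `*:*` query is borrowed from the range count example on this page.

```python
import json


def build_drilldown_request(table_name, field_name, path, query="*:*",
                            record_start=0, record_count=100):
    """Return a drillDown request body as a JSON string (hypothetical helper)."""
    body = {
        "tableName": table_name,
        # One category entry per facet field; "path" is the facet value path.
        "categories": [{"fieldName": field_name, "path": list(path)}],
        "query": query,
        "recordStart": record_start,
        "recordCount": record_count,
    }
    return json.dumps(body, indent=2)


payload = build_drilldown_request("BOOK_STORE", "AUTHOR", ["C.Dickens"])
print(payload)
```

The resulting string could then be sent as the body of the POST request with any HTTP client.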
Extracting sub categories
Another use of facets is to extract the sub categories of a category. A category is represented as a JSON array, and the relevant API returns the immediate sub categories of the given category in the corresponding table.
For example, in a record that represents a book, you can define the PUBLISHED_DATE field as a facet when you persist the event stream, as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For the attribute that you defined as a facet, send its values as a JSON string array, as shown in the example below.
For example, if the above BOOK_STORE table contains the following four records with the corresponding values for PUBLISHED_DATE, the REST API returns "08" and "04" as the sub categories of ["1926"], and "09" and "10" as the sub categories of ["1926", "08"].
- Record 1 - PUBLISHED_DATE: ["1926", "08", "09"]
- Record 2 - PUBLISHED_DATE: ["1926", "04", "02"]
- Record 3 - PUBLISHED_DATE: ["1816", "09", "01"]
- Record 4 - PUBLISHED_DATE: ["1926", "08", "10"]
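The sub category lookup for the four records above can be reproduced with a small in-memory model. This Python sketch is not how DAS implements the lookup; it only mirrors the described behaviour, and the function name `immediate_subcategories` is hypothetical. Each matching record contributes 1, which is assumed here to correspond to the default score.

```python
from collections import Counter

# PUBLISHED_DATE values of the four example records.
RECORDS = [
    ["1926", "08", "09"],
    ["1926", "04", "02"],
    ["1816", "09", "01"],
    ["1926", "08", "10"],
]


def immediate_subcategories(paths, category_path):
    """Count the next path element of every record under category_path."""
    depth = len(category_path)
    counts = Counter()
    for path in paths:
        # A record belongs to the category if its path starts with it
        # and has at least one more element below it.
        if len(path) > depth and path[:depth] == category_path:
            counts[path[depth]] += 1
    return dict(counts)


print(immediate_subcategories(RECORDS, ["1926"]))        # {'08': 2, '04': 1}
print(immediate_subcategories(RECORDS, ["1926", "08"]))  # {'09': 1, '10': 1}
```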
You can retrieve PUBLISHED_DATE as a specific category and its sub categories using the Data Explorer as shown below.
You can perform the above search in the Analytics REST API using the following requests. For more information, see Drilling Down Through Categories via REST API.
POST https://localhost:9443/analytics/facets
{
  "tableName": "BOOK_STORE",
  "fieldName": "PUBLISHED_DATE",
  "categoryPath": ["1926"],
  "query": "timestamp : [1213343534535 TO 465464564644]"
}
POST https://localhost:9443/analytics/facets
{
  "tableName": "BOOK_STORE",
  "fieldName": "PUBLISHED_DATE",
  "categoryPath": ["1926", "08"],
  "query": "timestamp : [1213343534535 TO 465464564644]"
}
Performing a drill down search
This denotes a hierarchical arrangement of several categories within one attribute. Any field whose values can be used to classify a set of records can be indexed as a facet. The fields that are indexed as facets are then used to implement faceted search and drill down.
For example, in a record that represents a book, you can define the PUBLISHED_DATE field as a facet when you persist the event stream, as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For the attribute that you defined as a facet, send its values as a JSON string array, as shown in the example below.
You can use a facet to filter the books by the published date as a specific category and published year/month/date as its sub categories using the Data Explorer as shown below.
You can perform the above search in the Analytics REST API using the following request. For more information, see Drilling Down Through Categories via REST API.
POST https://localhost:9443/analytics/drillDown
{
  "tableName": "BOOK_STORE",
  "categories": [
    { "fieldName": "PUBLISHED_DATE", "path": ["1866", "05", "03"] }
  ],
  "query": "timestamp : [1243214324532 TO 4654365223]",
  "recordStart": 0,
  "recordCount": 100
}
In the above example, PUBLISHED_DATE is a facet whose values are defined in a three-element JSON array. Here, "05" is a sub-category of "1866", and "03" is a sub-category of "05". This hierarchy is what makes drill down search operations possible. If you want to retrieve the records whose PUBLISHED_DATE starts with "1866", provide only ["1866"] as the value of the facet in the REST API request. Similarly, if you want to retrieve the records whose PUBLISHED_DATE is "1866/05/ANY_DAY", provide ["1866", "05"] as the value of the facet in the REST API request.
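The effect of shortening the path can be pictured as prefix matching on the facet value. The sketch below is a plain Python model of that idea, not the DAS implementation. The book titles and dates in `BOOKS` are made up for illustration, and `drill_down` is a hypothetical helper.

```python
# Hypothetical records: (title, PUBLISHED_DATE facet value).
BOOKS = [
    ("Book A", ["1866", "05", "03"]),
    ("Book B", ["1866", "05", "17"]),
    ("Book C", ["1866", "11", "02"]),
    ("Book D", ["1926", "08", "09"]),
]


def drill_down(books, path):
    """Return the titles whose facet value starts with the given path."""
    return [title for title, date in books if date[:len(path)] == path]


print(drill_down(BOOKS, ["1866"]))              # every book from 1866
print(drill_down(BOOKS, ["1866", "05"]))        # any day in 1866/05
print(drill_down(BOOKS, ["1866", "05", "03"]))  # one exact date
```

A shorter path matches a broader set of records, and each added element narrows the result by one level of the hierarchy.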
Searching data within a value range
This denotes using facets to filter data based on a value range of an attribute which is defined as a facet.
For example, in a record that represents a book, you can define the PRICE field as a facet when you persist the event stream, as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
You can perform the above search in the Analytics REST API using the following request. For more information, see Drilling Down Through Categories via REST API.
POST https://localhost:9443/analytics/rangecount
{
  "tableName": "BOOK_STORE",
  "rangeField": "PRICE",
  "ranges": [
    { "label": "20USD - 30USD", "from": 20, "to": 30 },
    { "label": "30USD - 40USD", "from": 30, "to": 40 }
  ],
  "query": "*:*"
}
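To make the idea of a range count concrete, the following Python sketch counts a few made-up prices against the two ranges from the request above. The prices are hypothetical, and treating `from` as inclusive and `to` as exclusive is an assumption of this sketch; this page does not state how DAS handles values that fall exactly on a boundary.

```python
RANGES = [
    {"label": "20USD - 30USD", "from": 20, "to": 30},
    {"label": "30USD - 40USD", "from": 30, "to": 40},
]

# Hypothetical PRICE values of four records.
PRICES = [22.5, 30.0, 35.0, 45.0]


def count_ranges(values, ranges):
    """Count the values per range; 'from' inclusive, 'to' exclusive (assumed)."""
    return {
        r["label"]: sum(1 for v in values if r["from"] <= v < r["to"])
        for r in ranges
    }


print(count_ranges(PRICES, RANGES))
# {'20USD - 30USD': 1, '30USD - 40USD': 2}
```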
Score parameters
Score parameters are used as function parameters of score functions. You can define only INTEGER, DOUBLE, FLOAT or LONG type fields as score parameters. You can define score parameters when you persist event stream definitions along with indices. Score parameters are used in the implementations of the following REST APIs.
- Drilling Down Through Categories via REST API
- Retrieving the Number of Records Matching the Drill Down Criteria via REST API
Score functions
Score functions are used to override the default score of a record that has facet fields. The default score is 1, and it is used to retrieve the drill down record count and the sub categories of a category. If you override the default value, the score of the record is the result of evaluating the score function.
For example, in a record that represents a book, you can define PRICE and DISCOUNT as score parameters that can be used in a score function such as "price - discount". You need to define these fields as score parameters and index columns when persisting the event stream, as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For an example, consider an event stream with the following two records.
record1 (Book1) : TITLE : Oliver Twist, AUTHOR : C.Dickens, PUBLISHED_DATE : ["1866","08","03"], COUNT : 22, PRICE : 30.00, DISCOUNT : 10.00
record2 (Book2) : TITLE : Great Expectations, AUTHOR : C.Dickens, PUBLISHED_DATE : ["1826","09","14"], COUNT : 22, PRICE : 50.00, DISCOUNT : 12.00
Score parameters are useful when you want to use the drill down count API to get the sum of the scores of records. If you invoke the API to retrieve the drill down record count without a score function, the score of each record is 1 (i.e., the default value), so the API returns the number of records. You can define a score function as "price - discount" as shown in the REST API request below.
POST https://localhost:9443/analytics/drillDownCount
{
  "tableName": "BOOK_STORE",
  "categories": [
    { "fieldName": "AUTHOR", "path": ["C.Dickens"] }
  ],
  "scoreFunction": "PRICE - DISCOUNT"
}
Now, the score of each record is the output of the score function. Therefore, the API returns 58 as the sum of the effective prices after applying the discount.
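The figure of 58 follows from the two example records: (30.00 - 10.00) + (50.00 - 12.00) = 58. The Python sketch below reproduces that arithmetic with an in-memory model. It is not the DAS scoring engine, and the helper `drilldown_score` and the record layout are assumptions made for illustration.

```python
# The two example records, reduced to the fields used here.
RECORDS = [
    {"TITLE": "Oliver Twist", "AUTHOR": ["C.Dickens"],
     "PRICE": 30.00, "DISCOUNT": 10.00},
    {"TITLE": "Great Expectations", "AUTHOR": ["C.Dickens"],
     "PRICE": 50.00, "DISCOUNT": 12.00},
]


def drilldown_score(records, field, path, score_function=None):
    """Sum the scores of the records whose facet value starts with path."""
    total = 0.0
    for record in records:
        if record[field][:len(path)] == path:
            # Without a score function every record scores the default of 1.
            total += score_function(record) if score_function else 1.0
    return total


# Default score: the result equals the number of matching records.
print(drilldown_score(RECORDS, "AUTHOR", ["C.Dickens"]))  # 2.0

# Score function PRICE - DISCOUNT: sum of the discounted prices.
print(drilldown_score(RECORDS, "AUTHOR", ["C.Dickens"],
                      lambda r: r["PRICE"] - r["DISCOUNT"]))  # 58.0
```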
POST https://localhost:9443/analytics/facets
{
  "tableName": "BOOK_STORE",
  "fieldName": "PUBLISHED_DATE",
  "categoryPath": ["1866", "08"],
  "query": "timestamp : [1213343534535 TO 465464564644]",
  "scoreFunction": "PRICE-DISCOUNT"
}
{
  "categoryPath": ["1866", "08"],
  "categories": { "23": 15, "12": 25, "24": 42 }
}