Facets and score parameters are used to categorize data based on attributes of an event stream in WSO2 DAS.
Facets
A facet is an attribute of indexed records that is used to classify the records by the attribute value. Facet attributes allow you to carry out an extended, faceted search within the defined categories. There is no data type called FACET in event streams. Instead, any STRING type field whose value is a JSON array can be indexed as a facet. Facets are defined when the table schema is created while persisting event streams.
Facets are used in the implementations of the following REST APIs.
Facet usage types
Facets are used in the Analytics REST API and Data Explorer in WSO2 DAS. Different usage types of facets are described below.
Searching by an attribute
This denotes implementing an attribute of a set of records as a facet. For example, in a record that represents a book, you can define the AUTHOR field as a facet when you persist the event stream as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For the attribute that you defined as a facet, you need to send its values as a JSON string array as shown in the example below.
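As a rough illustration of the JSON string array format, the following Python sketch serializes a category path before it is sent as the value of a STRING attribute. The helper name `to_facet_value` is hypothetical and is not part of WSO2 DAS; it only shows the shape of the value.

```python
import json


def to_facet_value(path):
    """Serialize a category path (a list of strings) as a JSON string array.

    Hypothetical helper for illustration only; not part of WSO2 DAS.
    """
    if not all(isinstance(part, str) for part in path):
        raise TypeError("every element of a facet path must be a string")
    return json.dumps(list(path))


# A single-level facet (AUTHOR) and a three-level facet (PUBLISHED_DATE).
author_value = to_facet_value(["C.Dickens"])
date_value = to_facet_value(["1926", "08", "09"])
print(author_value)
print(date_value)
```

Keeping every element a string (for example "08" rather than 8) preserves leading zeros in the category path.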
You can use this facet to retrieve the records of the books which are written by a particular author using the Data Explorer as shown in the example below.
You can perform the above search in the Analytics REST API using the following request. For more information, see Drilling Down Through Categories via REST API. Code Block |
---|
| POST https://localhost:9443/analytics/drillDown
{
"tableName": "BOOK_STORE",
"categories": [
{
"fieldName": "AUTHOR",
"path" : ["C.Dickens"]
}
],
"query" : "timestamp : [1243214324532 TO 4654365223]",
"recordStart" : 0,
"recordCount" : 100
} |
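The request above could be assembled from a script roughly as follows. This is a minimal sketch that only builds the request object and does not send it. The function name `build_drilldown_request` and the `Content-Type` header are assumptions, and authentication and TLS certificate handling, which a real server would most likely require, are left out.

```python
import json
import urllib.request

# Host and port are taken from the example request above.
BASE_URL = "https://localhost:9443/analytics"


def build_drilldown_request(table, field, path, query, start=0, count=100):
    """Build (but do not send) a POST request for the drillDown endpoint.

    Hypothetical helper; the payload keys mirror the example request.
    """
    payload = {
        "tableName": table,
        "categories": [{"fieldName": field, "path": list(path)}],
        "query": query,
        "recordStart": start,
        "recordCount": count,
    }
    return urllib.request.Request(
        BASE_URL + "/drillDown",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},  # assumed header
        method="POST",
    )


request = build_drilldown_request(
    "BOOK_STORE",
    "AUTHOR",
    ["C.Dickens"],
    "timestamp : [1243214324532 TO 4654365223]",
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```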
Extracting sub categories
Another use of facets is to extract the sub categories of a category. Using the relevant API, you can retrieve the immediate sub categories of a given category (which are represented in a JSON array) in the corresponding table. For example, in a record that represents a book, you can define the PUBLISHED_DATE field as a facet when you persist the event stream as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For the attribute that you defined as a facet, send its values as a JSON string array as shown in the example below.
For example, if the above BOOK_STORE table contains the following four records with the corresponding values for the PUBLISHED_DATE attribute, the REST API returns '08' and '04' as the sub categories of ['1926'], and '09' and '10' as the sub categories of ['1926', '08'].
- Record 1 - PUBLISHED_DATE: ['1926', '08', '09']
- Record 2 - PUBLISHED_DATE: ['1926', '04', '02']
- Record 3 - PUBLISHED_DATE: ['1816', '09', '01']
- Record 4 - PUBLISHED_DATE: ['1926', '08', '10']
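The sub category lookup for these four records can be reproduced locally with a few lines of Python. This is only a sketch of the described behavior, not the server implementation, and the function name `immediate_sub_categories` is hypothetical.

```python
def immediate_sub_categories(paths, category_path):
    """Return the distinct values that directly follow category_path.

    Order of first appearance is kept; deeper levels are ignored.
    """
    depth = len(category_path)
    found = []
    for path in paths:
        is_below = len(path) > depth and path[:depth] == category_path
        if is_below and path[depth] not in found:
            found.append(path[depth])
    return found


# PUBLISHED_DATE values of the four records listed above.
published_dates = [
    ["1926", "08", "09"],
    ["1926", "04", "02"],
    ["1816", "09", "01"],
    ["1926", "08", "10"],
]

print(immediate_sub_categories(published_dates, ["1926"]))        # ['08', '04']
print(immediate_sub_categories(published_dates, ["1926", "08"]))  # ['09', '10']
```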
You can retrieve PUBLISHED_DATE as a specific category and its sub categories using the Data Explorer as shown below.
You can perform the above search in the Analytics REST API using the following requests. For more information, see Drilling Down Through Categories via REST API. Code Block |
---|
| POST https://localhost:9443/analytics/facets
{
"tableName" : "BOOK_STORE",
"fieldName" : "PUBLISHED_DATE",
"categoryPath" : ["1926"],
"query" : "timestamp : [1213343534535 TO 465464564644]"
} |
Code Block |
---|
| POST https://localhost:9443/analytics/facets
{
"tableName" : "BOOK_STORE",
"fieldName" : "PUBLISHED_DATE",
"categoryPath" : ["1926","08"],
"query" : "timestamp : [1213343534535 TO 465464564644]"
} |
Performing a drill down search
This denotes a hierarchical implementation of several categories of attributes within one attribute. The values of a set of records that you can use to classify the records can be indexed as facets. The fields that are indexed as facets are used to implement faceted search and drill down. For example, in a record that represents a book, you can define the PUBLISHED_DATE field as a facet when you persist the event stream as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For the attribute that you defined as a facet, you need to send its values as a JSON string array as shown in the example below.
You can use a facet to filter the books by the published date as a specific category and the published year/month/date as its sub categories using the Data Explorer as shown below.
You can perform the above search in the Analytics REST API using the following request. It returns the records that match the drill down search. For more information, see Retrieving Specific Records through a Drill Down Search via REST API.
Code Block |
---|
| POST https://localhost:9443/analytics/drillDown
{
"tableName": "BOOK_STORE",
"categories": [
{
"fieldName": "PUBLISHED_DATE",
"path" : ["1866", "05", "03"]
}
],
"query" : "timestamp : [1243214324532 TO 4654365223]",
"recordStart" : 0,
"recordCount" : 100
} |
Note |
---|
In the above example, PUBLISHED_DATE is a facet whose values are defined in a three-element JSON array. Here, "05" is a sub category of "1866", and "03" is a sub category of "05". This information is useful for performing drill down search operations. If you want to retrieve the records of which the PUBLISHED_DATE starts with "1866", provide only ["1866"] as the value of the facet in the REST API request. Similarly, if you want to retrieve the records of which the PUBLISHED_DATE is "1866/05/ANY_DAY", provide ["1866", "05"] as the value of the facet in the REST API request. |
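The prefix behavior described in the note can be sketched as follows. The records are hypothetical and invented for this illustration, and the function names are not part of WSO2 DAS; the code only mimics the matching rule locally.

```python
def matches_path(record_path, requested_path):
    """True when requested_path is a prefix of record_path."""
    return record_path[:len(requested_path)] == requested_path


def drill_down(records, field, requested_path):
    """Return the records whose facet value starts with requested_path."""
    return [r for r in records if matches_path(r[field], requested_path)]


# Hypothetical records, invented for this illustration only.
books = [
    {"TITLE": "Book A", "PUBLISHED_DATE": ["1866", "05", "03"]},
    {"TITLE": "Book B", "PUBLISHED_DATE": ["1866", "05", "17"]},
    {"TITLE": "Book C", "PUBLISHED_DATE": ["1866", "11", "02"]},
    {"TITLE": "Book D", "PUBLISHED_DATE": ["1870", "05", "03"]},
]

# A shorter path matches more records; the full path matches one exact date.
for path in (["1866"], ["1866", "05"], ["1866", "05", "03"]):
    titles = [b["TITLE"] for b in drill_down(books, "PUBLISHED_DATE", path)]
    print(path, "->", titles)
```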
You can also perform the above search in the Analytics REST API using the following request. It returns the number of records that match the drill down search. For more information, see Retrieving the Number of Records Matching the Drill Down Criteria via REST API.
Code Block |
---|
| POST https://localhost:9443/analytics/drillDownCount
{
"tableName": "BOOK_STORE",
"categories": [
{
"fieldName": "PUBLISHED_DATE",
"path" : ["1866", "05", "03"]
}
],
"query" : "timestamp : [1243214324532 TO 4654365223]",
"recordStart" : 0,
"recordCount" : 100
} |
Searching data within a value range
This denotes using facets to filter data based on a value range of an attribute that is defined as a facet. For example, in a record that represents a book, you can define the PRICE field as a facet when you persist the event stream as shown below.
Tip |
---|
Define the field by which you want to search data as a numeric type (INTEGER, FLOAT, etc.) and as an index column. |
Click the Simulate option of the event stream to simulate sending events to the created event stream.
You can perform the above search in the Analytics REST API using the following request. For more information, see Retrieving the Event Count of Range Facets.
Note |
---|
In this WSO2 DAS version, the Data Explorer does not support performing search operations on range facets. |
Code Block |
---|
| POST https://localhost:9443/analytics/rangecount
{
"tableName": "BOOK_STORE",
"rangeField" : "PRICE",
"ranges" : [
{
"label" : "20USD - 30USD",
"from" : 20,
"to" : 30
},
{
"label" : "30USD - 40USD",
"from" : 30,
"to" : 40
}
],
"query" : "*:*"
} |
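To make the idea of a range count concrete, the sketch below counts hypothetical PRICE values against the two ranges from the request. Whether the server treats the range bounds as inclusive or exclusive is not stated here, so the sketch assumes an inclusive lower bound and an exclusive upper bound; the function name and the sample prices are also assumptions.

```python
def count_by_range(values, ranges):
    """Count how many values fall into each labelled range.

    Assumption: "from" is inclusive and "to" is exclusive.
    """
    counts = {r["label"]: 0 for r in ranges}
    for value in values:
        for r in ranges:
            if r["from"] <= value < r["to"]:
                counts[r["label"]] += 1
    return counts


# Ranges copied from the request above.
ranges = [
    {"label": "20USD - 30USD", "from": 20, "to": 30},
    {"label": "30USD - 40USD", "from": 30, "to": 40},
]

# Hypothetical PRICE values, invented for this illustration only.
prices = [22.5, 30.0, 35.0, 45.0]

print(count_by_range(prices, ranges))
```

With half-open ranges, a price of exactly 30 is counted once, in the second range only.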
Score parameters
Score parameters are used as function parameters of score functions. You can define only INTEGER, DOUBLE, FLOAT or LONG type fields as score parameters. You can define score parameters when you persist event stream definitions along with indices. Score parameters are used in the implementations of the following REST APIs.
Score functions
Score functions are used to override the default score of a record that has facet fields. The default score is 1, and it is used to retrieve the drill down record count and the sub categories of a category. If you override the default value, the score of that record is the result of evaluating the score function.
Retrieving the record score matching a drill down search
For example, in a record that represents a book, you can define PRICE and DISCOUNT as score parameters that can be used in the following score function: 'price - discount'. You need to define these fields as score parameters and index columns when persisting the event stream as shown below.
Click the Simulate option of the event stream to simulate sending events to the created event stream.
For example, consider an event stream with the following two records.
Code Block |
---|
|
record1 (Book1) :
TITLE : Oliver Twist,
AUTHOR : C.Dickens,
PUBLISHED_DATE : ["1866","08","03"],
COUNT : 22,
PRICE : 30.00,
DISCOUNT : 10.00
record2 (Book2) :
TITLE : Great Expectations,
AUTHOR : C.Dickens,
PUBLISHED_DATE : ["1826","09","14"],
COUNT : 22,
PRICE : 50.00,
DISCOUNT : 12.00 |
Score parameters are useful when you want to use the drill down count API to get the sum of the scores of records. If you invoke the API to retrieve the drill down record count without a score function, the score of each record is 1 (i.e., the default value). Therefore, the API returns the number of records. You can define a score function such as "price - discount" as shown in the REST API request below. For more information, see Drilling Down Through Categories via REST API.
Code Block |
---|
|
POST https://localhost:9443/analytics/drillDownCount
{
"tableName": "BOOK_STORE",
"categories": [
{
"fieldName": "AUTHOR",
"path" : ["C.Dickens"]
}
],
"scoreFunction" : "PRICE - DISCOUNT"
} |
Now, the score of each record is the output of the score function. Therefore, the API returns 58 as the sum of the effective prices after applying the discount.
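The arithmetic behind this result can be checked with a small Python sketch that mimics the drill down count locally. The function name `drill_down_score` is hypothetical, and AUTHOR is written as a one-element list here on the assumption that facet values are JSON arrays.

```python
def drill_down_score(records, field, path, score=lambda record: 1):
    """Sum the scores of the records whose facet value starts with path.

    Without a score function every record scores 1, so the result is a count.
    """
    return sum(score(r) for r in records if r[field][:len(path)] == path)


# The two records from the example above (only the relevant fields).
records = [
    {"TITLE": "Oliver Twist", "AUTHOR": ["C.Dickens"], "PRICE": 30.00, "DISCOUNT": 10.00},
    {"TITLE": "Great Expectations", "AUTHOR": ["C.Dickens"], "PRICE": 50.00, "DISCOUNT": 12.00},
]

record_count = drill_down_score(records, "AUTHOR", ["C.Dickens"])
total = drill_down_score(
    records, "AUTHOR", ["C.Dickens"], lambda r: r["PRICE"] - r["DISCOUNT"]
)
print(record_count)  # 2
print(total)         # 58.0, that is (30 - 10) + (50 - 12)
```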
Retrieving the record score based on specific categories
You can use the REST API to retrieve the score of each record (after applying the score function) based on specific categories as shown below. For more information, see Retrieving the Number of Records Matching the Drill Down Criteria via REST API.
Code Block |
---|
|
POST https://localhost:9443/analytics/facets
{
"tableName" : "BOOK_STORE",
"fieldName" : "PUBLISHED_DATE",
"categoryPath" : ["1866", "08"],
"query" : "timestamp : [1213343534535 TO 465464564644]",
"scoreFunction" : "PRICE-DISCOUNT"
} |
The sample output of the above request is as follows. It denotes the following.
- The output of the score function for the records with the PUBLISHED_DATE ["1866", "08", "23"] is 15.
- The output of the score function for the records with the PUBLISHED_DATE ["1866", "08", "12"] is 25.
Code Block |
---|
|
{
"categoryPath" : ["1866", "08"],
"categories" : {"23" : 15, "12" : 25}
} |
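A response of this shape could be produced by grouping scores by the next level of the category path, as in the sketch below. The records are hypothetical and were chosen only so that the result mirrors the sample output. The function name is also hypothetical, and summing the scores per sub category is an assumption based on the drill down count behavior described earlier.

```python
from collections import defaultdict


def score_by_sub_category(records, field, category_path, score):
    """Sum score(record) for each immediate sub category of category_path."""
    depth = len(category_path)
    totals = defaultdict(float)
    for record in records:
        path = record[field]
        if len(path) > depth and path[:depth] == category_path:
            totals[path[depth]] += score(record)
    return dict(totals)


# Hypothetical records, invented so the result mirrors the sample output.
books = [
    {"PUBLISHED_DATE": ["1866", "08", "23"], "PRICE": 20.0, "DISCOUNT": 5.0},
    {"PUBLISHED_DATE": ["1866", "08", "12"], "PRICE": 30.0, "DISCOUNT": 5.0},
    {"PUBLISHED_DATE": ["1866", "09", "01"], "PRICE": 40.0, "DISCOUNT": 0.0},
]

result = score_by_sub_category(
    books, "PUBLISHED_DATE", ["1866", "08"], lambda r: r["PRICE"] - r["DISCOUNT"]
)
print({"categoryPath": ["1866", "08"], "categories": result})
```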