Creating the Table Using Carbon Analytics as the Provider
Use the following query to create a table in the Spark environment (if it does not already exist), using data from Carbon Analytics. Carbon Analytics refers to either the built-in H2 database or any external database that is connected to the DAL (Data Access Layer).
CREATE TEMPORARY TABLE plugUsage USING CarbonAnalytics OPTIONS (tableName "plug_usage", schema "house_id INT, household_id INT, plug_id INT, usage FLOAT -sp, composite FACET -i", primaryKeys "household_id, plug_id" );
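Once the table is registered, it can be queried with standard Spark SQL. A minimal sketch (the aggregate shown is illustrative; the column names come from the schema above):

```sql
-- Average usage per plug, grouped by the columns defined in the schema above
SELECT house_id, household_id, plug_id, AVG(usage) AS avg_usage
FROM plugUsage
GROUP BY house_id, household_id, plug_id;
```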
Carbon analytics relation provider options
The options that can be used with the Carbon Analytics relation provider are described below. Specify the options as comma-separated key-value pairs, and give the values within quotation marks.
| Option | Description | Example |
|---|---|---|
| tableName | Name of the table in the DAL. | tableName "plug_usage" |
| streamName | Name of the event stream that corresponds to the table in the DAL (dots in the stream name map to underscores in the table name). | streamName "plug.usage" |
| schema | Schema of the table in the DAL. This is mandatory from DAS 3.1.0 onwards (it was optional in DAS 3.0.1). Schema fields are comma-separated column name and column type pairs, each optionally followed by an indexing option such as -i (index the column) or -sp (treat the column as a score parameter). | schema "house_id INT, household_id INT, plug_id INT, usage FLOAT -sp, composite FACET -i" |
| primaryKeys | Primary keys of the table in the DAL. This is optional; assign primary keys only if you have also provided a schema. | primaryKeys "household_id, plug_id" |
| mergeSchema | A Boolean flag used for schema merging. If this option is set to true, the given schema is merged with the corresponding table schema in the Data Access Layer (if a schema exists). If it is set to false, the given schema overwrites the table schema in the Data Access Layer. | mergeSchema "false" |
| recordStore | The Analytics Record Store in which this table is created. If this option is omitted, the default Analytics Record Store of CarbonAnalytics is used. | recordStore "EVENT_STORE" |
| globalTenantAccess | A Boolean flag. If set to "true", reads return data records from every tenant's table with the given name, and each incoming record is written to the table of the tenant identified by the record's "_tenantId" value. This allows Spark queries to filter and group results by tenant. When creating a table with this option enabled, add a field named "_tenantId" of type INTEGER to the schema so that tenant data can be read and written. | globalTenantAccess "true" |
| incrementalParams | A set of parameters that govern time-based incremental data processing. | |
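The options above can be combined in a single table definition. A hedged sketch, assuming the same "plug_usage" table as before (the temporary table name is illustrative; the "_tenantId" field follows the globalTenantAccess requirement described above):

```sql
CREATE TEMPORARY TABLE plugUsageAllTenants
USING CarbonAnalytics
OPTIONS (
  tableName "plug_usage",
  -- "_tenantId" INTEGER is required when globalTenantAccess is enabled
  schema "_tenantId INTEGER -i, house_id INT, household_id INT, plug_id INT, usage FLOAT -sp",
  recordStore "EVENT_STORE",
  mergeSchema "false",
  globalTenantAccess "true"
);
```

With globalTenantAccess enabled, queries against this table can filter or group by the "_tenantId" column to separate results per tenant.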