Creating the Table Using Carbon Analytics as the Provider
Use the following query to create a table in the Spark environment (if it does not already exist), using data from Carbon Analytics. Carbon Analytics refers to either the built-in H2 database or any external database that is connected to the DAL (Data Access Layer).
CREATE TEMPORARY TABLE plugUsage USING CarbonAnalytics OPTIONS (tableName "plug_usage", schema "house_id INT, household_id INT, plug_id INT, usage FLOAT -sp, composite FACET -i", primaryKeys "household_id, plug_id" );
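Once the table is registered, it can be queried with standard Spark SQL. A minimal sketch (the aggregate shown is illustrative; the column names come from the schema above):

```sql
-- Average usage per plug, grouped by the columns defined in the schema above
SELECT house_id, household_id, plug_id, AVG(usage) AS avg_usage
FROM plugUsage
GROUP BY house_id, household_id, plug_id;
```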
Carbon analytics relation provider options
The options that can be used with the Carbon Analytics relation provider are described below. Specify the options as comma-separated key-value pairs, and give the values within quotation marks.
| Option | Description | Example |
|---|---|---|
| tableName | Name of the table in the DAL. | tableName "plug_usage" |
| streamName | Name of the event stream that corresponds to the table in the DAL (dots in the stream name map to underscores in the table name). | streamName "plug.usage" |
| schema | Schema of the table in the DAL. This is mandatory from DAS 3.1.0 onwards (it was optional in DAS 3.0.1). Schema fields are comma-separated column name and column type pairs, each optionally followed by an indexing option such as -i (index the column) or -sp (treat the column as a score parameter). | schema "house_id INT, household_id INT, plug_id INT, usage FLOAT -sp, composite FACET -i" |
| primaryKeys | Primary keys of the table in the DAL. This is optional; assign primary keys only if you have also provided a schema. | primaryKeys "household_id, plug_id" |
| mergeSchema | A Boolean flag used for schema merging. If this option is set to true, the given schema is merged with the corresponding table schema in the Data Access Layer (if a schema exists). If it is set to false, the given schema overwrites the table schema in the Data Access Layer. | mergeSchema "false" |
| recordStore | The Analytics Record Store in which this table is created. If this option is omitted, the default Analytics Record Store of CarbonAnalytics is used. | recordStore "EVENT_STORE" |
| globalTenantAccess | A Boolean flag. If set to "true", reads return data records from every tenant's table with the given name, and each incoming record is written to the table of the tenant identified by the record's "_tenantId" value. This allows Spark queries to filter and group results by tenant. When creating a table with this option enabled, add a field named "_tenantId" of type INTEGER to the schema so that tenant data can be read and written. | globalTenantAccess "true" |
| incrementalParams | A set of parameters that govern time-based incremental data processing. | |
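The options above can be combined in a single table definition. A hedged sketch, assuming the same "plug_usage" table as before (the temporary table name is illustrative; the "_tenantId" field follows the globalTenantAccess requirement described above):

```sql
CREATE TEMPORARY TABLE plugUsageAllTenants
USING CarbonAnalytics
OPTIONS (
  tableName "plug_usage",
  -- "_tenantId" INTEGER is required when globalTenantAccess is enabled
  schema "_tenantId INTEGER -i, house_id INT, household_id INT, plug_id INT, usage FLOAT -sp",
  recordStore "EVENT_STORE",
  mergeSchema "false",
  globalTenantAccess "true"
);
```

With globalTenantAccess enabled, queries against this table can filter or group by the "_tenantId" column to separate results per tenant.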