Storing Index Data
Index data in WSO2 DAS is stored in the local file system. All index data is partitioned into units known as shards. These shards can be viewed in the <DAS_HOME>/repository/data/index_data directory, which contains a subdirectory for each available shard.
Configuring shards
Shards that exist in the local file system can be managed by configuring the following parameters in the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file.
Parameter | Description | Default |
---|---|---|
indexReplicationFactor | The number of index data replicas that should be saved in the system. In a high availability deployment, at least one replica should be saved. | 1 |
shardCount | The number of index shards that are allowed to exist in the local file system at a given time. The number specified should be higher than the number of indexing nodes in the DAS cluster. The ideal number can be calculated as follows. | 6 |
shardIndexRecordBatchSize | The amount of index data to be processed by a shard index worker at a given time, expressed in bytes. The minimum value is 1000. | 20971520 |
shardIndexWorkerInterval | The time interval during which a shard index processing worker can be inactive while processing operations are taking place, expressed in milliseconds. This parameter, together with the | 1500 |
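The rules stated in the table (a shard count higher than the number of indexing nodes, a batch size of at least 1000 bytes, and at least one replica for high availability) can be checked with a small script. The following is a minimal sketch rather than a WSO2 tool. It assumes that each parameter appears in the configuration as an XML element named after the parameter. The `<config>` wrapper in the sample and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Default values as listed in the parameter table.
DEFAULTS = {
    "indexReplicationFactor": 1,
    "shardCount": 6,
    "shardIndexRecordBatchSize": 20971520,
    "shardIndexWorkerInterval": 1500,
}


def read_shard_settings(xml_text):
    """Return the shard parameters, falling back to the documented defaults."""
    root = ET.fromstring(xml_text)
    settings = dict(DEFAULTS)
    for name in DEFAULTS:
        # Assumption: the parameter is an element named after the parameter.
        element = root.find(f".//{name}")
        if element is not None and element.text and element.text.strip():
            settings[name] = int(element.text.strip())
    return settings


def check_shard_settings(settings, indexing_nodes, high_availability=False):
    """Return a list of warnings derived from the rules in the table."""
    warnings = []
    if settings["shardCount"] <= indexing_nodes:
        warnings.append("shardCount should be higher than the number of indexing nodes")
    if settings["shardIndexRecordBatchSize"] < 1000:
        warnings.append("shardIndexRecordBatchSize should be at least 1000")
    if high_availability and settings["indexReplicationFactor"] < 1:
        warnings.append("a high availability deployment needs at least one replica")
    return warnings


# Hypothetical sample; the real file may nest these elements differently.
SAMPLE = """
<config>
  <shardCount>6</shardCount>
  <shardIndexRecordBatchSize>20971520</shardIndexRecordBatchSize>
</config>
"""

if __name__ == "__main__":
    settings = read_shard_settings(SAMPLE)
    print(settings)
    print(check_shard_settings(settings, indexing_nodes=3))
```

Parameters that are missing from the input keep their documented defaults, so the check still runs against a partially specified configuration.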
Allocating shards in a clustered deployment
In a WSO2 DAS cluster, the available shards are distributed equally among all the indexing nodes. For example, if the cluster has 3 indexing nodes and 6 shards, each indexing node is assigned two shards. When a new indexing node joins the cluster, the existing shard allocations change so that some of the shards are assigned to the new node.
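The effect of an equal distribution can be illustrated with a short sketch. This is not the allocation algorithm that DAS uses internally. It is a plain round-robin assignment, and it assumes that shards are numbered from 0 and that nodes are identified by hypothetical names.

```python
def allocate_shards(shard_count, nodes):
    """Assign shard IDs to nodes round-robin so that counts differ by at most one."""
    allocation = {node: [] for node in nodes}
    for shard in range(shard_count):
        # Assumption: shard IDs start at 0; node names are illustrative only.
        allocation[nodes[shard % len(nodes)]].append(shard)
    return allocation


if __name__ == "__main__":
    # 6 shards over 3 indexing nodes: two shards per node.
    print(allocate_shards(6, ["node1", "node2", "node3"]))
    # A fourth node joins: some shards move to the new node.
    print(allocate_shards(6, ["node1", "node2", "node3", "node4"]))
```

With four nodes and six shards the split cannot be exactly even, so two nodes hold two shards and two nodes hold one.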
If you do not want a new node to operate as an indexing node, you should disable indexing at the time the node is started, using the following setting.
disableIndexing=true
If you want to stop an existing node from operating as an indexing node, you should restart it with the same setting. As a result, the shard allocation in the indexing cluster changes so that the shards of the departing node are reallocated to the other indexing nodes.
- When you restart an indexing node as a non-indexing node, you should also restart the other indexing servers for them to get the indexing updates of the node that stopped operating as an indexing node.
- If you start an indexing server by mistake, it changes the global configurations. You need to make sure that the shard allocations are correct before proceeding.
When an indexing node is restarted as a non-indexing node, the indexing data stored in it is not automatically removed. You can remove it if required from the <DAS_HOME>/repository/data/indexing_data directory.
Allocating shards manually
Shards can be configured manually in the <DAS_HOME>/repository/conf/analytics/local-shard-allocation-config.conf file.
There are three modes when configuring the local shard allocations of a node.
Mode | Description |
---|---|
NORMAL | The indexing data for a shard is stored in the node to which the shard is assigned. |
INIT | If you restart the server after adding a shard in the [...] For example, if the existing shard allocations are as follows, and you add the line [...]: 1, NORMAL 2, NORMAL |
RESTORE | This mode allows you to copy index data to a local node so that the node can use it. For example, if you copy index data for shard 5, add the line [...]: 1, NORMAL 2, NORMAL |
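The allocation entries quoted in the table take the form of a shard number followed by a mode, such as `1, NORMAL`. The following sketch reads entries of that form and rejects unknown modes. It assumes one entry per line, which the source does not state explicitly. The function name is hypothetical.

```python
VALID_MODES = {"NORMAL", "INIT", "RESTORE"}


def parse_shard_allocations(text):
    """Parse lines of the form '<shard>, <MODE>' into a dict of shard -> mode."""
    allocations = {}
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Assumption: one "<shard>, <MODE>" entry per line.
        shard_text, sep, mode_text = line.partition(",")
        mode = mode_text.strip()
        if not sep or not shard_text.strip().isdigit() or mode not in VALID_MODES:
            raise ValueError(f"line {line_no}: cannot parse {raw!r}")
        allocations[int(shard_text)] = mode
    return allocations


if __name__ == "__main__":
    sample = "1, NORMAL\n2, NORMAL\n5, RESTORE\n"
    print(parse_shard_allocations(sample))
```

Running such a check before a restart can catch a mistyped mode name early.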
Related Links
- Configuring Data Persistence - For an introduction to persisting data (both records and index data) in WSO2 DAS.
- Configuring Indexes - For detailed information about indexing in WSO2 DAS.