...
Indexing-related configurations are set in the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file. Additionally, information relating to shards is maintained in the <DAS_HOME>/repository/conf/analytics/local-shard-allocation-config.conf file. This file stores each shard number along with its state, which can be INIT or NORMAL. INIT is the initial state. This state is usually not visible from outside, because it changes to NORMAL once the server starts. If the indexing node is running, the state of the shards should be NORMAL and not INIT. The NORMAL state denotes that the indexing node has started indexing, so whenever data is ready to be indexed, the indexer node indexes the incoming data.
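As a rough illustration of checking shard states, the following Python sketch lists the shards that are not in the NORMAL state. It assumes each line of local-shard-allocation-config.conf holds a shard number and a state separated by a comma (for example `0, NORMAL`); that layout is an assumption and should be confirmed against the actual file. The function names are hypothetical helpers, not part of DAS.

```python
from typing import Dict, List


def read_shard_states(text: str) -> Dict[int, str]:
    """Parse lines of the ASSUMED form '<shard number>, <STATE>'."""
    states: Dict[int, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # ignore blank lines
        shard, state = (part.strip() for part in line.split(",", 1))
        states[int(shard)] = state.upper()
    return states


def shards_not_normal(states: Dict[int, str]) -> List[int]:
    """Return the shard numbers whose state is anything other than NORMAL."""
    return sorted(shard for shard, state in states.items() if state != "NORMAL")


# Made-up sample content in the assumed layout, not taken from a real node.
sample = "0, NORMAL\n1, NORMAL\n2, INIT\n"
print(shards_not_normal(read_shard_states(sample)))
```

On a running indexing node, the returned list would be expected to be empty, because every shard should have moved from INIT to NORMAL.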
...
Debugging Indexing in a cluster
If there seems to be an issue with indexing, first check the local-shard-allocation-config.conf file to see whether the shards are allocated properly. The number of shards and the number of replicas are specified in analytics-config.xml. Then do the following in the given order.
In the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file, check whether the shards are properly allocated.

```xml
<!-- The number of index data replicas the system should keep, for H/A, this should be at least 1, e.g. the value 0 means there aren't any copies of the data -->
<indexReplicationFactor>1</indexReplicationFactor>
<!-- The number of index shards, should be equal or higher to the number of indexing nodes that is going to be working, ideal count being 'number of indexing nodes * [CPU cores used for indexing per node]' -->
<shardCount>6</shardCount>
```
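The two settings above can also be checked with a short script. The following Python sketch reads indexReplicationFactor and shardCount from the XML and compares them against the guidance in the configuration comments. The wrapper element `<config>` in the sample is a placeholder (the real root element of analytics-config.xml is not shown here), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List


def read_index_settings(xml_text: str) -> Dict[str, int]:
    """Collect the two index settings, wherever they are nested in the document."""
    wanted = {"indexReplicationFactor", "shardCount"}
    settings: Dict[str, int] = {}
    for elem in ET.fromstring(xml_text.strip()).iter():
        name = elem.tag.rsplit("}", 1)[-1]  # drop an XML namespace prefix, if any
        if name in wanted and elem.text and elem.text.strip():
            settings[name] = int(elem.text.strip())
    return settings


def check_index_settings(settings: Dict[str, int], indexing_nodes: int) -> List[str]:
    """Apply the guidance from the configuration comments; return any problems."""
    problems: List[str] = []
    if settings.get("shardCount", 0) < indexing_nodes:
        problems.append("shardCount is lower than the number of indexing nodes")
    if settings.get("indexReplicationFactor", 0) < 1:
        problems.append("indexReplicationFactor is below 1, so there is no replica for H/A")
    return problems


# Placeholder wrapper element; only the two inner elements come from the page.
SAMPLE = """
<config>
    <indexReplicationFactor>1</indexReplicationFactor>
    <shardCount>6</shardCount>
</config>
"""
settings = read_index_settings(SAMPLE)
print(settings, check_index_settings(settings, indexing_nodes=2))
```

For the values shown on this page and a two-node cluster, the list of problems is empty.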
...
Check the logs that are printed when the cluster is initialized. The log must be similar to the example shown below.
```
TID: [-1] [] [2017-05-12 13:17:27,538] INFO {org.wso2.carbon.analytics.dataservice.core.indexing.IndexNodeCoordinator} - Indexing Initialized: CLUSTERED {0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 1={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 2={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 3={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 4={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 5={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}} | Current Node Indexing: Yes {org.wso2.carbon.analytics.dataservice.core.indexing.IndexNodeCoordinator}
```
Here, 0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000} means that shard 0 is allocated to two nodes, whose IDs are c13d3a23-b15a-4b9c-ac8f-a30df2811c98 and bc751d36-d345-4b8f-b133-b77793f04805. The IPs of the two nodes are also shown in the log line. For a correctly configured two-node DAS cluster, this log line must contain all six shards (from 0 to 5) and the node IDs to which they are allocated.
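Reading this mapping by eye is error-prone, so it can help to extract it with a script. The following Python sketch parses the shard-to-node mapping out of an "Indexing Initialized" log line and reports shards that do not have the expected number of nodes. The regular expressions are derived only from the example log line on this page, so other log layouts may need adjustments. The function names are hypothetical, and the sample is a shortened, two-shard version of the example.

```python
import re
from typing import Dict, List

# "<shard>={...}" groups; the inner part contains no nested braces.
SHARD_RE = re.compile(r"(\d+)=\{([^{}]*)\}")
# "<node id>=Member [<ip>]:<port>" entries inside one shard group.
NODE_RE = re.compile(
    r"([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})=Member \[([^\]]+)\]:(\d+)"
)


def parse_shard_allocation(log_line: str) -> Dict[int, Dict[str, str]]:
    """Return {shard number: {node ID: IP address}} for one log line."""
    allocation: Dict[int, Dict[str, str]] = {}
    for shard, body in SHARD_RE.findall(log_line):
        allocation[int(shard)] = {
            node_id: ip for node_id, ip, _port in NODE_RE.findall(body)
        }
    return allocation


def find_allocation_problems(
    allocation: Dict[int, Dict[str, str]], shard_count: int, nodes_per_shard: int
) -> List[str]:
    """List shards that are missing or not allocated to the expected number of nodes."""
    problems: List[str] = []
    for shard in range(shard_count):
        found = len(allocation.get(shard, {}))
        if found != nodes_per_shard:
            problems.append(f"shard {shard}: {found} node(s), expected {nodes_per_shard}")
    return problems


# Shortened sample (two shards only) based on the example log line.
SAMPLE_LOG = (
    "Indexing Initialized: CLUSTERED {"
    "0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, "
    "bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, "
    "1={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, "
    "bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}}"
)
allocation = parse_shard_allocation(SAMPLE_LOG)
print(find_allocation_problems(allocation, shard_count=2, nodes_per_shard=2))
```

For the full log line of the two-node example, the call would use shard_count=6 and nodes_per_shard=2. Two nodes per shard matches what the example log shows, but it is an assumption about the setup being checked.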
Info: This log line does not contain both node IDs when you initially set up the cluster and start the first server, because the other node has not yet joined the cluster. The log line is printed with the complete mapping of shards to node IDs only after the second node joins the cluster.
The two node IDs mentioned in the log line must match the IDs in the my-node-id.dat file of both DAS nodes. If there are more than two unique IDs in a two-node cluster, the shard allocation may have been affected and may need to be corrected.
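The comparison described above can be sketched in a few lines of Python. The helper below takes the node IDs seen in the log line and the IDs copied by hand from each node's my-node-id.dat, and reports any mismatch. It deliberately does not read my-node-id.dat itself, because the file's internal layout is not described here. The function name is hypothetical, and the third ID in the sample is made up for illustration.

```python
from typing import Dict, Iterable, List


def compare_node_ids(
    ids_in_log: Iterable[str], ids_from_nodes: Iterable[str]
) -> Dict[str, List[str]]:
    """Compare IDs from the log line with the IDs recorded on the cluster nodes."""
    log_ids = set(ids_in_log)
    node_ids = set(ids_from_nodes)
    return {
        # IDs in the shard mapping that belong to no current node.
        "unexpected": sorted(log_ids - node_ids),
        # IDs of current nodes that do not appear in the shard mapping.
        "missing": sorted(node_ids - log_ids),
    }


ids_from_nodes = [
    "c13d3a23-b15a-4b9c-ac8f-a30df2811c98",
    "bc751d36-d345-4b8f-b133-b77793f04805",
]
# The last ID is a made-up example of a stale ID left behind by an older pack.
ids_in_log = ids_from_nodes + ["11111111-2222-3333-4444-555555555555"]
print(compare_node_ids(ids_in_log, ids_from_nodes))
```

A non-empty "unexpected" list corresponds to the case of more than two unique IDs in a two-node cluster.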
One reason for the log containing more than two IDs is reconfiguring the cluster with new DAS packs without clearing the data sources defined in analytics-datasources.xml. When a new DAS pack is started pointing to the same analytics data sources, a new ID is generated, and the new DAS pack is treated as a third node that has joined the cluster. When reconfiguring the cluster with new DAS packs, first back up the older IDs and restore them in the new packs. If the two-node cluster already contains more than two unique IDs (due to starting a new DAS pack with a new ID), the unnecessary additional IDs must be removed. The steps are as follows.
...