...

Indexing-related configurations are made in the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file. Additionally, information relating to shards is maintained in the <DAS_HOME>/repository/conf/analytics/local-shard-allocation-config.conf file. This file stores each shard number along with its state, which can be INIT or NORMAL. INIT is the initial state. It is usually not visible from outside because it changes to NORMAL once the server starts. If the indexing node is running, the state of the shards should be NORMAL and not INIT. The NORMAL state denotes that the indexing node has started indexing, so the indexer node indexes incoming data whenever it is ready to be indexed.

...

Debugging Indexing in a cluster

If there seems to be an issue with indexing, first check the local-shard-allocation-config.conf file to see whether the shards are allocated properly. The number of shards and the number of replicas are specified in the analytics-config.xml file. Do the following in the given order.

  1. In the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file, check whether the shards are properly allocated.

    <!-- The number of index data replicas the system should keep, for H/A, this should be at least 1, e.g. the value 0 means
         there aren't any copies of the data -->
    <indexReplicationFactor>1</indexReplicationFactor>
    <!-- The number of index shards, should be equal or higher to the number of indexing nodes that is going to be working,
         ideal count being 'number of indexing nodes * [CPU cores used for indexing per node]' -->
    <shardCount>6</shardCount>

...

  2. Check the logs that are printed when the cluster is initialized. The log must be similar to the example shown below.

    TID: [-1] [] [2017-05-12 13:17:27,538]  INFO {org.wso2.carbon.analytics.dataservice.core.indexing.IndexNodeCoordinator} -  Indexing Initialized: CLUSTERED {0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 1={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 2={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 3={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 4={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 5={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}} | Current Node Indexing: Yes {org.wso2.carbon.analytics.dataservice.core.indexing.IndexNodeCoordinator}

    Here, 0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000} means that shard 0 is allocated to two nodes whose IDs are c13d3a23-b15a-4b9c-ac8f-a30df2811c98 and bc751d36-d345-4b8f-b133-b77793f04805. The IPs of the two nodes are also mentioned in the log line. For a correctly configured two-node DAS cluster, this log line must contain all six shards (from 0 to 5) and the node IDs to which they are allocated.

    Info

    This log line does not contain both node IDs when you initially set up the cluster and start the first server, because the other node has not yet joined the cluster. The log line is printed with the complete mapping of shards to node IDs only after the second node joins the cluster.

    The two node IDs mentioned in the log line must match the IDs in the my-node-id.dat file of each DAS node. If there are more than two unique IDs in a two-node cluster, the shard allocation may have been affected and may need to be corrected.
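The checks in the steps above can be sketched as a small script. This is only an illustration, not part of DAS: the function names (read_shard_count, parse_shard_allocation, check_allocation) are hypothetical, and the parsing assumes the log line has exactly the shape of the "Indexing Initialized" example shown above. The script reads shardCount from the content of analytics-config.xml, extracts the shard-to-node-ID mapping from the log line, and reports missing shards or an unexpected number of unique node IDs.

```python
import re
import xml.etree.ElementTree as ET

# Matches one shard entry such as "0={<id>=Member [ip]:port this, <id>=Member [ip]:port}".
SHARD_ENTRY = re.compile(r"(\d+)=\{([^{}]*)\}")
# Matches a 36-character node ID (UUID) immediately followed by "=Member".
NODE_ID = re.compile(r"([0-9a-fA-F-]{36})=Member")


def read_shard_count(config_xml):
    """Return the <shardCount> value found in analytics-config.xml content."""
    root = ET.fromstring(config_xml)
    for element in root.iter():
        # Compare local names so the lookup works with or without an XML namespace.
        if element.tag.rsplit("}", 1)[-1] == "shardCount":
            return int((element.text or "").strip())
    raise ValueError("shardCount element not found")


def parse_shard_allocation(log_line):
    """Map each shard number in an 'Indexing Initialized' log line to its node IDs."""
    # Only look at the part after the marker (assumed log shape, see lead-in).
    _, _, tail = log_line.partition("Indexing Initialized:")
    return {
        int(shard): NODE_ID.findall(members)
        for shard, members in SHARD_ENTRY.findall(tail)
    }


def check_allocation(log_line, shard_count, node_count):
    """Return a list of human-readable problems; an empty list means no problem was found."""
    allocation = parse_shard_allocation(log_line)
    problems = []
    missing = sorted(set(range(shard_count)) - set(allocation))
    if missing:
        problems.append("shards missing from log line: %s" % missing)
    unique_ids = {node_id for ids in allocation.values() for node_id in ids}
    if len(unique_ids) != node_count:
        problems.append(
            "expected %d unique node IDs, found %d: %s"
            % (node_count, len(unique_ids), sorted(unique_ids))
        )
    return problems


if __name__ == "__main__":
    # Sample data rebuilt from the example log line above; <config> is a placeholder root element.
    members = (
        "c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, "
        "bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000"
    )
    log_line = (
        "Indexing Initialized: CLUSTERED {"
        + ", ".join(f"{i}={{{members}}}" for i in range(6))
        + "} | Current Node Indexing: Yes"
    )
    shard_count = read_shard_count("<config><shardCount>6</shardCount></config>")
    print(check_allocation(log_line, shard_count, node_count=2))
```

The node IDs reported by such a script would still need to be compared manually with the my-node-id.dat file of each node, because the script only looks at the log line.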

One reason for the log line containing more than two IDs is that the cluster was reconfigured with new DAS packs without clearing the data sources in analytics-datasources.xml. When a new DAS pack is started pointing to the same analytics data sources, a new ID is generated and the new DAS pack is considered a third node that joined the cluster. When reconfiguring the cluster with new DAS packs, first back up the older IDs and restore them in the new packs. If the two-node cluster already contains more than two unique IDs (due to starting a new DAS pack with a new ID), the additional IDs must be removed. The steps are as follows.

...