
In the <DAS_HOME>/repository/conf/analytics/analytics-config.xml file check whether the shards are properly allocated.

Code Block

language	xml

<!-- The number of index data replicas the system should keep, for H/A, this should be at least 1, e.g. the value 0 means
        there aren't any copies of the data -->
   <indexReplicationFactor>1</indexReplicationFactor>
   <!-- The number of index shards, should be equal or higher to the number of indexing nodes that is going to be working,
        ideal count being 'number of indexing nodes * [CPU cores used for indexing per node]' -->
   <shardCount>6</shardCount>
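As a quick way to check these two values without opening the file by hand, the following is a minimal sketch that reads them from the XML text. It assumes both elements appear exactly once in the file; the `read_index_settings` helper and the trimmed sample document (including its root element name) are hypothetical and only stand in for the real analytics-config.xml.

```python
import xml.etree.ElementTree as ET


def read_index_settings(xml_text):
    """Return (shardCount, indexReplicationFactor) found anywhere in the XML."""
    root = ET.fromstring(xml_text.strip())
    values = {}
    for elem in root.iter():
        # Drop an XML namespace prefix such as '{uri}shardCount', if present.
        name = elem.tag.rsplit('}', 1)[-1]
        if name in ('shardCount', 'indexReplicationFactor'):
            values[name] = int(elem.text.strip())
    return values['shardCount'], values['indexReplicationFactor']


# Hypothetical, trimmed-down stand-in for analytics-config.xml.
SAMPLE = """
<analytics-dataservice-configuration>
   <indexReplicationFactor>1</indexReplicationFactor>
   <shardCount>6</shardCount>
</analytics-dataservice-configuration>
"""

shards, replicas = read_index_settings(SAMPLE)
print("shards:", shards, "replicas:", replicas)
```

To run this against a real node, pass in the text of the file read from <DAS_HOME>/repository/conf/analytics/analytics-config.xml instead of the sample.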

Check the logs that are printed when the cluster is initialized. The log must be similar to the example shown below.

Code Block

language	powershell

TID: [-1] [] [2017-05-12 13:17:27,538]  INFO {org.wso2.carbon.analytics.dataservice.core.indexing.IndexNodeCoordinator} -  Indexing Initialized: CLUSTERED {0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 1={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 2={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 3={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 4={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, 5={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}} | Current Node Indexing: Yes {org.wso2.carbon.analytics.dataservice.core.indexing.IndexNodeCoordinator}

Here, 0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000} means that shard 0 is allocated to two nodes whose IDs are c13d3a23-b15a-4b9c-ac8f-a30df2811c98 and bc751d36-d345-4b8f-b133-b77793f04805. The IPs of the two nodes are also mentioned in the log line. For a correctly configured two-node DAS cluster, this log line must contain all six shards (from 0 to 5) and the node IDs to which they are allocated.
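The log line is long, so it can help to parse the shard-to-node mapping instead of reading it by eye. The following is a minimal sketch that assumes the log format is exactly as in the example above. The helper names are hypothetical, and the sample is a shortened two-shard version of that log line.

```python
import re

# Each shard entry looks like "<shard>={<node-id>=Member [<ip>]:<port>, ...}".
SHARD_RE = re.compile(r'(\d+)=\{([^{}]*)\}')
NODE_RE = re.compile(
    r'([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})=Member \[([^\]]+)\]:(\d+)'
)


def parse_shard_allocation(log_line):
    """Map each shard number to the list of (node_id, ip) pairs that hold it."""
    allocation = {}
    for shard, body in SHARD_RE.findall(log_line):
        allocation[int(shard)] = [
            (node_id, ip) for node_id, ip, _port in NODE_RE.findall(body)
        ]
    return allocation


def unexpected_node_ids(allocation, known_ids):
    """Return node IDs in the log that are not among the known node IDs."""
    seen = {node_id for nodes in allocation.values() for node_id, _ip in nodes}
    return seen - set(known_ids)


# Shortened sample (two shards only) based on the log line shown above.
SAMPLE = (
    "Indexing Initialized: CLUSTERED {"
    "0={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, "
    "bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}, "
    "1={c13d3a23-b15a-4b9c-ac8f-a30df2811c98=Member [10.3.24.70]:4000 this, "
    "bc751d36-d345-4b8f-b133-b77793f04805=Member [10.3.24.67]:4000}"
    "} | Current Node Indexing: Yes"
)

allocation = parse_shard_allocation(SAMPLE)
for shard, nodes in sorted(allocation.items()):
    print(shard, nodes)
```

Passing the IDs stored in the my-node-id.dat files of the nodes as `known_ids` to `unexpected_node_ids` lists any additional IDs that appear in the log line.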

Info
This log line does not contain both node IDs when you initially set up the cluster and start the first server, because the other node has not yet joined the cluster. The complete mapping of shards to node IDs is printed only after the second node joins the cluster.

The two node IDs mentioned in the log line must match the IDs mentioned in the my-node-id.dat of both DAS nodes. If there are more than two unique IDs in a two-node cluster, the shard allocation may have been affected and may need to be corrected.

One reason for a cluster to have more node IDs than nodes is that it was reconfigured with new DAS packs without clearing the data sources configured in the analytics-datasources.xml file. When a new DAS pack that points to the same analytics data sources is started, a new ID is generated and the new DAS pack is considered the third node to join the cluster. When you reconfigure the cluster with new DAS packs, make sure the older IDs are first backed up and restored in the new packs. For more information, see Backing up and Restoring Analytics Data. If the two-node cluster already has more than two unique IDs (a result of starting a new DAS pack with a new ID), remove the additional IDs by following the steps given below.
1. Keep a backup of the current my-node-id.dat of both nodes.
2. Extract the node IDs from the log line described above (it may contain more than two IDs), and separate the IDs that do not match the my-node-id.dat of either of the two nodes.
3. Shut down both nodes.
4. Replace the content of the my-node-id.dat of either DAS node with one of the non-matching (additional) node IDs.
  Info
  Make sure that you keep a backup of the my-node-id.dat before doing this.
5. Start the DAS node mentioned in step 4 with the -DdisableIndexing=true property. This property removes the node ID that you entered in the my-node-id.dat from the indexing cluster.
6. Repeat steps 3, 4, and 5 for all the additional node IDs.
7. Restore the two original node IDs from the my-node-id.dat files that you backed up in step 1.
8. Clear the information in the <DAS_HOME>/repository/data directory.
9. Open the <DAS_HOME>/repository/conf/analytics/local-shard-allocation-config.conf file and update the content as follows:
  Info
  This must be done for both the nodes.
  Code Block
  0,INIT
  1,INIT
  2,INIT
  3,INIT
  4,INIT
  5,INIT
10. Start both nodes, and check the content of the <DAS_HOME>/repository/conf/analytics/local-shard-allocation-config.conf file. It should be changed as follows.
  Code Block
  0,NORMAL
  1,NORMAL
  2,NORMAL
  3,NORMAL
  4,NORMAL
  5,NORMAL
  Info
  The nodes in this example are configured with six shards and one replica for each shard. The entries in the local-shard-allocation-config.conf file differ based on the number of shards and replicas you have configured.
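Because the shard entries depend on the configured shard count, the file content for the last two steps can be generated and checked rather than typed by hand. The following is a minimal sketch that assumes the file contains only `<shard>,<STATE>` lines as shown above; both helper names are hypothetical.

```python
def shard_allocation_lines(shard_count, state="INIT"):
    """Build the file content: one '<shard>,<state>' line per shard."""
    return "\n".join(f"{shard},{state}" for shard in range(shard_count))


def all_shards_normal(conf_text, shard_count):
    """Check that every shard from 0 to shard_count - 1 is listed as NORMAL."""
    entries = dict(
        line.strip().split(",", 1)
        for line in conf_text.splitlines()
        if line.strip()
    )
    return all(entries.get(str(shard)) == "NORMAL" for shard in range(shard_count))


# Content to write to local-shard-allocation-config.conf for six shards.
print(shard_allocation_lines(6))

# After both nodes have started, the same file should pass this check.
print(all_shards_normal(shard_allocation_lines(6, "NORMAL"), 6))
```

The shard count passed in should match the shardCount value in analytics-config.xml.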

Indexing in Spark Environment
