Configuring Cassandra Cluster
Data that reaches BAM through data receivers is usually stored in the default Cassandra database. The image above shows how the Cassandra databases of both BAM nodes are deployed in a cluster. This ensures that if one node fails, data can still be received and stored in the other databases in the cluster, and that the data remains highly available for running Hive scripts.
Information to know before you start
- Increase the heap memory size of the BAM nodes to at least 2 GB, and synchronize the system time on all nodes.
- BAM 2.4.0 uses Cassandra version 1.1.3 while BAM 2.4.1 uses Cassandra version 1.2.13.
- The fully-distributed BAM setup uses nodes 3, 4 and 5, so this topic includes configurations for those nodes. If you are using a different setup, change the configurations accordingly.
- You can start the BAM server using the Cassandra profile, so that BAM acts as Cassandra in your cluster. See Running the Product on a Preferred Profile for more information on how to do this.
- For instructions on using external Cassandra with WSO2 BAM, see Connecting to External Cassandra.
Add the following configurations to the <BAM_HOME>/repository/conf/etc/cassandra.yaml file in the nodes mentioned below.

To node3:

    cluster_name: Test Cluster
    initial_token: 0
    seed_provider:
        - seeds: "node3,node4,node5"
    listen_address: node3
    rpc_address: node3
    rpc_port: 9160
For Cassandra 1.2.13 (in BAM 2.4.1), the initial_token value cannot be 0. You must enter the value generated by the script.

To node4:

    cluster_name: Test Cluster
    initial_token: 56713727820156410577229101238628035242
    seed_provider:
        - seeds: "node3,node4,node5"
    listen_address: node4
    rpc_address: node4
    rpc_port: 9160
To node5:

    cluster_name: Test Cluster
    initial_token: 113427455640312821154458202477256070485
    seed_provider:
        - seeds: "node3,node4,node5"
    listen_address: node5
    rpc_address: node5
    rpc_port: 9160
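The three initial_token values above are consistent with dividing the RandomPartitioner token range (0 to 2**127) evenly among three nodes. The following Python sketch reproduces them under that assumption. The function name random_partitioner_tokens is hypothetical, and this is not necessarily the script referred to in the note for Cassandra 1.2.13, so treat it as an illustration of how evenly spaced tokens can be derived rather than as the official generator.

```python
# Sketch: evenly spaced initial_token values, assuming Cassandra's
# RandomPartitioner (token range 0 .. 2**127).
# random_partitioner_tokens is a hypothetical helper, not part of WSO2 BAM.

def random_partitioner_tokens(node_count):
    """Return one evenly spaced token per node."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * (2 ** 127) // node_count for i in range(node_count)]


if __name__ == "__main__":
    hosts = ["node3", "node4", "node5"]
    for host, token in zip(hosts, random_partitioner_tokens(len(hosts))):
        print(f"{host}: initial_token: {token}")
```

If the cluster has a different number of nodes, pass that number instead of 3 and assign one token to each node in order.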
Connect the nodes to Cassandra endpoints.
Edit the <BAM_HOME>/repository/conf/advanced/streamdefn.xml file in all nodes as follows. This changes the replication factor and the read/write consistency levels that data receivers use when writing data to Cassandra. For example, if you have four Cassandra nodes in the cluster, enter 3 as the value for the <ReplicationFactor> property.

    <StreamDefinition>
        <ReplicationFactor>3</ReplicationFactor>
        <ReadConsistencyLevel>QUORUM</ReadConsistencyLevel>
        <WriteConsistencyLevel>ONE</WriteConsistencyLevel>
        <StrategyClass>org.apache.cassandra.locator.SimpleStrategy</StrategyClass>
    </StreamDefinition>
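To make the effect of these values concrete, the sketch below computes how many replicas each consistency level involves for a given replication factor. It relies on the general Cassandra rules that QUORUM means a majority of replicas (replication factor divided by two, rounded down, plus one) and that a read is guaranteed to overlap the latest write only when the read and write replica counts together exceed the replication factor. The function names are hypothetical and the code is only an illustration of the arithmetic.

```python
# Sketch: replica counts implied by the consistency levels in streamdefn.xml.
# Both helper names are hypothetical and used for illustration only.

def quorum(replication_factor):
    """Number of replicas that form a majority (QUORUM)."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor, read_replicas, write_replicas):
    """True when every read is guaranteed to touch a replica that has the latest write."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3                      # <ReplicationFactor>3</ReplicationFactor>
    read_replicas = quorum(rf)  # <ReadConsistencyLevel>QUORUM</ReadConsistencyLevel>
    write_replicas = 1          # <WriteConsistencyLevel>ONE</WriteConsistencyLevel>
    print("QUORUM replicas:", read_replicas)
    print("read overlaps write:", read_overlaps_write(rf, read_replicas, write_replicas))
```

With a replication factor of 3, a QUORUM read contacts 2 replicas and a ONE write is acknowledged by 1, so this combination favors write availability over a strict guarantee that every read sees the most recent write.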
Configure the datasources. When load balancing is required, add the set of JDBC URLs as a comma-separated list.
Optionally, to view the cluster information in the Cassandra Keyspaces List UI, add a file named cassandra-endpoint.xml to <BAM_HOME>/repository/conf/etc with the following configuration. The cassandra-endpoint.xml file is required when deploying the backend Cassandra cluster in an IaaS such as AWS. An IaaS may not provide real IPs, so this configuration file is needed to list the mapped real IPs.

    <Cassandra>
        <EndPoints>
            <EndPoint><HostName>name_of_machine1(BAM N1)</HostName></EndPoint>
            <EndPoint><HostName>name_of_machine2(BAM N2)</HostName></EndPoint>
        </EndPoints>
    </Cassandra>
When configuring an external Cassandra cluster, you must additionally enable clustering in the <BAM_HOME>/repository/conf/axis2/axis2.xml file.

    <clustering class="org.wso2.carbon.core.clustering.hazelcast.HazelcastClusteringAgent" enable="true">
After starting the Cassandra cluster, you can verify the status of the cluster using the nodetool command. For example, the following command retrieves statistics for the column families in the Cassandra keyspaces. (Port 9999 is the JMX port.)

    ./nodetool -u admin -pw admin -h localhost -p 9999 cfstats
You can connect to the Cassandra cluster using the Cassandra CLI tool. For example, the following commands are used to access the EVENT_KS Cassandra keyspace using the Cassandra CLI.

    ./cassandra-cli -h localhost -pw admin -u admin
    show keyspaces;
    use EVENT_KS;
    show schema EVENT_KS;
When configuring the Cassandra cluster in this setup, do the following so that the Cassandra keyspaces feature functions and lists the Cassandra keyspaces in the Main menu of the WSO2 BAM management console.
If you are using internal Cassandra, which is shipped with WSO2 BAM, both BAM nodes and Cassandra nodes should be in the same clustering domain.
If you are using external Cassandra, change the following configuration in the <BAM_HOME>/repository/conf/etc/cassandra.yaml file to use the AllowAllAuthenticator.

    authenticator: AllowAllAuthenticator