Note

The parameter values below are examples only. They might not be optimal for the specific hardware configuration in your environment, so carry out load tests in your environment and tune WSO2 BAM accordingly.

Hadoop and Cassandra settings

If you manage a high volume of data with high concurrency, use a distributed WSO2 BAM setup. Performance tuning depends on factors such as the data volume handled by the server and the server hardware configuration. The following are some key recommendations.

Tuning receiver nodes

Change the following configurations to tune the receiver nodes.

Configuration file | Configuration value
<BAM_HOME>/bin/wso2server.sh file | -Xms1024m -Xmx1024m -XX:MaxPermSize=512m
/etc/security/limits.conf file | * soft nofile 4096
/etc/security/limits.conf file | * hard nofile 65535
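The open-file limits listed for the receiver nodes (soft 4096, hard 65535) can be checked from a running process before a load test. The following is a minimal sketch, assuming a Linux or other Unix host where Python's standard `resource` module is available; the function name `nofile_shortfalls` and the constant names are hypothetical and are not part of WSO2 BAM.

```python
import resource

# Values taken from the receiver-node recommendations in this section.
RECOMMENDED_SOFT_NOFILE = 4096
RECOMMENDED_HARD_NOFILE = 65535


def nofile_shortfalls(soft, hard,
                      want_soft=RECOMMENDED_SOFT_NOFILE,
                      want_hard=RECOMMENDED_HARD_NOFILE):
    """Return one message for each limit that is below the recommended value."""

    def too_low(actual, wanted):
        # RLIM_INFINITY means "unlimited", which always satisfies the recommendation.
        return actual != resource.RLIM_INFINITY and actual < wanted

    problems = []
    if too_low(soft, want_soft):
        problems.append(f"soft nofile is {soft}, recommended {want_soft}")
    if too_low(hard, want_hard):
        problems.append(f"hard nofile is {hard}, recommended {want_hard}")
    return problems


if __name__ == "__main__":
    # Limits of the current process, inherited from the shell that started it.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    messages = nofile_shortfalls(soft, hard) or ["nofile limits meet the recommendation"]
    for line in messages:
        print(line)
```

Run the script as the same user that starts the BAM server, because limits in /etc/security/limits.conf are applied per user session.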

Tuning analyzer nodes

Change the following configurations to tune the analyzer nodes.

Configuration file | Configuration value
<BAM_HOME>/bin/wso2server.sh file | -Xms1024m -Xmx1024m -XX:MaxPermSize=512m

 

...

Tuning dashboard nodes

Change the following configurations to tune the dashboard nodes.

Configuration file | Configuration value
<BAM_HOME>/bin/wso2server.sh file | -Xms1024m -Xmx1024m -XX:MaxPermSize=512m

 

The analyzer and dashboard nodes do not do much work, so they do not need much tuning.

 


Tuning Hadoop nodes

The following are the performance tuning recommendations for Hadoop nodes.

  • operating system: Linux
  • storage capacity of each node: minimum 10 GB
  • network bandwidth: minimum 100 Mbps
  • In the <BAM_HOME>/repository/conf/log4j.properties file, set the following configuration to suppress Hadoop logs: hadoop.root.logger=ERROR

...

Other optimizations depend on your data volume and hardware configuration. More information on performance tuning in a Hadoop cluster can be found here [2].

 

  • Info

    To prevent Hadoop job information from being printed to the console, add the following property to the <BAM_HOME>/repository/conf/advanced/hive-site.xml file.

    <property>
        <name>hive.session.silent</name>
        <value>true</value>
    </property>

Tuning Cassandra nodes

Info

Keep the commit log and data directories (sstables) on different disks.

The following are the performance tuning recommendations for Cassandra nodes.

  1. Set the heap memory as follows.

     System memory | Heap size
     less than 2GB | 1/2 of the system memory
     2GB to 4GB | 1GB
     greater than 4GB | 1/4 of the system memory, but not more than 8GB


  2. Set the following configurations in the <BAM_HOME>/repository/conf/etc/cassandra.yaml file according to your hardware resources.

    • concurrent_reads: 4 * number of cores
    • concurrent_writes: 8 * number of CPU cores
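
The Cassandra heap sizing rule and the concurrency multipliers above are simple arithmetic, so they can be computed for a given machine. The following is a minimal sketch, assuming that memory is expressed in MB, that 1 GB is 1024 MB, and that "cores" means CPU cores for both settings; the function names `recommended_heap_mb` and `recommended_concurrency` are hypothetical helpers, not part of WSO2 BAM or Cassandra.

```python
import os

MB_PER_GB = 1024


def recommended_heap_mb(system_memory_mb):
    """Apply the heap sizing table: half, fixed 1 GB, or a quarter capped at 8 GB."""
    if system_memory_mb < 2 * MB_PER_GB:
        # Less than 2 GB: half of the system memory.
        return system_memory_mb // 2
    if system_memory_mb <= 4 * MB_PER_GB:
        # 2 GB to 4 GB: a fixed 1 GB heap.
        return 1 * MB_PER_GB
    # Greater than 4 GB: a quarter of the system memory, but not more than 8 GB.
    return min(system_memory_mb // 4, 8 * MB_PER_GB)


def recommended_concurrency(cpu_cores):
    """Return the cassandra.yaml values derived from the core count."""
    return {
        "concurrent_reads": 4 * cpu_cores,
        "concurrent_writes": 8 * cpu_cores,
    }


if __name__ == "__main__":
    # os.cpu_count() can return None, so fall back to a single core.
    cores = os.cpu_count() or 1
    for key, value in recommended_concurrency(cores).items():
        print(f"{key}: {value}")
    print(f"heap for a 16 GB machine: {recommended_heap_mb(16 * MB_PER_GB)} MB")
```

Treat the results as starting points only, and confirm them with load tests on your own hardware.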