
Configuring HA Using Pacemaker and Heartbeat

The following sections cover the information required to configure high availability (HA) for PPaaS using Pacemaker/Heartbeat:

What is Pacemaker?

Pacemaker is a cluster resource manager (CRM). It achieves maximum availability for your cluster services (resources) by detecting and recovering node and resource-level failures using the messaging and membership capabilities provided by your preferred cluster infrastructure (Corosync or Heartbeat).

Refer to the Pacemaker documentation for more information.

What is Heartbeat?

Heartbeat is a daemon that provides cluster infrastructure (communication and membership) services to its clients. This allows clients to know about the presence (or disappearance!) of peer processes on other machines and to easily exchange messages with them.

In order to be useful to users, the Heartbeat daemon needs to be combined with a CRM, which has the task of starting and stopping the services (IP addresses, web servers, etc.) making clusters highly available. Pacemaker is the preferred cluster resource manager for clusters based on Heartbeat.
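
To make the idea of detecting a peer's disappearance concrete, the following is a conceptual sketch only, not Heartbeat's actual implementation. It classifies a peer from the age of its last heartbeat. The function name peer_state and its arguments are hypothetical; the two thresholds play the role of the warntime and deadtime settings in the Heartbeat configuration file.

```shell
#!/bin/sh
# Conceptual sketch only: classify a peer from the age of its last heartbeat.
# peer_state and its arguments are hypothetical names, not part of Heartbeat.
# Usage: peer_state LAST_SEEN NOW WARNTIME DEADTIME   (all values in seconds)
peer_state() {
    last_seen=$1
    now=$2
    warntime=$3
    deadtime=$4
    age=$((now - last_seen))

    if [ "$age" -ge "$deadtime" ]; then
        echo "dead"     # no heartbeat for deadtime seconds or more
    elif [ "$age" -ge "$warntime" ]; then
        echo "late"     # heartbeat overdue, but the peer is not yet declared dead
    else
        echo "alive"
    fi
}

# Example with warntime=10 and deadtime=20:
peer_state 100 104 10 20   # age 4  -> prints "alive"
peer_state 100 112 10 20   # age 12 -> prints "late"
peer_state 100 125 10 20   # age 25 -> prints "dead"
```

A larger deadtime tolerates short network delays at the cost of a slower failover, and a smaller one does the opposite.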

Prerequisites

  • Two physical or virtual hosts running 64-bit Ubuntu 12.04.
  • Pacemaker 1.1.6
  • Heartbeat 3.0.5

Configuring Pacemaker/Heartbeat for PPaaS

  1. SSH into each host and install Pacemaker and Heartbeat:

    sudo apt-get install pacemaker heartbeat
  2. Switch to the root user:

    sudo su
  3. Create the Heartbeat configuration file at the following location: /etc/ha.d/ha.cf

    # enable pacemaker, without stonith
    crm             yes
    # define log file
    logfile /var/log/ha-log
    # warn about a late heartbeat after (seconds):
    warntime        10
    # declare a host (the other node) dead after:
    deadtime        20
    # dead time on boot (could take some time until net is up)
    initdead        120
    # time between heartbeats
    keepalive       2
    # the nodes
    node node1 # set node1 hostname
    node node2 # set node2 hostname
    # heartbeats, over dedicated replication interface
    ucast           eth1 10.186.175.16 # set node1 network-interface and ip address
    ucast           eth1 54.211.110.217 # set node2 network-interface and ip address

     

  4. On one of the hosts, create the authentication key file and restrict its permissions:

    ( echo -ne "auth 1\n1 sha1 "; \
    dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
    > /etc/ha.d/authkeys
    
    chmod 0600 /etc/ha.d/authkeys
  5. Copy the authkeys file to /etc/ha.d/authkeys on each of the other hosts.
  6. Restart the Heartbeat service:

    service heartbeat restart
  7. Check the status of the Pacemaker cluster using CRM:

    All nodes in the cluster should be in the online state. Recheck the Heartbeat configuration if a node is in the offline state.

    crm status
    ============
    Last updated: Wed Oct 15 11:25:05 2014
    Last change: Wed Oct 15 11:21:51 2014 via crmd on ip-10-186-175-16
    Stack: Heartbeat
    Current DC: ip-10-186-175-16 (d16ccc5c-2641-42b6-b46a-57a0b32fddc9) - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, unknown expected votes
    0 Resources configured.
    ============
    Online: [ ip-10-186-175-16 ip-10-153-165-178 ]
  8. Disable STONITH:

    crm configure property stonith-enabled=false
  9. Create a Failover IP resource to manage the virtual IP address:

    crm configure primitive FAILOVER-IP ocf:heartbeat:IPaddr params ip=192.168.10.20 cidr_netmask="255.255.255.0" op monitor interval=10s
  10. Copy the Java and PPaaS packages to each host using SCP and extract them under the /opt folder.

  11. Create an init.d script for PPaaS with the following content.
    Update the values of the USER, JAVA_HOME and PRODUCT_HOME variables to match your environment.

    #!/bin/sh
    ### BEGIN INIT INFO
    # Provides:          ppaas
    # Required-Start:    $local_fs $remote_fs $network $syslog $named
    # Required-Stop:     $local_fs $remote_fs $network $syslog $named
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # X-Interactive:     true
    # Short-Description: Start/stop ppaas server
    ### END INIT INFO
    
    USER="vagrant"
    PRODUCT_NAME="ppaas"
    JAVA_HOME="/opt/jdk1.7.0_60"
    PRODUCT_HOME="/opt/ppaas_4.1.0"
    PID_FILE="${PRODUCT_HOME}/wso2carbon.pid"
    CMD="${PRODUCT_HOME}/bin/wso2server.sh"
    
    # LSB exit codes:
    # ftp://ftp.nomadlinux.com/nomad-2/dist/heartbeat-1.2.5/include/clplumbing/lsb_exitcodes.h
    
    LSB_EXIT_OK=0
    LSB_EXIT_GENERIC=1
    LSB_EXIT_EINVAL=2
    LSB_EXIT_ENOTSUPPORTED=3
    LSB_EXIT_EPERM=4
    LSB_EXIT_NOTINSTALLED=5
    LSB_EXIT_NOTCONFIGED=6
    LSB_EXIT_NOTRUNNING=7
    
    is_service_running() {
    	if [ -e ${PID_FILE} ]; then
    		PID=`cat ${PID_FILE}`
    		# discard the output of ps; only its exit code matters
    		if ps -p "$PID" > /dev/null 2>&1; then
    			# service is running
    			return 0
    		else
    			# service is stopped
    			return 1
    	    fi
    	else
    		# pid file was not found, may be server was not started before
    		return 1
    	fi
    }
    
    # Report the status of the service
    status() {
    	is_service_running
    	service_status=$?
    		
    	if [ "${service_status}" -eq 0 ]; then
    		echo "${PRODUCT_NAME} service is running"
    		return ${LSB_EXIT_OK}
    	elif [ "${service_status}" -eq 1 ]; then
    		echo "${PRODUCT_NAME} service is stopped"
    		# LSB status action: exit code 3 means "not running";
    		# Pacemaker relies on it to detect a stopped service
    		return 3
    	else 
    		echo "$PRODUCT_NAME service status is unknown"
    		return ${LSB_EXIT_GENERIC}
        fi
    }
    
    # Start the service
    start() {
    	if is_service_running; then
    		echo "${PRODUCT_NAME} service is already running"
    		return ${LSB_EXIT_OK}
    	fi
    	
    	echo "starting ${PRODUCT_NAME} service..."  
    	su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} start"
    	
    	is_service_running
    	service_status=$?
    	while [ "$service_status" -ne "0" ]
    	do
    		sleep 1;
    		is_service_running
    		service_status=$?
    	done
    	
    	echo "${PRODUCT_NAME} service started"
    	return ${LSB_EXIT_OK}
    }
    
    # Restart the service
    restart() {
    	echo "restarting ${PRODUCT_NAME} service..."
    	su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} restart"
        echo "${PRODUCT_NAME} service restarted"
    	return ${LSB_EXIT_OK}
    }
    
    # Stop the service
    stop() {
    	if ! is_service_running; then
    		echo "${PRODUCT_NAME} service is already stopped"
    		return ${LSB_EXIT_OK}
    	fi
    	
    	echo "stopping ${PRODUCT_NAME} service..."
    	su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} stop"
    	
    	is_service_running
    	service_status=$?
    	while [ "$service_status" -eq "0" ]
    	do
    		sleep 1;
    		is_service_running
    		service_status=$?
    	done
    	
    	echo "${PRODUCT_NAME} service stopped"
    	return ${LSB_EXIT_OK}
    }
    ### main logic ###
    case "$1" in
    start)
        start
        ;;
    stop|graceful-stop)
        stop
        ;;
    status)
        status
        ;;
    restart|reload|force-reload)
        restart
        ;;
    *)
       echo "usage: $0 {start|stop|graceful-stop|restart|reload|force-reload|status}"
       exit 1
    esac
    exit $?
  12. Create a CRM resource for PPaaS:

    crm configure primitive PPAAS lsb:ppaas op monitor interval=15s
  13. Create a CRM resource group and add the FAILOVER-IP and PPAAS resources to it:

    crm configure group FAILOVER-IP-RESOURCE-GROUP FAILOVER-IP PPAAS
  14. Configure a colocation dependency between FAILOVER-IP and PPAAS.

    This ensures that the FAILOVER-IP and PPAAS resources always run on the same host.

    crm configure colocation FAILOVER-IP-RESOURCE-GROUP-COLOCATION inf: FAILOVER-IP PPAAS
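
Pacemaker manages an LSB resource purely through the exit codes of its init script, so it can be worth checking those codes before relying on the cluster. The following is a minimal sketch of such a check, assuming the script accepts the start, stop and status actions. The helper names expect_rc and check_lsb_script are hypothetical, and the path /etc/init.d/ppaas in the usage note is an assumption about where the script was installed. The expected codes follow the usual LSB convention: 0 for success, and 3 from status when the service is stopped.

```shell
#!/bin/sh
# Sketch: check the exit codes Pacemaker expects from an LSB init script.
# expect_rc and check_lsb_script are hypothetical helper names.
# Note: this really starts and stops the service, so run it on a test host.

# Run one action and compare its exit code with the expected value.
expect_rc() {
    script=$1
    action=$2
    expected=$3
    "$script" "$action" > /dev/null 2>&1
    rc=$?
    if [ "$rc" -eq "$expected" ]; then
        echo "ok:   $action returned $rc"
        return 0
    fi
    echo "FAIL: $action returned $rc, expected $expected"
    return 1
}

# Walk through the start/status/stop sequence and count mismatches.
check_lsb_script() {
    script=$1
    failures=0
    expect_rc "$script" start 0  || failures=$((failures + 1))
    expect_rc "$script" status 0 || failures=$((failures + 1))   # running
    expect_rc "$script" start 0  || failures=$((failures + 1))   # start while running
    expect_rc "$script" stop 0   || failures=$((failures + 1))
    expect_rc "$script" status 3 || failures=$((failures + 1))   # stopped
    expect_rc "$script" stop 0   || failures=$((failures + 1))   # stop while stopped
    [ "$failures" -eq 0 ]
}
```

For example, after sourcing the functions as root, `check_lsb_script /etc/init.d/ppaas` prints one line per action and returns a non-zero exit code if any action returned an unexpected code.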

Deleting a resource

  1. Use the following to delete a resource:

    crm_resource -D -r my_first_ip -t primitive

Deleting a resource group

  1. Use the following to delete a resource group:

    crm_resource -D -r my_first_group -t group

References

For more information on configuring HA using Pacemaker/Heartbeat, refer to the following:

[1] https://www.zivtech.com/blog/setting-ip-failover-heartbeat-and-pacemaker-ubuntu-lucid

[2] http://www.linux-ha.org/doc/users-guide/_creating_an_initial_heartbeat_configuration.html

[3] http://foaa.de/old-blog/2010/10/intro-to-pacemaker-on-heartbeat/trackback/index.html

[4] http://code.naishe.in/2012/11/high-availability-ngnix-using-heartbeat.htm

[5] http://opentodo.net/2012/04/configuring-a-failover-cluster-with-heartbeat-pacemaker/

[6] http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/man.crmresource.html

 
