What is Pacemaker?

Pacemaker is a cluster resource manager (CRM). It achieves maximum availability for your cluster services (resources) by detecting and recovering from node- and resource-level failures, using the messaging and membership capabilities provided by your preferred cluster infrastructure (Corosync or Heartbeat).

Refer to the Pacemaker documentation for more information.

What is Heartbeat?

Heartbeat is a daemon that provides cluster infrastructure (communication and membership) services to its clients. This allows clients to know about the presence (or disappearance!) of peer processes on other machines and to easily exchange messages with them.

In order to be useful to users, the Heartbeat daemon needs to be combined with a CRM, which has the task of starting and stopping the services (IP addresses, web servers, etc.) making clusters highly available. Pacemaker is the preferred cluster resource manager for clusters based on Heartbeat.

Prerequisites

  • Two physical or virtual hosts running 64-bit Ubuntu 12.04.
  • Pacemaker 1.1.6
  • Heartbeat 3.0.5

Configuring Pacemaker/Heartbeat for PPaaS

  1. SSH into each host and install Pacemaker and Heartbeat:

    Code Block
    sudo apt-get install pacemaker heartbeat
  2. Switch to root user:

    Code Block
    sudo su
  3. Create the Heartbeat configuration file, /etc/ha.d/ha.cf, on each host with the following content:

    Code Block
    # enable pacemaker, without stonith
    crm             yes
    # define log file
    logfile /var/log/ha-log
    # warn about a late heartbeat after this many seconds
    warntime        10
    # declare a host (the other node) dead after:
    deadtime        20
    # dead time on boot (could take some time until net is up)
    initdead        120
    # time between heartbeats
    keepalive       2
    # the nodes
    node node1 # set node1 hostname
    node node2 # set node2 hostname
    # heartbeats, over dedicated replication interface
    ucast           eth1 10.186.175.16 # set node1 network-interface and ip address
    ucast           eth1 54.211.110.217 # set node2 network-interface and ip address

     

  4. On one of the hosts, create the authentication key file and restrict its permissions:

    Code Block
    ( echo -ne "auth 1\n1 sha1 "; \
    dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
    > /etc/ha.d/authkeys
    
    chmod 0600 /etc/ha.d/authkeys
  5. Copy the authkeys file to /etc/ha.d/authkeys on each of the other hosts, keeping the 0600 permissions.
  6. Restart the Heartbeat service:

    Code Block
    service heartbeat restart
  7. Check the status of the Pacemaker cluster using CRM:

    Info

    All nodes in the cluster should be in the online state. Recheck the Heartbeat configuration if any node is in the offline state.

    Code Block
    crm status
    ============
    Last updated: Wed Oct 15 11:25:05 2014
    Last change: Wed Oct 15 11:21:51 2014 via crmd on ip-10-186-175-16
    Stack: Heartbeat
    Current DC: ip-10-186-175-16 (d16ccc5c-2641-42b6-b46a-57a0b32fddc9) - partition with quorum
    Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
    2 Nodes configured, unknown expected votes
    0 Resources configured.
    ============
    Online: [ ip-10-186-175-16 ip-10-153-165-178 ]
  8. Disable STONITH:

    Code Block
    crm configure property stonith-enabled=false
  9. Create a Failover IP resource to manage the virtual IP address:

    Code Block
    crm configure primitive FAILOVER-IP ocf:heartbeat:IPaddr params ip=192.168.10.20 cidr_netmask="255.255.255.0" op monitor interval=10s
  10. Copy the Java and PPaaS packages to each host using SCP and extract them under the /opt directory.

  11. Create an init.d script for PPaaS at /etc/init.d/ppaas on each host, with the following content, and make it executable.
    Update the values of the USER, JAVA_HOME, and PRODUCT_HOME variables to match your environment.

    Code Block
    #!/bin/sh
    ### BEGIN INIT INFO
    # Provides:          ppaas
    # Required-Start:    $local_fs $remote_fs $network $syslog $named
    # Required-Stop:     $local_fs $remote_fs $network $syslog $named
    # Default-Start:     2 3 4 5
    # Default-Stop:      0 1 6
    # X-Interactive:     true
    # Short-Description: Start/stop ppaas server
    ### END INIT INFO
    
    USER="vagrant"
    PRODUCT_NAME="ppaas"
    JAVA_HOME="/opt/jdk1.7.0_60"
    PRODUCT_HOME="/opt/ppaas_4.1.0"
    PID_FILE="${PRODUCT_HOME}/wso2carbon.pid"
    CMD="${PRODUCT_HOME}/bin/wso2server.sh"
    
    # LSB exit codes:
    # ftp://ftp.nomadlinux.com/nomad-2/dist/heartbeat-1.2.5/include/clplumbing/lsb_exitcodes.h
    
    LSB_EXIT_OK=0
    LSB_EXIT_GENERIC=1
    LSB_EXIT_EINVAL=2
    LSB_EXIT_ENOTSUPPORTED=3
    LSB_EXIT_EPERM=4
    LSB_EXIT_NOTINSTALLED=5
    LSB_EXIT_NOTCONFIGED=6
    LSB_EXIT_NOTRUNNING=7
    
    is_service_running() {
    	if [ -e "${PID_FILE}" ]; then
    		PID=$(cat "${PID_FILE}")
    		# discard the ps output; only its exit status matters
    		if ps -p "${PID}" > /dev/null 2>&1; then
    			# service is running
    			return 0
    		else
    			# service is stopped
    			return 1
    		fi
    	else
    		# PID file was not found; the server may not have been started yet
    		return 1
    	fi
    }
    
    # Report the service status
    status() {
    	is_service_running
    	service_status=$?
    
    	if [ "${service_status}" -eq 0 ]; then
    		echo "${PRODUCT_NAME} service is running"
    		return ${LSB_EXIT_OK}
    	elif [ "${service_status}" -eq 1 ]; then
    		echo "${PRODUCT_NAME} service is stopped"
    		# LSB status code 3 means "not running"; the cluster monitor relies on it to detect a stopped service
    		return 3
    	else
    		echo "${PRODUCT_NAME} service status is unknown"
    		return ${LSB_EXIT_GENERIC}
    	fi
    }
    
    # Start the service
    start() {
    	if is_service_running; then
    		echo "${PRODUCT_NAME} service is already running"
    		return ${LSB_EXIT_OK}
    	fi
    	
    	echo "starting ${PRODUCT_NAME} service..."  
    	su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} start"
    	
    	is_service_running
    	service_status=$?
    	while [ "$service_status" -ne "0" ]
    	do
    		sleep 1;
    		is_service_running
    		service_status=$?
    	done
    	
    	echo "${PRODUCT_NAME} service started"
    	return ${LSB_EXIT_OK}
    }
    
    # Restart the service
    restart() {
    	echo "restarting ${PRODUCT_NAME} service..."
    	su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} restart"
    	echo "${PRODUCT_NAME} service restarted"
    	return ${LSB_EXIT_OK}
    }
    
    # Stop the service
    stop() {
    	if ! is_service_running; then
    		echo "${PRODUCT_NAME} service is already stopped"
    		return ${LSB_EXIT_OK}
    	fi
    	
    	echo "stopping ${PRODUCT_NAME} service..."
    	su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} stop"
    	
    	is_service_running
    	service_status=$?
    	while [ "$service_status" -eq "0" ]
    	do
    		sleep 1;
    		is_service_running
    		service_status=$?
    	done
    	
    	echo "${PRODUCT_NAME} service stopped"
    	return ${LSB_EXIT_OK}
    }
    ### main logic ###
    case "$1" in
    start)
        start
        ;;
    stop|graceful-stop)
        stop
        ;;
    status)
        status
        ;;
    restart|reload|force-reload)
        restart
        ;;
    *)
        echo "usage: $0 {start|stop|graceful-stop|restart|reload|force-reload|status}"
        exit 1
        ;;
    esac
    exit $?
  12. Create a CRM resource for PPaaS:

    Code Block
    crm configure primitive PPAAS lsb:ppaas op monitor interval=15s
  13. Create a CRM resource group and add FAILOVER-IP and PPAAS resources:

    Code Block
    crm configure group FAILOVER-IP-RESOURCE-GROUP FAILOVER-IP PPAAS
  14. Configure a colocation dependency between FAILOVER-IP and PPAAS:

    Info

    This ensures that the FAILOVER-IP and PPAAS resources always run on the same host.

    Code Block
    crm configure colocation FAILOVER-IP-RESOURCE-GROUP-COLOCATION inf: FAILOVER-IP PPAAS
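
The checks described in steps 4 to 7 (key file permissions and all nodes online) can be scripted. The following is a minimal sketch, not part of Heartbeat or Pacemaker: the function names check_authkeys, count_nodes, and cluster_healthy are hypothetical, and the parser assumes the "Online: [ ... ]" line format shown in step 7 plus a matching "OFFLINE: [ ... ]" line for offline nodes.

```shell
#!/bin/sh
# Sketch: post-configuration checks for the Heartbeat/Pacemaker setup.
# All function names here are illustrative, not Heartbeat or Pacemaker commands.

# Succeed only if the key file exists, has mode 0600, and begins with an
# "auth <n>" line, as produced in step 4.
check_authkeys() {
    keyfile=${1:-/etc/ha.d/authkeys}
    [ -f "$keyfile" ] || { echo "missing: $keyfile"; return 1; }
    [ -n "$(find "$keyfile" -maxdepth 0 -perm 600)" ] \
        || { echo "wrong permissions (expected 0600): $keyfile"; return 1; }
    head -n 1 "$keyfile" | grep -q '^auth [0-9][0-9]*$' \
        || { echo "first line is not 'auth <n>': $keyfile"; return 1; }
    echo "authkeys ok: $keyfile"
}

# Count the node names on the "<label>: [ ... ]" line of `crm status`
# output read from stdin. Assumed labels: "Online" and "OFFLINE".
count_nodes() {
    sed -n "s/^[[:space:]]*$1: \[ \(.*\) \]\$/\1/p" \
        | awk '{ n += NF } END { print n + 0 }'
}

# Succeed only if the expected number of nodes is online and none are offline.
# $1 = expected node count, $2 = captured `crm status` output.
cluster_healthy() {
    expected=$1
    status_output=$2
    online=$(printf '%s\n' "$status_output" | count_nodes Online)
    offline=$(printf '%s\n' "$status_output" | count_nodes OFFLINE)
    echo "online=$online offline=$offline expected=$expected"
    [ "$online" -eq "$expected" ] && [ "$offline" -eq 0 ]
}
```

On a cluster node, a possible invocation is `check_authkeys && cluster_healthy 2 "$(crm status)"`, where 2 is the number of hosts from the prerequisites.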

Deleting a resource

  1. Use the following command to delete a resource, replacing my_first_ip with the ID of the resource:

    Code Block
    crm_resource -D -r my_first_ip -t primitive

Deleting a resource group

  1. Use the following command to delete a resource group, replacing my_first_group with the ID of the group:

    Code Block
    crm_resource -D -r my_first_group -t group
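
The two delete commands above differ only in the resource ID and the type. A small wrapper can build either one and print it for review before anything is removed. This is only a sketch: delete_resource and the DRY_RUN variable are hypothetical names introduced here, and the wrapper handles just the two types used on this page.

```shell
#!/bin/sh
# Sketch: build the crm_resource delete command for a primitive or a group.
# delete_resource and DRY_RUN are illustrative names, not Pacemaker features.

delete_resource() {
    rsc_type=$1   # "primitive" or "group"
    rsc_id=$2     # resource ID, for example FAILOVER-IP

    case "$rsc_type" in
        primitive|group) ;;
        *) echo "unsupported type: $rsc_type" >&2; return 2 ;;
    esac
    [ -n "$rsc_id" ] || { echo "missing resource ID" >&2; return 2; }

    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Default: only print the command so it can be reviewed first.
        echo "crm_resource -D -r $rsc_id -t $rsc_type"
    else
        # Run the same command that the sections above show.
        crm_resource -D -r "$rsc_id" -t "$rsc_type"
    fi
}
```

For example, `delete_resource group FAILOVER-IP-RESOURCE-GROUP` prints the command, and `DRY_RUN=0 delete_resource group FAILOVER-IP-RESOURCE-GROUP` runs it.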

References

For more information on configuring HA using Pacemaker/Heartbeat, refer to the following:

...