Configuring HA Using Pacemaker and Heartbeat
The following sections cover the information required to configure high availability (HA) for PPaaS using Pacemaker/Heartbeat:
What is Pacemaker?
Pacemaker is a cluster resource manager (CRM). It achieves maximum availability for your cluster services (resources) by detecting and recovering from node-level and resource-level failures, using the messaging and membership capabilities provided by your preferred cluster infrastructure (Corosync or Heartbeat).
Refer to the Pacemaker documentation for more information.
What is Heartbeat?
Heartbeat is a daemon that provides cluster infrastructure (communication and membership) services to its clients. This allows clients to know about the presence (or disappearance!) of peer processes on other machines and to easily exchange messages with them.
In order to be useful to users, the Heartbeat daemon needs to be combined with a CRM, which has the task of starting and stopping the services (IP addresses, web servers, etc.) making clusters highly available. Pacemaker is the preferred cluster resource manager for clusters based on Heartbeat.
Prerequisites
- Two physical or virtual hosts running 64-bit Ubuntu 12.04
- Pacemaker 1.1.6
- Heartbeat 3.0.5
Configuring Pacemaker/Heartbeat for PPaaS
SSH into each host and switch to the root user:
sudo su
Install Pacemaker and Heartbeat:
apt-get install pacemaker heartbeat
Create the Heartbeat configuration file at the following location:
/etc/ha.d/ha.cf
# enable pacemaker, without stonith
crm yes
# define log file
logfile /var/log/ha-log
# warning of soon be dead
warntime 10
# declare a host (the other node) dead after:
deadtime 20
# dead time on boot (could take some time until net is up)
initdead 120
# time between heartbeats
keepalive 2
# the nodes
# set node1 hostname
node node1
# set node2 hostname
node node2
# heartbeats, over dedicated replication interface
# set node1 network-interface and ip address
ucast eth1 10.186.175.16
# set node2 network-interface and ip address
ucast eth1 54.211.110.217
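Because the same ha.cf has to exist on both hosts, it can help to generate it from a handful of values rather than editing it by hand. The following is a minimal sketch only: `render_ha_cf` is a hypothetical helper name, and its five arguments (two node names, the interface, and the two IP addresses) are placeholders to replace with your own values. Heartbeat generally expects the node names to match the output of `uname -n` on each host.

```shell
#!/bin/sh
# Sketch: print a two-node Heartbeat ha.cf on stdout.
# render_ha_cf is a hypothetical helper, not part of Heartbeat.
# Arguments: NODE1 NODE2 INTERFACE IP1 IP2
render_ha_cf() {
    [ "$#" -eq 5 ] || { echo "usage: render_ha_cf NODE1 NODE2 IFACE IP1 IP2" >&2; return 2; }
    node1=$1; node2=$2; iface=$3; ip1=$4; ip2=$5
    cat <<EOF
# enable pacemaker, without stonith
crm yes
logfile /var/log/ha-log
warntime 10
deadtime 20
initdead 120
keepalive 2
node $node1
node $node2
ucast $iface $ip1
ucast $iface $ip2
EOF
}
```

For example, `render_ha_cf node1 node2 eth1 10.186.175.16 54.211.110.217 > /tmp/ha.cf.preview` writes a preview that can be reviewed before it is copied to /etc/ha.d/ha.cf on each host.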
Create the authentication key file and set its permissions on one of the hosts:
( echo -ne "auth 1\n1 sha1 "; \
  dd if=/dev/urandom bs=512 count=1 | openssl md5 ) \
  > /etc/ha.d/authkeys
chmod 0600 /etc/ha.d/authkeys
Copy the above authkeys file to /etc/ha.d/authkeys on each host.
Restart the Heartbeat service:
service heartbeat restart
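If Heartbeat does not come up after the restart, the copied authkeys file is a reasonable first thing to inspect. The sketch below checks that the file exists, has mode 600 as set in the step above, and begins with an `auth N` line. `check_authkeys` is a hypothetical helper name, and the use of `stat -c %a` assumes GNU coreutils, as shipped with Ubuntu.

```shell
#!/bin/sh
# Sketch: sanity-check a Heartbeat authkeys file.
# check_authkeys is a hypothetical helper, not part of Heartbeat.
check_authkeys() {
    file=$1
    [ -f "$file" ] || { echo "missing: $file"; return 1; }
    # GNU stat prints the octal permission bits with -c %a.
    mode=$(stat -c %a "$file")
    [ "$mode" = "600" ] || { echo "bad mode $mode on $file (want 600)"; return 1; }
    # The first line should select the key to use, for example "auth 1".
    head -n 1 "$file" | grep -Eq '^auth [0-9]+$' \
        || { echo "no 'auth N' line at the top of $file"; return 1; }
    echo "ok: $file"
}
```

Run it as root on each host with `check_authkeys /etc/ha.d/authkeys`.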
Check the status of the Pacemaker cluster using CRM. All nodes in the cluster should be in the online state. Recheck the Heartbeat configuration if a node is in the offline state.
crm status
============
Last updated: Wed Oct 15 11:25:05 2014
Last change: Wed Oct 15 11:21:51 2014 via crmd on ip-10-186-175-16
Stack: Heartbeat
Current DC: ip-10-186-175-16 (d16ccc5c-2641-42b6-b46a-57a0b32fddc9) - partition with quorum
Version: 1.1.6-9971ebba4494012a93c03b40a2c58ec0eb60f50c
2 Nodes configured, unknown expected votes
0 Resources configured.
============
Online: [ ip-10-186-175-16 ip-10-153-165-178 ]
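The online check can also be scripted by reading the `Online: [ ... ]` line of the status output. This is a sketch that assumes the output layout shown above for Pacemaker 1.1.6; other versions may format the line differently. `online_nodes` and `all_online` are hypothetical helper names.

```shell
#!/bin/sh
# Sketch: extract the node names from the "Online: [ ... ]" line of
# `crm status` output read on stdin, one name per line.
online_nodes() {
    sed -n 's/^ *Online: \[ *\(.*[^ ]\) *\]$/\1/p' | tr -s ' ' '\n'
}

# Sketch: succeed only if every node named as an argument is online.
# Reads `crm status` output on stdin.
all_online() {
    status_text=$(cat)
    for node in "$@"; do
        printf '%s\n' "$status_text" | online_nodes | grep -qxF "$node" \
            || { echo "not online: $node"; return 1; }
    done
}
```

Typical use would be `crm status | all_online ip-10-186-175-16 ip-10-153-165-178`, which prints the first missing node and returns a non-zero exit code if any expected node is absent.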
Disable STONITH:
crm configure property stonith-enabled=false
Create a Failover IP resource to manage the virtual IP address:
crm configure primitive FAILOVER-IP ocf:heartbeat:IPaddr params ip=192.168.10.20 cidr_netmask="255.255.255.0" op monitor interval=10s
Secure copy (SCP) the Java and PPaaS packages to each host and extract them under the /opt folder.
Create an init.d script for PPaaS at /etc/init.d/ppaas with the following content. Update the values of the USER, JAVA_HOME, and PRODUCT_HOME variables to match your environment:
#!/bin/sh
### BEGIN INIT INFO
# Provides:          ppaas
# Required-Start:    $local_fs $remote_fs $network $syslog $named
# Required-Stop:     $local_fs $remote_fs $network $syslog $named
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# X-Interactive:     true
# Short-Description: Start/stop ppaas server
### END INIT INFO

USER="vagrant"
PRODUCT_NAME="ppaas"
JAVA_HOME="/opt/jdk1.7.0_60"
PRODUCT_HOME="/opt/ppaas_4.1.0"
PID_FILE="${PRODUCT_HOME}/wso2carbon.pid"
CMD="${PRODUCT_HOME}/bin/wso2server.sh"

# LSB exit codes:
# ftp://ftp.nomadlinux.com/nomad-2/dist/heartbeat-1.2.5/include/clplumbing/lsb_exitcodes.h
LSB_EXIT_OK=0
LSB_EXIT_GENERIC=1
LSB_EXIT_EINVAL=2
LSB_EXIT_ENOTSUPPORTED=3
LSB_EXIT_EPERM=4
LSB_EXIT_NOTINSTALLED=5
LSB_EXIT_NOTCONFIGED=6
LSB_EXIT_NOTRUNNING=7
# LSB code for the "status" action when the program is not running.
# Pacemaker's lsb resource class uses it to detect a stopped service.
LSB_STATUS_NOTRUNNING=3

is_service_running() {
    if [ -e "${PID_FILE}" ]; then
        PID=`cat "${PID_FILE}"`
        if ps -p "$PID" > /dev/null 2>&1; then
            # service is running
            return 0
        else
            # service is stopped
            return 1
        fi
    else
        # pid file was not found; the server may not have been started before
        return 1
    fi
}

# Report the status of the service
status() {
    is_service_running
    service_status=$?
    if [ "${service_status}" -eq 0 ]; then
        echo "${PRODUCT_NAME} service is running"
        return ${LSB_EXIT_OK}
    elif [ "${service_status}" -eq 1 ]; then
        echo "${PRODUCT_NAME} service is stopped"
        return ${LSB_STATUS_NOTRUNNING}
    else
        echo "${PRODUCT_NAME} service status is unknown"
        return ${LSB_EXIT_GENERIC}
    fi
}

# Start the service
start() {
    if is_service_running; then
        echo "${PRODUCT_NAME} service is already running"
        return ${LSB_EXIT_OK}
    fi
    echo "starting ${PRODUCT_NAME} service..."
    su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} start"
    # wait until the pid file points at a live process
    is_service_running
    service_status=$?
    while [ "$service_status" -ne "0" ]
    do
        sleep 1
        is_service_running
        service_status=$?
    done
    echo "${PRODUCT_NAME} service started"
    return ${LSB_EXIT_OK}
}

# Restart the service
restart() {
    echo "restarting ${PRODUCT_NAME} service..."
    su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} restart"
    echo "${PRODUCT_NAME} service restarted"
    return ${LSB_EXIT_OK}
}

# Stop the service
stop() {
    if ! is_service_running; then
        echo "${PRODUCT_NAME} service is already stopped"
        return ${LSB_EXIT_OK}
    fi
    echo "stopping ${PRODUCT_NAME} service..."
    su - ${USER} -c "export JAVA_HOME=${JAVA_HOME}; ${CMD} stop"
    # wait until the process has exited
    is_service_running
    service_status=$?
    while [ "$service_status" -eq "0" ]
    do
        sleep 1
        is_service_running
        service_status=$?
    done
    echo "${PRODUCT_NAME} service stopped"
    return ${LSB_EXIT_OK}
}

### main logic ###
case "$1" in
    start)
        start
        ;;
    stop|graceful-stop)
        stop
        ;;
    status)
        status
        ;;
    restart|reload|force-reload)
        restart
        ;;
    *)
        echo "usage: $0 {start|stop|graceful-stop|restart|reload|force-reload|status}"
        exit 1
esac
exit $?
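Pacemaker manages LSB resources purely through the exit codes of the init script, so it is worth confirming how the `status` action behaves before handing the script to the cluster. By LSB convention, `status` returns 0 when the service is running and 3 when it is not. The following sketch checks one of those two cases; `check_lsb_status` is a hypothetical helper name, and the caller states which condition the service is currently expected to be in.

```shell
#!/bin/sh
# Sketch: check the exit code of an init script's "status" action.
# check_lsb_status is a hypothetical helper, not part of Pacemaker.
# Arguments: SCRIPT running|stopped
check_lsb_status() {
    script=$1
    expected_state=$2
    case "$expected_state" in
        running) want=0 ;;   # LSB: program is running
        stopped) want=3 ;;   # LSB: program is not running
        *) echo "usage: check_lsb_status SCRIPT running|stopped"; return 2 ;;
    esac
    "$script" status > /dev/null 2>&1
    rc=$?
    if [ "$rc" -eq "$want" ]; then
        echo "ok: status returned $rc while $expected_state"
    else
        echo "FAIL: status returned $rc while $expected_state (expected $want)"
        return 1
    fi
}
```

For example, with PPaaS stopped, `check_lsb_status /etc/init.d/ppaas stopped` should report `ok`; after starting the service, `check_lsb_status /etc/init.d/ppaas running` should do the same.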
Create a CRM resource for PPaaS:
crm configure primitive PPAAS lsb::ppaas op monitor interval=15s
Create a CRM resource group and add the FAILOVER-IP and PPAAS resources:
crm configure group FAILOVER-IP-RESOURCE-GROUP FAILOVER-IP PPAAS
Configure a colocation dependency between FAILOVER-IP and PPAAS. This ensures that the FAILOVER-IP and PPAAS resources stay on the same host:
crm configure colocation FAILOVER-IP-RESOURCE-GROUP-COLOCATION inf: FAILOVER-IP PPAAS
Deleting a resource
Use the following to delete a resource:
crm_resource -D -r my_first_ip -t primitive
Deleting a resource group
Use the following to delete a resource group:
crm_resource -D -r my_first_group -t group
References
For more information on configuring HA using Pacemaker/Heartbeat, refer to the following:
[1] https://www.zivtech.com/blog/setting-ip-failover-heartbeat-and-pacemaker-ubuntu-lucid
[2] http://www.linux-ha.org/doc/users-guide/_creating_an_initial_heartbeat_configuration.html
[3] http://foaa.de/old-blog/2010/10/intro-to-pacemaker-on-heartbeat/trackback/index.html
[4] http://code.naishe.in/2012/11/high-availability-ngnix-using-heartbeat.htm
[5] http://opentodo.net/2012/04/configuring-a-failover-cluster-with-heartbeat-pacemaker/
[6] http://doc.opensuse.org/products/draft/SLE-HA/SLE-ha-guide_sd_draft/man.crmresource.html