Create a Highly Available Zenoss
This howto explains how to create a Zenoss cluster using open source tools.
Note: This is still WIP so please be patient.
Introduction
Our previous monitoring solution was Nagios. To achieve 99% uptime for our monitoring we had a cold standby for Nagios.
Setup
I have created a 2-node cluster. One system is a virtual machine running on an ESX v2 server. To create a cluster you will need a shared storage medium.
Here is a drawing of my setup:
Okay, this seems a bit difficult, but in fact it isn't. Heartbeat is a Linux High-Availability project for creating clusters.
The clustered services are configured to run on the active server. If heartbeat detects that the active server is down, or that one of these services has a problem, it is smart enough to move the services to the standby server.
This also includes switching the virtual IP to the standby server.
I'm using heartbeat v2, which allows me to create rules for running the services. One of the rules is to prefer the hardware node. So let's say the hardware server goes down and needs some maintenance: heartbeat will move the services to the other (virtual) node. Once the hardware is back up and running, heartbeat will notice this and move the services back to the hardware server.
This means there is a small interruption of service while the services are moved to the other node. So heartbeat is good for high uptime, but not for load-balancing clusters.
You may have used heartbeat before.
DRBD: what is this?
DRBD is a cheap way to create shared storage. It mirrors a block device between the two nodes over the network, so no SAN is needed.
This is my /etc/drbd.conf
#
# please have a look at the example configuration file in
# /usr/share/doc/drbd.conf
#
#
global {
    minor-count 1;
    usage-count no; # Participate in DRBD's online usage counter at http://usage.drbd.org
}
resource zenoss {
    protocol C;
    startup {
        wfc-timeout 30;
        degr-wfc-timeout 60;
    }
    disk {
        on-io-error detach;
        fencing resource-only;
    }
    handlers {
        pri-on-incon-degr "echo o > /proc/sysrq-trigger ; halt -f"; # power off the system after a connection failure
        pri-lost-after-sb "echo o > /proc/sysrq-trigger ; halt -f";
        outdate-peer "/usr/lib/heartbeat/drbd-peer-outdater";
    }
    net {
        after-sb-0pri discard-least-changes; # Self healing after split brain
        after-sb-1pri call-pri-lost-after-sb;
        max-buffers 2048; # datablock buffers used before writing to disk.
        ko-count 4; # Peer is dead if this count is exceeded.
    }
    syncer {
        rate 12M;
        al-extents 257;
    }
    on zenoss0101 {
        device /dev/drbd0;
        disk /dev/sda3;
        address 192.168.1.90:7789;
        meta-disk internal;
    }
    on zenoss0102 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.1.91:7789;
        meta-disk internal;
    }
}
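Once the resource is up, its state can be read from /proc/drbd. The following is a minimal sketch, not part of the original setup: the function names drbd_status and drbd_healthy are made up for illustration, and it assumes the usual one-line-per-minor layout of /proc/drbd (older 8.x releases print the roles as "st:", newer ones as "ro:").

```shell
#!/bin/sh
# Print "connection-state roles disk-states" for one DRBD minor.
# $1 = minor number (default 0), $2 = status file (default /proc/drbd)
drbd_status() {
    minor=${1:-0}
    file=${2:-/proc/drbd}
    awk -v m="$minor:" '
        $1 == m {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^cs:/) cs = substr($i, 4)
                # roles are labelled "st:" or "ro:" depending on the DRBD version
                if ($i ~ /^(st|ro):/) role = substr($i, 4)
                if ($i ~ /^ds:/) ds = substr($i, 4)
            }
            print cs, role, ds
            found = 1
        }
        END { exit (found ? 0 : 1) }' "$file"
}

# Succeed only if the minor is connected and both disks are up to date.
drbd_healthy() {
    status=$(drbd_status "$@") || return 1
    case $status in
        "Connected "*" UpToDate/UpToDate") return 0 ;;
        *) return 1 ;;
    esac
}
```

On a cluster node, `drbd_status 0` prints the state of /dev/drbd0, and `drbd_healthy 0` can be used as the exit status of a monitoring check.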
Heartbeat: how does it work?
This is my /etc/ha.d/ha.cf
# Created by wdhaeseleer
# /etc/ha.d/ha.cf
# Config created on 19-03-2008
use_logd yes # Log to the daemon
debug 1 # Set debug level
udpport 694 # Send on this UDP port
ucast eth0 192.168.1.90 # Use unicast to send the heartbeat
ucast eth0 192.168.1.91 # Use unicast to send the heartbeat
keepalive 1000ms # Send a heartbeat every
warntime 7000ms # A node is in danger after
deadtime 30000ms # Declare a node down after
initdead 40000ms # Declare a node down on startup after
autojoin any # allow autojoining
crm on # This is special to enable heartbeat v2
watchdog /dev/watchdog
respawn hacluster /usr/lib/heartbeat/dopd
apiauth dopd gid=haclient uid=hacluster
node zenoss0101 # This is node1
node zenoss0102 # This is node2
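The timing directives in ha.cf are easy to get out of order when editing by hand. As a rough sanity check, here is a small sketch that reads them back and verifies keepalive < warntime < deadtime. That ordering is a rule of thumb assumed here, not something the original setup enforces, and the helper names ha_ms and check_timers are hypothetical. A bare number is treated as seconds and an "ms" suffix as milliseconds.

```shell
#!/bin/sh
# Print the value of one ha.cf directive in milliseconds.
# $1 = directive name, $2 = path to ha.cf
ha_ms() {
    awk -v key="$1" '
        $1 == key {
            v = $2
            if (v ~ /ms$/) { sub(/ms$/, "", v); print v + 0 }
            else           { print v * 1000 }
            exit
        }' "$2"
}

# Check that keepalive < warntime < deadtime.
# $1 = path to ha.cf (default /etc/ha.d/ha.cf)
check_timers() {
    conf=${1:-/etc/ha.d/ha.cf}
    keepalive=$(ha_ms keepalive "$conf")
    warntime=$(ha_ms warntime "$conf")
    deadtime=$(ha_ms deadtime "$conf")
    if [ -z "$keepalive" ] || [ -z "$warntime" ] || [ -z "$deadtime" ]; then
        echo "missing timer directive in $conf" >&2
        return 2
    fi
    if [ "$keepalive" -lt "$warntime" ] && [ "$warntime" -lt "$deadtime" ]; then
        echo "timers ok: keepalive=${keepalive}ms warntime=${warntime}ms deadtime=${deadtime}ms"
    else
        echo "timers look inconsistent" >&2
        return 1
    fi
}
```

Running `check_timers` without arguments checks /etc/ha.d/ha.cf; pass another path to check a copy before deploying it to both nodes.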
Each ha.cf directive is explained at http://www.linux-ha.org/ha.cf
The authkeys configuration file contains the information Heartbeat uses to authenticate cluster members. It must not be readable or writable by anyone other than root, and it needs to be identical on each node of the cluster. Read here for more info.
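The permission requirement on authkeys can be verified with a short script. This is a sketch under assumptions: the function name authkeys_secure is invented for illustration, and "not readable or writable by anyone other than root" is interpreted as mode 600 with root as owner. The owner is a parameter so the check can be tried on a scratch file.

```shell
#!/bin/sh
# Succeed only if the file has mode rw------- and the expected owner.
# $1 = file (default /etc/ha.d/authkeys), $2 = expected owner (default root)
authkeys_secure() {
    file=${1:-/etc/ha.d/authkeys}
    owner=${2:-root}
    # ls -ld prints the permission string first and the owner third
    listing=$(ls -ld "$file" 2>/dev/null) || return 2
    perms=$(printf '%s\n' "$listing" | cut -c1-10)
    actual_owner=$(printf '%s\n' "$listing" | awk '{print $3}')
    [ "$perms" = "-rw-------" ] && [ "$actual_owner" = "$owner" ]
}
```

If the check fails on a node, `sudo chown root /etc/ha.d/authkeys` followed by `sudo chmod 600 /etc/ha.d/authkeys` brings the file into the expected state.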
This is my /etc/ha.d/authkeys
auth 1
1 sha1 e75dd0d3d97ea86bc07480ae6d9406d0
Heartbeat needs to start at boot time. Execute this to start heartbeat on boot.
sudo chkconfig heartbeat on
If this does not work you can do it manually.
sudo ln -s ../init.d/heartbeat /etc/rc0.d/K25heartbeat
sudo ln -s ../init.d/heartbeat /etc/rc1.d/K25heartbeat
sudo ln -s ../init.d/heartbeat /etc/rc2.d/S25heartbeat
sudo ln -s ../init.d/heartbeat /etc/rc3.d/S25heartbeat
sudo ln -s ../init.d/heartbeat /etc/rc4.d/S25heartbeat
sudo ln -s ../init.d/heartbeat /etc/rc5.d/S25heartbeat
sudo ln -s ../init.d/heartbeat /etc/rc6.d/K25heartbeat
Monitor this cluster
We have Zenoss running, so this is the ideal situation: it makes monitoring the cluster really easy.
What should we monitor:

