HA.CF(5) | Configuration Files | HA.CF(5) |
ha.cf - Configuration file for the Heartbeat cluster messaging layer
/etc/ha.d/ha.cf is read by heartbeat(8) upon node start-up. It lists the communication facilities enabled between nodes, enables or disables certain features, and optionally lists the cluster nodes by host name.
This file can safely be made world readable, but should be writable only by root.
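The permission rule above can be checked with a few lines of Python. This is only a sketch: the helper names are hypothetical, and treating "writable only by root" as "owned by root with no group or other write bits" is an assumption, not something heartbeat itself enforces this way.

```python
import os
import stat

def permission_problems(st_mode, st_uid):
    """Return a list of problems for a file with the given mode and owner uid."""
    problems = []
    if st_uid != 0:
        # Assumption: "writable only by root" implies root ownership.
        problems.append("not owned by root")
    if st_mode & (stat.S_IWGRP | stat.S_IWOTH):
        problems.append("writable by group or others")
    return problems

def check_ha_cf(path="/etc/ha.d/ha.cf"):
    """Stat the file and report permission problems (empty list means fine)."""
    st = os.stat(path)
    return permission_problems(st.st_mode, st.st_uid)
```

A mode such as 0644 with owner root passes; 0666 or a non-root owner is reported.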
Some directives in ha.cf are global in nature. The order of these global options is important in configuring the ha.cf file, since each directive is interpreted as it is encountered in ha.cf.
These directives are use_logd and udpport. It is recommended that these be placed first in the ha.cf file when they are entered.
Other directives in this category are baud, logfacility, logfile, and debugfile, but those directives are deprecated and should no longer be used.
The following directives are supported in ha.cf (listed here in alphabetical order):
apiauth
apiauth apigroupname [uid=uid1,uid2 ...] [gid=gid1,gid2 ...]
You can specify a uid list, a gid list, or both, but you must specify at least one of them. If you include both a uid list and a gid list, then a process is authorized to connect to that API group if it is in either the uid list or the gid list.
The API group name default has special meaning. If it is specified, it will be used for authorizing clients without any API group name, and all client groups not identified by any other apiauth directive.
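The authorization rule described above (uid list OR gid list, with default as the fallback group) can be modelled as follows. This is an illustrative sketch, not heartbeat's implementation: the function names are hypothetical, and denying access when neither the group nor default is configured is an assumption (heartbeat also applies built-in defaults for some services).

```python
def parse_apiauth(line):
    """Parse 'apiauth group [uid=a,b] [gid=c,d]' into (group, uids, gids)."""
    fields = line.split()
    if len(fields) < 2 or fields[0] != "apiauth":
        raise ValueError("not an apiauth directive")
    group, uids, gids = fields[1], set(), set()
    for field in fields[2:]:
        key, _, value = field.partition("=")
        names = {name for name in value.split(",") if name}
        if key == "uid":
            uids |= names
        elif key == "gid":
            gids |= names
        else:
            raise ValueError(f"unexpected field: {field}")
    if not uids and not gids:
        raise ValueError("apiauth needs a uid list or a gid list")
    return group, uids, gids

def is_authorized(rules, group, user, user_groups):
    """rules maps an API group name to (uids, gids)."""
    # Fall back to the special 'default' group when no rule names this group.
    entry = rules.get(group) or rules.get("default")
    if entry is None:
        return False  # assumption: no matching rule means no access
    uids, gids = entry
    # Either list is sufficient: membership in the uid list OR the gid list.
    return user in uids or bool(gids & set(user_groups))
```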
Unless you specify otherwise in the ha.cf file, certain services will be provided default authorizations as follows:
Table 1. Default service authorizations
Service | Default apiauth |
ipfail | uid=hacluster |
ccm | gid=haclient |
ping | gid=haclient |
cl_status | gid=haclient |
lha-snmpagent | uid=root |
crmd | uid=hacluster |
autojoin
The values you can give for the autojoin directive have the following meanings:
Note that the set of nodes currently considered part of the cluster is kept in the hostcache file. With autojoin enabled, the node directive is no longer authoritative - the hostcache file is.
bcast
bcast eth0 eth1 # on Linux systems
bcast le0 # for Solaris systems
compression
The value can be either zlib or bz2, depending on which of the corresponding libraries is installed on the system. You can check /usr/lib/heartbeat/plugins/compress to see which compression modules are available. Requires cluster-glue >= 1.0.10.
If this directive is not set, there will be no compression.
compression_threshold
conn_logd_time
conn_logd_time 60 # 60 seconds
The default is 60 seconds.
coredumps
The allowed values are true and false.
crm
pacemaker
When set to respawn, the directive automatically implies, with proper search paths, something similar to:
apiauth cib uid=hacluster
apiauth crmd uid=hacluster
apiauth stonithd uid=root
apiauth stonithd-ng uid=root
apiauth attrd uid=hacluster
apiauth pingd uid=root
respawn hacluster ccm
respawn hacluster cib
respawn root stonithd
respawn root lrmd
respawn hacluster attrd
respawn hacluster pengine
respawn hacluster crmd
crm_daemon_dir
crmd_spawns_pengine
deadtime
debug
The debug level of the system can also be specified on the command line using the -d option. Additionally, the debug level of the system can be dynamically changed by sending the heartbeat process SIGUSR1 and SIGUSR2 signals. SIGUSR1 raises the debug level, and SIGUSR2 lowers it.
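As a sketch of the signal-based adjustment described above, the following Python helper sends SIGUSR1 or SIGUSR2 to a given process. The function name is hypothetical, and how you obtain the heartbeat process ID is left to the caller; the kill parameter exists only so the helper can be exercised without signalling a real process.

```python
import os
import signal

def adjust_debug_level(pid, raise_level=True, kill=os.kill):
    """Send SIGUSR1 (raise debug level) or SIGUSR2 (lower it) to pid."""
    sig = signal.SIGUSR1 if raise_level else signal.SIGUSR2
    kill(pid, sig)
    return sig
```

Calling it twice with raise_level=True would raise the level by two steps, since each signal changes the level by one.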
hbgenmethod time|file
initdead
initdead 30
In some switched network environments, switches engage in a spanning tree algorithm whenever a NIC connects to a port. This can take a long time to complete, and it is only necessary if the NIC being connected is another switch. If this is the case, you may be able to configure certain NICs as not being switches and shrink the connection delay significantly. If not, you'll need to raise initdead to make this problem go away.
If this is set too low, you'll see one node declare the other as dead.
keepalive
logfacility
The possible values for logfacility vary by operating system, but some of the most common ones are {auth, authpriv, daemon, syslog, user, local0, local1, local2, local3, local4, local5, local6, local7}.
A sample logfacility directive is shown below:
logfacility local7
If you want to disable logging to syslog:
logfacility none
mcast
mcast dev mcast-group udp-port ttl 0
A sample mcast directive is shown below:
mcast eth0 239.0.0.1 694 1 0
mcast6
mcast6 [device] [mcast6 group] [port] [mcast6 hops] [mcast6 loop]
For example, using link-local scope with some "transient" group:
mcast6 eth0 ff12::1:2:3:4 694 1 0
For most heartbeat uses, addresses should be taken from:
ff12::/16
Plausibility checking code during config file parsing will reject some, but will probably not be able to catch all unsuitable addresses. Please understand the IPv6 multicast addressing scheme first.
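Before putting an address into a mcast6 directive, it can be sanity-checked with Python's ipaddress module. This sketch only tests whether the address is multicast and whether it falls inside ff12::/16; it does not reproduce heartbeat's own plausibility checks, and the function name and return strings are made up for illustration.

```python
import ipaddress

# The range recommended above for most heartbeat uses.
PREFERRED = ipaddress.ip_network("ff12::/16")

def check_mcast6_group(text):
    """Classify a candidate mcast6 group address."""
    addr = ipaddress.IPv6Address(text)
    if not addr.is_multicast:
        return "not a multicast address"
    if addr not in PREFERRED:
        return "multicast, but outside ff12::/16"
    return "ok"
```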
msgfmt classic|netstring
When in doubt, leave the default (classic).
node
node nodename1 nodename2 ...
Node names in the directive must match the "uname -n" of that machine.
You can declare multiple node names in one directive. You can also use the directive multiple times. Normally every node in the cluster must be listed in the ha.cf file, including the current node, unless the autojoin directive is enabled.
The node directive is not completely authoritative with regard to the nodes heartbeat will communicate with. If a node has ever been added in the past, it will tend to remain in the hostcache file until it is manually removed.
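A quick way to verify that the local machine is listed is to compare the output of "uname -n" with the names given in node directives. The sketch below does this in Python; the helper names are hypothetical, the comment handling is simplified, and the exact string comparison is an assumption about how names are matched.

```python
import os

def declared_nodes(config_text):
    """Collect all names from 'node' directives, ignoring '#' comments."""
    nodes = []
    for raw in config_text.splitlines():
        fields = raw.split("#", 1)[0].split()
        if fields and fields[0] == "node":
            nodes.extend(fields[1:])
    return nodes

def local_node_listed(config_text, local_name=None):
    """True if the local node name (uname -n by default) is declared."""
    if local_name is None:
        local_name = os.uname().nodename  # same value as `uname -n`
    return local_name in declared_nodes(config_text)
```

With autojoin enabled, a missing local node is not necessarily an error, so a False result should be read with that in mind.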
realtime on|off
The default is on.
rtprio
A sample rtprio directive is shown below:
rtprio 5
ucast
The general syntax of a ucast directive is:
ucast dev peer-ip-address
Where dev is the device to use when talking to the peer, and peer-ip-address is the IP address we will send packets to.
A sample ucast directive is shown below:
ucast eth0 10.10.10.133
This directive will cause us to send packets to 10.10.10.133 over interface eth0.
Note that ucast directives which go to the local machine are effectively ignored. This allows the ha.cf directives on all machines to be identical.
ucast6
The general syntax of a ucast6 directive is:
ucast6 dev peer-ipv6-address
Where dev is the device to use when talking to the peer, and peer-ipv6-address is the IPv6 address we will send packets to.
A sample ucast6 directive is shown below:
ucast6 eth1 fe80::5054:ff:fe29:3949
This directive will cause us to send packets to the specified (in this example: link-local) address via the specified interface.
As the sockets will be bound to the specified interface, you have to ensure the specified address is in fact reachable via that interface.
For link-local addresses, you may explicitly specify the "scope-id" (in the example you would add %eth1 to the address). If you do, it has to match the device. If you leave it off, the specified device name is implied.
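The scope-id rule can be illustrated with a small Python helper that returns the address as it would effectively be used. The function name is hypothetical, and dropping a matching scope-id from a non-link-local address is a simplification made for this sketch.

```python
import ipaddress

def effective_peer(dev, address):
    """Apply the scope-id rule for a ucast6 peer address on device dev."""
    host, sep, scope = address.partition("%")
    if sep and scope != dev:
        # An explicit scope-id has to match the device.
        raise ValueError(f"scope-id {scope!r} does not match device {dev!r}")
    if ipaddress.IPv6Address(host).is_link_local:
        # If the scope-id is left off, the device name is implied.
        return f"{host}%{dev}"
    return host
```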
Note that ucast6 directives which go to the local machine are effectively ignored. This allows the ha.cf directives on all machines to be identical.
udpport
The default value for this parameter is the port ha-cluster in /etc/services (if present), or 694 if port ha-cluster is not in /etc/services. 694 is the IANA registered port number for Heartbeat (a.k.a. ha-cluster).
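The default-port lookup described above can be approximated in Python with the standard services database. This is a sketch of the rule, not heartbeat's code; the function name is hypothetical, and looking up the "udp" protocol entry is an assumption based on the directive's name.

```python
import socket

def default_udpport(lookup=socket.getservbyname):
    """Return the ha-cluster port from the services database, else 694."""
    try:
        return lookup("ha-cluster", "udp")
    except OSError:
        # Service not listed: fall back to the IANA registered port.
        return 694
```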
A sample udpport directive is shown below.
udpport 694
You have to configure udpport (in ha.cf) before you configure ucast or bcast; otherwise heartbeat will use the default port (694).
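Because of this ordering requirement, it can help to scan a configuration for ucast or bcast lines that precede the first udpport directive. The following Python sketch does that; the function name is hypothetical and the comment handling is simplified.

```python
def media_before_udpport(config_text):
    """Return line numbers of ucast/bcast directives before the first udpport."""
    early = []
    for lineno, raw in enumerate(config_text.splitlines(), start=1):
        fields = raw.split("#", 1)[0].split()
        if not fields:
            continue
        if fields[0] == "udpport":
            return early
        if fields[0] in ("ucast", "bcast"):
            early.append(lineno)
    # No udpport directive at all: the default applies everywhere.
    return []
```

A non-empty result points at directives that would use the default port instead of the configured one.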
use_logd on|off
If the logging daemon is used, all log messages will be sent through IPC to the logging daemon, which then writes them into log files. In case the logging daemon dies (for whatever reason), a warning message will be logged and all messages will be written to log files directly.
If the logging daemon is used, the logfile, debugfile, and logfacility directives in this file are no longer meaningful. You should check the config file for the logging daemon instead (the default is /etc/logd.cf).
If use_logd is not used, all log messages will be written to log files directly.
The logging daemon is started and stopped by the heartbeat script.
Setting use_logd to "on" is recommended.
uuidfrom
For certain kinds of installations (those booting from CDs or other read-only media), it is impossible for heartbeat to save a generated UUID to disk as it normally does. In these cases, one can use the uuidfrom directive to instruct heartbeat to use the nodename as though it were a UUID, by specifying uuidfrom nodename.
All possible legal uuidfrom directives are shown below.
uuidfrom file
uuidfrom nodename
warntime
The warntime value is specified according to the HeartbeatTimeSyntax. A sample warntime specification is shown below.
warntime 10 # 10 seconds
The warntime directive is important for tuning deadtime.
The following directives are interpreted by the configuration file parser for historical reasons, but should be considered deprecated and should no longer be used.
auto_failback
This option has been replaced by the configurable failback policies in Pacemaker, and should no longer be used.
baud
This option is obsolete as serial links should not be used in Pacemaker clusters.
deadping
This feature has been replaced by the more flexible pingd resource agent in Pacemaker, and should no longer be used.
debugfile
This directive is ignored when use_logd is specified. Enabling use_logd is the recommended approach.
hbaping
This directive was never fully supported in Heartbeat (requiring manual modifications to the code base) and should not be used.
hopfudge
This option applies to serial links only, which are deprecated.
logfile
This directive is ignored when use_logd is specified. Enabling use_logd is the recommended approach.
ping
ping ip-address ...
Each IP address listed in a ping directive is considered to be independent. That is, connectivity to each node is considered to be equally important.
In order to declare that a group of nodes are equally qualified for a particular function, and that the presence of any of them indicates successful communication, use the ping_group directive.
This feature has been replaced by the more flexible pingd resource agent in Pacemaker, and should no longer be used.
ping_group
ping_group group-name ip-address ...
Each IP address listed in a ping_group directive is considered to be related, and connectivity to any one node is considered to be connectivity to the group.
A ping group is considered by Heartbeat to be a single cluster node (group-name). The ability to communicate with any of the group members means that the group-name member is reachable. This is useful when (for example) two different routers may be used to contact the internet, depending on which is up, or when finding an appropriate reliable single ping node is difficult.
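When reading older configurations, the difference between ping and ping_group can be summarized in a few lines of Python. This is a conceptual sketch only: the function name and the is_up callback are hypothetical and stand in for whatever reachability test is in use.

```python
def reachability(ping_targets, ping_groups, is_up):
    """Map each ping node and each ping group name to a reachable flag."""
    # Each ping target is its own independent pseudo-node.
    status = {ip: bool(is_up(ip)) for ip in ping_targets}
    # A ping group is reachable if any one of its members is reachable.
    for name, members in ping_groups.items():
        status[name] = any(is_up(ip) for ip in members)
    return status
```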
This feature has been replaced by the more flexible pingd resource agent in Pacemaker, and should no longer be used.
respawn
This functionality was primarily designed for the legacy ipfail program, which has been replaced by the more flexible pingd resource agent in Pacemaker. Thus, this directive should no longer be used, except when it is implicitly generated by pacemaker yes.
serial
A few sample serial directives are shown below:
serial /dev/ttyS0 /dev/ttyS1 # Linux
serial /dev/cuaa0 # FreeBSD
serial /dev/cua/a # Solaris
If a baud directive is specified before the serial directive, it sets the baud rate for the port(s); otherwise the default baud rate is used.
Using this option is strongly discouraged in Pacemaker clusters, as its CIB updates can easily hit practical message size limits for serial links, with undefined results.
stonith
This functionality has been replaced by STONITH agents in Pacemaker.
stonith_host
This functionality has been replaced by STONITH agents in Pacemaker.
traditional_compression on|off
watchdog
It is the purpose of a watchdog device to shut the machine down if Heartbeat does not hear its own heartbeats as often as it thinks it should. This keeps things like scheduler bugs from becoming split-brain configurations.
The general syntax of a watchdog directive is:
watchdog watchdog-device-name
A sample watchdog directive is shown below:
watchdog /dev/watchdog
The most common watchdog device currently used with general Linux systems is the softdog device. The softdog device is a software-based watchdog device and is usually referred to as /dev/watchdog - although like most UNIX devices, this is a convention not a rule.
This functionality has been replaced by cluster self-monitoring and STONITH resource agents in Pacemaker. This directive should no longer be used.
The following directives must always be present in ha.cf:
Below is an example ha.cf for a 2-node Pacemaker cluster with redundant network communication paths:
# understood time specifications:
# 2 | 2 seconds
# 1.7 | 1.7 seconds
# 2000ms | 2 seconds
# 1700ms | 1.7 seconds
keepalive 1
warntime 6
deadtime 10
initdead 120
# your choice.
# debug 1 # or more
# Or use kill -USR1 to increase at runtime, USR2 to decrease.
# LOGGING: your choice.
#
# Note: if you want pacemaker to NOT write any log file at all,
# you may need to explicitly use: logfile /dev/null
# (and use a recent cluster glue!)
#
# logfile /var/log/ha.log
# debugfile /var/log/ha-debug.log
# logfacility local7
# or use logd, which is configured in /etc/logd.cf
# use_logd on
mcast eth0 239.192.0.42 694 1 0
ucast eth1 10.0.0.7
ucast eth1 10.0.0.8
node alice
node bob
compression bz2
compression_threshold 20
traditional_compression on
# if you want to pass certain environments to child processes (pacemaker),
# you could do:
# env logpriority=debug
# Where to look for pacemaker daemons first.
# Use if you use a different pacemaker path:
# crm_daemon_dir /usr/libexec/pacemaker
# If pacemaker crmd spawns the pengine itself,
# it sometimes "forgets" to kill the pengine on shutdown,
# which later may confuse the system after cluster restart.
# Tell the system that Heartbeat is supposed to control the pengine directly.
crmd_spawns_pengine off
pacemaker respawn
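The time specifications listed in the comments of the example (plain numbers meaning seconds, and an "ms" suffix meaning milliseconds) can be converted with a short Python helper. This sketch covers only those two forms; the function name is hypothetical, and any other suffix heartbeat might accept is not handled here.

```python
def parse_time_seconds(spec):
    """Convert '2', '1.7', '2000ms' or '1700ms' style values to seconds."""
    spec = spec.strip()
    if spec.endswith("ms"):
        # Millisecond form, e.g. 2000ms is 2 seconds.
        return float(spec[:-2]) / 1000.0
    # Bare number: already in seconds.
    return float(spec)
```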
Alan Robertson <alanr@unix.sh>
Lars Ellenberg <lars.ellenberg@linbit.com>
Florian Haas <florian.haas@linbit.com>
2 Dec 2014 | Heartbeat 3.0.6 |