HOSTS.CFG(5) | File Formats Manual | HOSTS.CFG(5) |
hosts.cfg - Main Xymon configuration file
The hosts.cfg(5) file is the most important configuration file for all of the Xymon programs. This file contains the full list of all the systems monitored by Xymon, including the set of tests and other configuration items stored for each host.
Each line of the file defines a host. Blank lines and lines starting with a hash mark (#) are treated as comments and ignored. Long lines can be broken up by putting a backslash at the end of the line and continuing the entry on the next line.
The format of an entry in the hosts.cfg file is as follows:
IP-address hostname # tag1 tag2 ...
The IP-address and hostname are mandatory; all of the tags are optional. Listing a host with only IP-address and hostname will cause a single network test to be executed for the host: the connectivity test, which is enabled by default. No other tests are run.
The optional tags are then used to define which tests are relevant for the host, and also to set e.g. the time-interval used for availability reporting by xymongen(1).
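As a rough illustration of the entry format described above, the following Python sketch splits one host entry into its IP-address, hostname and tags. The function name parse_host_line is hypothetical and not part of Xymon; the sketch does not handle directive lines (page, include etc.), backslash continuation, or tags containing quoted whitespace.

```python
def parse_host_line(line):
    """Split one hosts.cfg host entry into (ip, hostname, tags).

    Returns None for blank lines and for comment lines.
    Illustrative only: this is not Xymon's own parser.
    """
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    # Text before the first '#' holds the IP-address and hostname;
    # text after it is the whitespace-separated list of tags.
    head, _, tail = stripped.partition("#")
    fields = head.split()
    if len(fields) < 2:
        raise ValueError("entry needs both an IP-address and a hostname")
    return fields[0], fields[1], tail.split()
```

A host listed without any tags simply yields an empty tag list, which corresponds to the "connectivity test only" case.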
An example of setting up the hosts.cfg file is in the Xymon on-line documentation (from the Help menu, choose "Configuring Monitoring"). The following describes the possible settings in a hosts.cfg file supported by Xymon.
The "filename" argument should point to a file that uses the same syntax as hosts.cfg. The filename can be an absolute filename (if it begins with a '/'), or a relative filename - relative file names are prefixed with the directory where the main hosts.cfg file is located (usually $XYMONHOME/etc/).
You can nest include tags, i.e. a file that is included from the main hosts.cfg file can itself include other files.
Note that "noclear" also affects the behaviour of network tests; see below.
By using the "prefer" tag you tell xymongen that this host definition should be used.
Note: This only applies to hosts that are defined multiple times in the hosts.cfg file, although it will not hurt to add it on other hosts as well.
These tags are processed by the xymongen(1) tool when generating the Xymon webpages or reports.
E.g. with this in hosts.cfg:
page USA United States
subpage NY New York
subparent NY manhattan Manhattan data centers
subparent manhattan wallstreet Wall Street center
you get this hierarchy of pages:
USA (United States)
NY (New York)
manhattan (Manhattan data centers)
wallstreet (Wall Street center)
Note: The parent page must be defined before you define the subparent. If not, the page will not be generated, and you get a message in the log file.
Note: xymongen is case-sensitive when trying to match the name of the parent page.
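The hierarchy rules above can be sketched in Python as follows. This is only an illustration: build_page_tree is a hypothetical name, and the sketch assumes that a "subpage" is attached to the most recently defined "page". A "subparent" whose parent has not been defined yet is skipped, and parent names are compared case-sensitively.

```python
def build_page_tree(lines):
    """Map each page name to (parent name or None, title).

    Illustrative only: this is not xymongen's own implementation.
    """
    pages = {}
    top_page = None  # most recent top-level "page"
    for line in lines:
        words = line.split()
        if not words:
            continue
        keyword = words[0]
        if keyword == "page":
            name, title = words[1], " ".join(words[2:])
            pages[name] = (None, title)
            top_page = name
        elif keyword == "subpage":
            name, title = words[1], " ".join(words[2:])
            pages[name] = (top_page, title)
        elif keyword == "subparent":
            parent, name = words[1], words[2]
            title = " ".join(words[3:])
            if parent not in pages:
                # Parent must be defined first; the lookup is case-sensitive.
                continue
            pages[name] = (parent, title)
    return pages
```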
The inspiration for this came from Craig Cook's mkbb.pl script, and I am grateful to Craig for suggesting that I implement it in xymongen. The idea to explicitly list the parent page in the "subparent" tag was what made it easy to implement.
The title tag operates on the next item in the hosts.cfg file following the title tag.
If a title tag precedes a host entry, the title is shown just before the host is listed on the status page. The column headings present for the host will be repeated just after the heading.
If a title tag precedes a group entry, the title is shown just before the group on the status page.
If a title tag precedes a page/subpage/subparent entry, the title text replaces the normal "Pages hosted locally" heading normally inserted by Xymon. This appears on the page that links to the sub-pages, not on the sub-page itself. To get a custom heading on the sub-page, you may want to use the "--pagetext-heading" option when running xymongen(1).
NOTE: The "NK" set of tags is deprecated. They will be supported for Xymon 4.x, but will be dropped in version 5. It is recommended that you move your critical systems view to the criticalview.cgi(1) viewer, which has a separate configuration tool, criticaleditor.cgi(1) with more facilities than the NK tags in hosts.cfg.
xymongen will create three sets of pages: The main page xymon.html, the all-non-green-statuses page (nongreen.html), and a specially reduced version of nongreen.html with only selected tests (critical.html). The critical.html page includes the selected tests that currently have a red or yellow status.
Define the tests that you want included on the critical page. E.g. if you have a host where you only want to see the http tests on critical.html, you specify it as
12.34.56.78 www.acme.com # http://www.acme.com/ NK:http
If you want multiple tests for a host to show up on the critical.html page, specify all the tests separated by commas. The test names correspond to the column names (e.g. https tests are covered by an "NK:http" tag).
By default, tests with a red or yellow status that are listed in the "NK:testname" tag will appear on the NK page. However, you may not want the test to be shown outside of normal working hours - if, for example, the host is not being serviced during week-ends.
You can then use the NKTIME tag to define the time periods where the alert will show up on the NK page.
The time specification consists of
day-of-week: W means Mon-Fri ("weekdays"), * means all days, 0 .. 6 = Sunday .. Saturday. Listing multiple days is possible, e.g. "60" is valid meaning "Saturday and Sunday".
starttime: Time to start showing errors, must be in 24-hour clock format as HHMM hours/minutes. E.g. for 8 am enter "0800", for 9.30 pm enter "2130".
endtime: Time to stop showing errors.
If necessary, multiple periods can be specified. E.g. to monitor a site 24x7, except between noon and 1 pm, use NKTIME=*:0000:1159,*:1300:2359
The interval between start time and end time may cross midnight, e.g. *:2330:0200 would be valid and have the same effect as *:2330:2400,*:0000:0200.
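To make the time specification more concrete, here is a minimal Python sketch that checks whether a given moment falls inside a specification such as "*:0000:1159,*:1300:2359". The function names are hypothetical. The sketch assumes that the end time is inclusive at minute granularity (so that "1159" followed by "1300" excludes exactly noon to 1 pm), and that a period crossing midnight is tested against the current day for both halves, as the equivalence above suggests.

```python
from datetime import datetime


def expand_days(days):
    """Turn the day-of-week field into a set of day numbers (0 = Sunday)."""
    if days == "*":
        return set(range(7))
    result = set()
    for ch in days:
        if ch == "W":
            result.update({1, 2, 3, 4, 5})  # Monday through Friday
        else:
            result.add(int(ch))
    return result


def period_matches(period, when):
    """Check one 'days:HHMM:HHMM' period against a datetime."""
    days, start, end = period.split(":")
    # datetime.weekday() counts Monday as 0; the spec counts Sunday as 0.
    dow = (when.weekday() + 1) % 7
    if dow not in expand_days(days):
        return False
    now = when.hour * 100 + when.minute
    start, end = int(start), int(end)
    if start <= end:
        return start <= now <= end
    # Start later than end: the period crosses midnight.
    return now >= start or now <= end


def timespec_matches(spec, when):
    """True if any of the comma-separated periods matches."""
    return any(period_matches(p, when) for p in spec.split(","))
```

For example, timespec_matches("W:0800:1700", datetime(2017, 1, 17, 9, 0)) checks a Tuesday morning against a weekdays-only period.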
If xymongen is run with the "--wml" option, it will generate a set of WAP-format output "cards" that can be viewed with a WAP-capable device, e.g. a PDA or cell-phone.
The default set of WML tests are taken from the --wml command line option. If no "WML:" tag is specified, the "NK:" tag is used if present.
These tags affect how a status propagates upwards from a single test to the page and higher. This can also be done with the command-line options --nopropyellow and --nopropred, but the tags apply to individual hosts, whereas the command line options are global.
If a host-specific tag begins with a '-' or a '+', the host-specific tags are removed/added to the default setting from the command-line option. If the host-specific tag does not begin with a '+' or a '-', the default setting is ignored for this host and the NOPROPRED applies to the tests given with this tag.
E.g.: xymongen runs with "--nopropred=ftp,smtp". "NOPROPRED:+dns,-smtp" gives a NOPROPRED setting of "ftp,dns" (dns is added to the default, smtp is removed). "NOPROPRED:dns" gives a setting of "dns" only (the default is ignored).
Note: If you use the "--nopropred=*" command line option to disable propagation of all alerts, you cannot use the "+" and "-" methods to add or remove from the wildcard setting. In that case, do not use the "+" or "-" setting, but simply list the required tests that you want to keep from propagating.
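The way a host-specific setting combines with the command-line default can be sketched in Python like this. The function name effective_noprop is hypothetical; the sketch does not model the "*" wildcard, and it assumes that an entry without a sign inside a "+"/"-" style tag is treated as an addition.

```python
def effective_noprop(default, host_tag):
    """Combine the command-line default with a host-specific tag value.

    Both arguments are comma-separated test lists, e.g. "ftp,smtp"
    and "+dns,-smtp". Illustrative only; not xymongen's own code.
    """
    items = [t for t in host_tag.split(",") if t]
    if not host_tag.startswith(("+", "-")):
        # Absolute setting: the default is ignored for this host.
        return set(items)
    result = {t for t in default.split(",") if t}
    for item in items:
        if item.startswith("-"):
            result.discard(item[1:])
        elif item.startswith("+"):
            result.add(item[1:])
        else:
            result.add(item)  # assumption: unsigned entries are added
    return result
```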
These options affect the way the Xymon availability reports are processed (see report.cgi(1) for details about availability reports).
When xymongen generates a report, it computes the availability of each service - i.e. the percentage of time that the service is reported as available (meaning: not red).
By default, this calculation is done on a 24x7 basis, so no matter when an outage occurs, it counts as downtime.
The REPORTTIME tag allows you to specify a period of time
other than 24x7 for the service availability calculation. If you have
systems where you only guarantee availability from e.g. 7 AM to 8 PM on
weekdays, you can use
REPORTTIME=W:0700:2000
and the availability calculation will only be performed for the service
with measurements from this time interval.
The syntax for REPORTTIME is the same as the one used by the NKTIME parameter.
When REPORTTIME is specified, the availability calculation happens like this:
* Only measurements done during the given time period are used
for the calculation.
* "blue" time reduces the length of the report interval, so if
you are generating a report for a 10-hour period and there are 20
minutes of "blue" time, then the availability calculation will
consider the reporting period to be 580 minutes (10 hours minus 20
minutes). This allows you to have scheduled downtime during the
REPORTTIME interval without hurting your availability; this is (I
believe) the whole idea of the downtime being "planned".
* "red" and "clear" status counts as downtime;
"yellow" and "green" count as uptime.
"purple" time is ignored.
The availability calculation correctly handles status changes that cross into/out of a REPORTTIME interval.
If no REPORTTIME is given, the standard 24x7 calculation is used.
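The calculation rules listed above can be illustrated with a small Python sketch. The function name availability_pct is hypothetical, and the sketch assumes that "ignored" means purple time is left out of the calculation altogether; the real implementation in xymongen may differ in detail.

```python
def availability_pct(minutes_by_color):
    """Availability percentage from minutes spent in each color.

    Only minutes that fall inside the REPORTTIME interval should be
    passed in. Illustrative only; not xymongen's own code.
    """
    down = minutes_by_color.get("red", 0) + minutes_by_color.get("clear", 0)
    up = minutes_by_color.get("yellow", 0) + minutes_by_color.get("green", 0)
    # "blue" time shortens the reporting period, and "purple" time is
    # assumed to be left out, so only up and down minutes are counted.
    counted = up + down
    if counted == 0:
        return None
    return 100.0 * up / counted
```

With a 10-hour period containing 20 minutes of blue, 29 minutes of red and 551 minutes of green, the reporting period is 580 minutes and the sketch computes 551/580 of that period as available.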
This option allows you to set the threshold value on a host-by-host basis, instead of using a global setting for all hosts. The threshold is defined as the percentage of the time that the host must be available, e.g. "WARNPCT:98.5" if you want the threshold to be at 98.5%.
Note: The "--test-untagged" option modifies this behaviour, see xymonnet(1)
This behaviour can also be implemented on a per-test basis by putting the "~" flag on any network test.
Note that "noclear" also affects whether stale status messages from e.g. a client on the host go purple or clear when the host is down; see the "noclear" description in the "GENERAL PER-HOST OPTIONS" section above.
Some SSL implementations cannot handle SSL handshakes with SNI data, so Xymon by default does not use SNI. This default can be changed with the "--sni" option for xymonnet(1) but can also be managed per host with these tags.
SNI support was added in Xymon 4.3.13, where the default was to use SNI. This was changed in 4.3.14 so SNI support is disabled by default, and the "sni" and "nosni" tags were introduced together with the "--sni" option for xymonnet.
What happens is that if a test fails during the specified time, it is reported with status BLUE instead of red, yellow, or purple. Thus you can still see when the service was unavailable, but alarms will not be triggered and the downtime is not counted in the availability calculations generated by the Xymon reports.
The "columns" and "cause" settings are optional, but both or neither must be specified. "columns" may be a comma-separated list of status columns to which DOWNTIME will apply. The "cause" string will be displayed on the status web page to explain why the system is down.
The syntax for DOWNTIME is the same as the one used by the NKTIME parameter.
This tag works the opposite of the DOWNTIME tag - you use it to specify the periods of the day that the service should be green. Failures OUTSIDE the SLA interval are reported as blue.
depends=(testA:host1/test1,host2/test2)
When deciding the color to report for testA: if testA has failed and either host1/test1 or host2/test2 has also failed, then the color of testA will be "clear" instead of red or yellow.
Since all tests are actually run before the dependencies are evaluated, you can use any host/test in the dependency - regardless of the actual sequence that the hosts are listed, or the tests run. It is also valid to use tests from the same host that the dependency is for. E.g.
1.2.3.4 foo # http://foo/ webmin depends=(webmin:foo/http)
is valid; if both the http and the webmin tests fail, then webmin will be reported as clear.
Note: The "depends" tag is evaluated by xymonnet while running the network tests. It can therefore only refer to other network tests that are handled by the same server - there is currently no way to use e.g. the status of locally run tests (disk, cpu, msgs) or network tests from other servers in a dependency definition. Such dependencies are silently ignored.
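The dependency rule can be sketched in Python as follows. The names parse_depends and apply_depends are hypothetical, and the "failed" set stands in for whatever record of failed network tests the real program keeps; this is an illustration of the rule, not xymonnet's implementation.

```python
import re


def parse_depends(tag):
    """Parse 'depends=(testA:host1/test1,host2/test2)'.

    Returns a dict mapping the dependent test name to a list of
    (host, test) pairs it depends on.
    """
    deps = {}
    for test, targets in re.findall(r"\(([^:()]+):([^()]*)\)", tag):
        pairs = []
        for target in targets.split(","):
            host, _, name = target.partition("/")
            pairs.append((host, name))
        deps[test] = pairs
    return deps


def apply_depends(test, color, deps, failed):
    """Turn a red or yellow result into "clear" if a dependency failed.

    'failed' is the set of (host, test) pairs that failed in this run.
    """
    if color in ("red", "yellow"):
        if any(dep in failed for dep in deps.get(test, [])):
            return "clear"
    return color
```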
Normally when a network test fails, the status changes to red
immediately. With a "badTEST:x:y:z" tag this behaviour
changes:
* While "z" or more successive tests fail, the column goes RED.
* While "y" or more successive tests fail, but fewer than
"z", the column goes YELLOW.
* While "x" or more successive tests fail, but fewer than
"y", the column goes CLEAR.
* While fewer than "x" successive tests fail, the column stays
GREEN.
The optional time specification can be used to limit this
"badTEST" setting to a particular time of day, e.g. to require
a longer period of downtime before raising an alarm during out-of-office
hours. The time-specification uses:
* Weekdays: The weekdays this badTEST tag applies, from 0 (Sunday) through
6 (Saturday). Putting "W" here counts as "12345",
i.e. all working days. Putting "*" here counts as all days of
the week, equivalent to "0123456".
* start time and end time are specified using 24-hour clocks, e.g.
"badTEST-W-0900-2000" is valid for working days between 9 AM
(09:00) and 8 PM (20:00).
When using multiple badTEST tags, the LAST one specified with a matching time-spec is used.
Note: The "TEST" is replaced by the name of the test, e.g.
12.34.56.78 www.foo.com # http://www.foo.com/ badhttp:1:2:4
defines a http test that goes "clear" after the first failure, "yellow" after two successive failures, and "red" after four successive failures.
For LDAP tests using URL's, use the option "badldapurl". For the other network tests, use "badftp", "badssh" etc.
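The thresholds of a "badTEST:x:y:z" tag can be illustrated with this Python sketch. Both function names are hypothetical, and the sketch only covers the plain form of the tag, without the optional time specification.

```python
def parse_bad_tag(tag):
    """Split e.g. 'badhttp:1:2:4' into ('http', 1, 2, 4)."""
    name, x, y, z = tag.split(":")
    if not name.startswith("bad"):
        raise ValueError("not a badTEST tag: " + tag)
    return name[len("bad"):], int(x), int(y), int(z)


def bad_test_color(failures, x, y, z):
    """Color for a test that has failed 'failures' times in a row."""
    if failures >= z:
        return "red"
    if failures >= y:
        return "yellow"
    if failures >= x:
        return "clear"
    return "green"
```

With the "badhttp:1:2:4" example, three successive failures still give yellow; only the fourth turns the column red.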
These tags affect the behaviour of the xymonnet connectivity test.
The actual name of the tag - "conn" by default - depends on the "--ping=TESTNAME" option for xymonnet, as that decides the testname for the connectivity test.
When multiple IP's are pinged, you can choose if ALL IP's must respond (the "worst" method), or AT LEAST one IP must respond (the "best" setting). All of the IP's are reported in a single "conn" status, whose color is determined from the result of pinging the IP's and the best/worst setting. The default method is "best" - so it will report green if just one of the IP's responds to ping.
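A minimal Python sketch of the best/worst decision, assuming a hypothetical function conn_color that receives one boolean per pinged IP (True meaning the IP responded) and reports a failure as red:

```python
def conn_color(responses, method="best"):
    """Combine per-IP ping results into one status color.

    Illustrative only; not xymonnet's own code.
    """
    if method == "best":
        ok = any(responses)   # at least one IP must respond
    elif method == "worst":
        ok = all(responses)   # every IP must respond
    else:
        raise ValueError("method must be 'best' or 'worst'")
    return "green" if ok else "red"
```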
The router1,router2,... is a comma-separated list of hosts elsewhere in the hosts.cfg file. You cannot have any spaces in the list - separate hosts with commas.
This tag changes the color reported for a ping check that fails, when one or more of the hosts in the "route" list is also down. A "red" status becomes "yellow" - other colors are unchanged. The status message will include information about the hosts in the router-list that are down, to aid tracking down which router is the root cause of the problem.
Note: Internally, the ping test will still be handled as "failed", and therefore any other tests run for this host will report a status of "clear".
These tests perform a simple network test of a service by connecting to the port and possibly checking that a banner is shown by the server.
How these tests operate is configured in the protocols.cfg(5) configuration file, which controls which port to use for the service, whether to send any data to the service, whether to check for a response from the service etc.
You can modify the behaviour of these tests on a per-test basis by adding one or more modifiers to the test:
:NUMBER changes the port number from the default to the one you specify for this test. E.g. to test ssh running on port 8022, specify the test as ssh:8022.
:s makes the test silent, i.e. it does not send any data to the service. E.g. to do a silent test of an smtp server, enter smtp:s.
You can combine these two: ftp:8021:s is valid.
If you must test a service from a multi-homed host (i.e. using a specific source IP-address instead of the one your operating system provides), you can use the modifier "@IPADDRESS" at the end of the test specification, after any other modifiers or port number. "IPADDRESS" must be a valid dotted IP-address (not hostname) which is assigned to the host running the network tests.
The name of the test also determines the column name that the test result will appear with in the Xymon webpages.
By prefixing a test with "!" it becomes a reverse test: Xymon will expect the service NOT to be available, and send a green status if it does NOT respond. If a connection to the service succeeds, the status will go red.
By prefixing a test with "?" errors will be reported with a "clear" status instead of red. This is known as a test for a "dialup" service, and allows you to run tests of hosts that are not always online, without getting alarms while they are off-line.
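The modifiers and prefixes described above can be pulled apart with a short Python sketch. The function name parse_test_spec and the dictionary keys are hypothetical; the sketch assumes the "!" and "?" prefixes may appear in any order at the start of the test.

```python
def parse_test_spec(spec):
    """Split a simple network test such as 'ftp:8021:s' into its parts.

    Illustrative only; not xymonnet's own parser.
    """
    parsed = {"name": None, "port": None, "silent": False,
              "reverse": False, "dialup": False, "source_ip": None}
    # Leading "!" marks a reverse test, leading "?" a dialup test.
    while spec[:1] in ("!", "?"):
        if spec[0] == "!":
            parsed["reverse"] = True
        else:
            parsed["dialup"] = True
        spec = spec[1:]
    # "@IPADDRESS" comes last, after any other modifiers.
    if "@" in spec:
        spec, _, parsed["source_ip"] = spec.rpartition("@")
    name, *modifiers = spec.split(":")
    parsed["name"] = name
    for mod in modifiers:
        if mod == "s":
            parsed["silent"] = True
        elif mod.isdigit():
            parsed["port"] = int(mod)
        else:
            raise ValueError("unknown modifier: " + mod)
    return parsed
```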
These tags are used to setup monitoring of DNS servers.
The second form of the test allows you to perform multiple queries of the DNS server, requesting different types of DNS records. The TYPE defines the type of DNS data: A (IP-address), MX (Mail eXchanger), PTR (reverse), CNAME (alias), SOA (Start-Of-Authority), NS (Name Server) are among the more common ones used. The "lookup" is the query. E.g. to lookup the MX records for the "foo.com" domain, you would use "dns=mx:foo.com". Or to lookup the nameservers for the "bar.org" domain, "dns=ns:bar.org". You can list multiple lookups, separated by commas. For the test to end up with a green status, all lookups must succeed.
If only "rpc" is given, the test only verifies that the port mapper is available on the remote host. If you want to check that one or more RPC services are registered with the port mapper, list the names of the desired RPC services after the equals-sign. E.g. for a working NFS server the "mount", "nlockmgr" and "nfs" services must be available; this can be checked with "rpc=mount,nlockmgr,nfs".
This test uses the rpcinfo tool for the actual test; if this tool is not available in the PATH of xymonnet, you must define the RPCINFO environment variable to point at this tool. See xymonserver.cfg(5).
Simple testing of a http URL is done simply by putting the URL into the hosts.cfg file. Note that this only applies to URL's that begin with "http:" or "https:".
The following items describe more advanced forms of http URL's.
xymonnet can be told to use specific dialects, by adding one or more "dialect names" to the URL scheme, i.e. the "http" or "https" in the URL:
* "2", e.g. https2://www.sample.com/ : use only SSLv2
* "3", e.g. https3://www.sample.com/ : use only SSLv3
* "t", e.g. httpst://www.sample.com/ : use only TLSv1.0
* "a", e.g. httpsa://www.sample.com/ : use only TLSv1.0
* "b", e.g. httpsb://www.sample.com/ : use only TLSv1.1
* "c", e.g. httpsc://www.sample.com/ : use only TLSv1.2
* "m", e.g. httpsm://www.sample.com/ : use only 128-bit ciphers
* "h", e.g. httpsh://www.sample.com/ : use only >128-bit ciphers
* "10", e.g. http10://www.sample.com/ : use HTTP 1.0
* "11", e.g. http11://www.sample.com/ : use HTTP 1.1
These can be combined where it makes sense, e.g. to force TLSv1.2 and HTTP 1.0 you would use "httpsc10".
Note that SSLv2 support is disabled in all current OpenSSL releases. TLS version-specific scheme testing requires OpenSSL 1.0.1 or higher.
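How a combined scheme such as "httpsc10" breaks down can be sketched in Python. The function name parse_scheme and the descriptive strings are hypothetical, and the sketch assumes the HTTP version ("10" or "11") comes last in the scheme, as in the example above.

```python
DIALECT_FLAGS = {
    "2": "SSLv2 only",
    "3": "SSLv3 only",
    "t": "TLSv1.0 only",
    "a": "TLSv1.0 only",
    "b": "TLSv1.1 only",
    "c": "TLSv1.2 only",
    "m": "128-bit ciphers only",
    "h": ">128-bit ciphers only",
}


def parse_scheme(scheme):
    """Split e.g. 'httpsc10' into ('https', ['TLSv1.2 only'], '1.0').

    Illustrative only; not xymonnet's own parser.
    """
    if scheme.startswith("https"):
        base, rest = "https", scheme[5:]
    elif scheme.startswith("http"):
        base, rest = "http", scheme[4:]
    else:
        raise ValueError("not an http/https scheme: " + scheme)
    http_version = None
    # Assumption: the HTTP version digits are the last part of the scheme.
    if rest.endswith("10"):
        http_version, rest = "1.0", rest[:-2]
    elif rest.endswith("11"):
        http_version, rest = "1.1", rest[:-2]
    options = []
    for ch in rest:
        if ch not in DIALECT_FLAGS:
            raise ValueError("unknown dialect name: " + ch)
        options.append(DIALECT_FLAGS[ch])
    return base, options, http_version
```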
The reason for this is that it interacts badly with virtual hosts, especially if these are IP-based as is common with https-websites.
Instead the IP-address to connect to can be overridden by specifying it as:
http://www.sample.com=1.2.3.4/index.html
The "=1.2.3.4" will cause xymonnet to run the test against the IP-address "1.2.3.4", but still trying to access a virtual website with the name "www.sample.com".
The "=ip.address.of.host" must be the last part of the hostname, so if you need to combine this with e.g. an explicit port number, it should be done as
http://www.sample.com:3128=1.2.3.4/index.html
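The layout of such a URL can be illustrated with a small Python sketch that separates the virtual host name, the optional port and the IP-address to connect to. The function name split_ip_override and the dictionary keys are hypothetical; URLs with embedded usernames or passwords are not handled.

```python
def split_ip_override(url):
    """Split 'http://host[:port][=ip]/path' into its components.

    Illustrative only; not xymonnet's own parser.
    """
    scheme, _, rest = url.partition("://")
    hostpart, slash, path = rest.partition("/")
    # The "=ip" part is the last part of the hostname section.
    host_and_port, _, ip = hostpart.partition("=")
    host, _, port = host_and_port.partition(":")
    return {
        "scheme": scheme,
        "host": host,                      # name requested from the server
        "port": int(port) if port else None,
        "connect_ip": ip or None,          # address actually connected to
        "path": slash + path,
    }
```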
xymonnet supports the Big Brother syntax for specifying an
HTTP proxy to use when performing http tests. This syntax just joins the
proxy- and the target-URL into one, e.g.
http://webproxy.sample.com:3128/http://www.foo.com/
would be the syntax for testing the www.foo.com website via the proxy
running on "webproxy.sample.com" port 3128.
If the proxy port number is not specified, the default HTTP port number (80) is used.
If your proxy requires authentication, you can specify the
username and password inside the proxy-part of the URL, e.g.
http://fred:Wilma1@webproxy.sample.com:3128/http://www.foo.com/
will authenticate to the proxy using a username of "fred" and a
password of "Wilma1", before requesting the proxy to fetch the
www.foo.com homepage.
Note that it is not possible to test https-sites via a proxy, nor is it possible to use https for connecting to the proxy itself.
If the URL itself includes a semicolon, this must be escaped as '%3B' to avoid confusion over which semicolon is part of the URL, and which semicolon acts as a delimiter.
The data that must be returned can be specified either as a regular expression (except that <space> is not allowed) or as a message digest (typically using an MD5 sum or SHA-1 hash).
The regex is pre-processed for backslash "\" escape
sequences. So you can really put any character in this string by
escaping it first:
\n Newline (LF, ASCII 10 decimal)
\r Carriage return (CR, ASCII 13 decimal)
\t TAB (ASCII 9 decimal)
\\ Backslash (ASCII 92 decimal)
\XX The character with ASCII hex-value XX
If you must have whitespace in the regex, use the [[:space:]] syntax, e.g. if you want to test for the string "All is OK", use "All[[:space:]]is[[:space:]]OK". Note that this may depend on your particular implementation of the regex functions found in your C library. Thanks to Charles Goyard for this tip.
Note: If you are migrating from the "cont2.sh" script, you must change the '_' used as wildcards by cont2.sh into '.' which is the regular-expression wildcard character.
Message digests can use whatever digest algorithms your libcrypto implementation (usually OpenSSL) supports. Common message digests are "md5", "sha1", "sha256" or "sha512". The digest is calculated on the data portion of the response from the server, i.e. HTTP headers are not included in the digest (as they change from one request to the next).
The expected digest value can be computed with the xymondigest(1) utility.
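For illustration, a digest over a response body can also be computed with Python's standard hashlib module. The function name body_digest is hypothetical, and the "algorithm:hexdigest" output layout is an assumption made for this sketch; use xymondigest(1) for the authoritative value.

```python
import hashlib


def body_digest(algorithm, body):
    """Digest of the response body only (HTTP headers excluded).

    'algorithm' is a name such as "md5", "sha1", "sha256" or "sha512";
    'body' is the raw body as bytes. The output layout is an assumption.
    """
    return algorithm + ":" + hashlib.new(algorithm, body).hexdigest()
```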
"cont" tags in hosts.cfg result in two status reports: One status with the "http" check, and another with the "content" check.
As with normal URL's, the extended syntax described above can be used e.g. when testing SSL sites that require the use of SSLv2 or strong ciphers.
The column name for the result of the content check is by default called "content" - you can change the default with the "--content=NAME" option to xymonnet. See xymonnet(1) for a description of this option.
If more than one content check is present for a host, the first content check is reported in the column "content", the second is reported in the column "content1", the third in "content2" etc.
You can also specify the column name directly in the test specification, by writing it as "cont=COLUMN;http://...". Column-names cannot include whitespace or semi-colon.
The content-check status by default includes the full URL that was requested, and the HTML data returned by the server. You can hide the HTML data on a per-host (not per-test) basis by adding the HIDEHTTP tag to the host entry.
The form-data field must be entered in "application/x-www-form-urlencoded" format, which is the most commonly used format for web forms.
E.g. if you have a web form defined like this:
<form action="/cgi-bin/form.cgi" method="post">
<p>Given name<input type="text"
name="givenname"></p>
<p>Surname<input type="text"
name="surname"></p>
<input type="submit" value="Send">
</form>
and you want to post the value "John" to the first field and "Doe Jr." to the second field, then the form data field would be
givenname=John&surname=Doe+Jr.
Note that any spaces in the input value are replaced with '+'.
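If you prefer not to build the form data by hand, the same encoding can be produced with Python's standard urllib.parse.urlencode function. This is just a convenience sketch; the helper name make_form_data is hypothetical.

```python
from urllib.parse import urlencode


def make_form_data(fields):
    """Encode a dict of form fields as application/x-www-form-urlencoded."""
    return urlencode(fields)


# The field names come from the example form above.
form_data = make_form_data({"givenname": "John", "surname": "Doe Jr."})
```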
If your form-data requires a different content-type, you can specify it by beginning the form-data with (content-type=TYPE), e.g. "(content-type=text/xml)" followed by the POST data. Note that as with normal forms, the POST data should be specified using escape-sequences for reserved characters: "space" should be entered as "\x20", double quote as "\x22", newline as "\n", carriage-return as "\r", TAB as "\t", backslash as "\\". Any byte value can be entered using "\xNN" with NN being the hexadecimal value, e.g. "\x20" is the space character.
The [expected_data_regexp|#digesttype:digest] is the expected data returned from the server in response to the POST. See the "cont;" tag above for details. If you are only interested in knowing if it is possible to submit the form (but don't care about the data), this can be an empty string - but the ';' at the end is required.
Note that SOAP XML documents usually must begin with the XML version line, <?xml version="1.0"?>
hostport is a host name with an optional ":portnumber"
dn is the search base
attrs is a comma separated list of attributes to request
scope is one of these three strings:
base one sub (default=base)
filter is the LDAP search filter
exts is a set of recognized LDAP and/or API extensions.
Note that you need to enable the server-status URL in your Apache configuration. The following configuration is needed:
<Location /server-status>
SetHandler server-status
Order deny,allow
Deny from all
allow from 127.0.0.1
</Location>
ExtendedStatus On
Change "127.0.0.1" to the IP-address of the server that runs your network tests.
If you have certain tags that you want to apply to all hosts, you can define a host name ".default." and put the tags on that host. Note that per-host definitions will override the default ones. To apply to all hosts this should be listed FIRST in your file.
NOTE: The ".default." host entry will only accept the following tags - others are silently ignored: delayyellow, delayred, NOCOLUMNS, COMMENT, DESCR, CLASS, dialup, testip, nonongreen, nodisp, noinfo, notrends, noclient, TRENDS, NOPROPRED, NOPROPYELLOW, NOPROPPURPLE, NOPROPACK, REPORTTIME, WARNPCT, NET, noclear, nosslcert, ssldays, DOWNTIME, depends, noping, noconn, trace, notrace, HIDEHTTP, browser, pulldata. Specifically, note that network tests, "badTEST" settings, and alternate pageset relations cannot be listed on the ".default." host.
Multiple "summary" definitions are allowed.
The ROW.COLUMN setting defines how this summary is presented on the server that receives the summary. The ROW text will be used as the heading for a summary line, and the COLUMN defines the name of the column where this summary is shown - like the hostname and testname used in the normal displays. The IP is the IP-address of the remote (upstream) Xymon server, where this summary is sent. The URL is the URL of your local Xymon server.
The URL need not be that of your Xymon server's main page - it could be the URL of a sub-page on the local Xymon server. Xymon will report the summary using the color of the page found at the URL you specify. E.g. on your corporate Xymon server you want a summary from the Las Vegas office - but you would like to know both the overall status and the status of the critical Sales department back-office servers in Las Vegas. So you configure the Las Vegas Xymon server to send two summaries:
summary Vegas.All 10.0.1.1 http://vegas.foo.com/xymon/
summary Vegas.Sales 10.0.1.1 http://vegas.foo.com/xymon/sales/
This gives you one summary line for Vegas, with two columns: An "All" column showing the overall status, and a "Sales" column showing the status of the "sales" page on the Las Vegas Xymon server.
Note: Pages defined using alternate pageset definitions cannot be used; the URL must point to a web page from the default set of Xymon webpages.
~xymon/server/etc/hosts.cfg
xymongen(1), xymonnet(1), xymondigest(1), xymonserver.cfg(5), xymon(7)
Version 4.3.28: 17 Jan 2017 | Xymon |