COLLECTD.CONF(5)
collectd.conf - Configuration for the system statistics collection daemon collectd
BaseDir "/var/lib/collectd" PIDFile "/run/collectd.pid" Interval 10.0 LoadPlugin cpu LoadPlugin load <LoadPlugin df> Interval 3600 </LoadPlugin> <Plugin df> ValuesPercentage true </Plugin> LoadPlugin ping <Plugin ping> Host "example.org" Host "provider.net" </Plugin>
This config file controls how the system statistics collection daemon collectd behaves. The most significant option is LoadPlugin, which controls which plugins to load. These plugins ultimately define collectd's behavior. If the AutoLoadPlugin option has been enabled, the explicit LoadPlugin lines may be omitted for all plugins with a configuration block, i.e. a "<Plugin ...>" block.
The syntax of this config file is similar to the config file of the famous Apache webserver. Each line contains either an option (a key and a list of one or more values) or a section-start or -end. Empty lines and everything after a non-quoted hash-symbol ("#") are ignored. Keys are unquoted strings, consisting only of alphanumeric characters and the underscore ("_") character. Keys are handled case-insensitively by collectd itself and all plugins included with it. Values can either be an unquoted string, a quoted string (enclosed in double-quotes), a number or a boolean expression. Unquoted strings consist of only alphanumeric characters and underscores ("_") and do not need to be quoted. Quoted strings are enclosed in double quotes ("). You can use the backslash character ("\") to include double quotes as part of the string. Numbers can be specified in decimal and floating point format (using a dot "." as decimal separator), hexadecimal when using the "0x" prefix and octal with a leading zero (0). Boolean values are either true or false.
Lines may be wrapped by using "\" as the last character before the newline. This allows long lines to be split into multiple lines. Quoted strings may be wrapped as well. However, those are treated specially in that whitespace at the beginning of the following lines will be ignored, which allows for nicely indenting the wrapped lines.
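For example (a minimal illustration; the path is made up), the following two physical lines form the single option PIDFile "/run/collectd/collectd.pid", because the leading whitespace of the continuation line is ignored inside the quoted string:

PIDFile "/run/collectd/\
         collectd.pid"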
The configuration is read and processed in order, i.e. from top to bottom. So the plugins are loaded in the order listed in this config file. It is a good idea to load any logging plugins first in order to catch messages from plugins during configuration. Also, unless AutoLoadPlugin is enabled, the LoadPlugin option must occur before the appropriate "<Plugin ...>" block.
Only the first LoadPlugin statement or block for a given plugin name has any effect. This is useful when you want to split up the configuration into smaller files and want each file to be "self contained", i.e. it contains a Plugin block and the appropriate LoadPlugin statement. The downside is that if you have multiple conflicting LoadPlugin blocks, e.g. when they specify different intervals, only one of them (the first one encountered) will take effect and all others will be silently ignored.
LoadPlugin may either be a simple configuration statement or a block with additional options, affecting the behavior of LoadPlugin. A simple statement looks like this:
LoadPlugin "cpu"
Options inside a LoadPlugin block can override default settings and influence the way plugins are loaded, e.g.:
<LoadPlugin perl>
  Interval 60
</LoadPlugin>
The following options are valid inside LoadPlugin blocks:
This is useful (or possibly even required), e.g., when loading a plugin that embeds some scripting language into the daemon (e.g. the Perl and Python plugins). Scripting languages usually provide means to load extensions written in C. Those extensions require symbols provided by the interpreter, which is loaded as a dependency of the respective collectd plugin. See the documentation of those plugins (e.g., collectd-perl(5) or collectd-python(5)) for details.
By default, this is disabled. As a special exception, if the plugin name is either "perl" or "python", the default is changed to enabled in order to keep the average user from ever having to deal with this low level linking stuff.
When set to true, explicit LoadPlugin statements are not required. Each <Plugin ...> block acts as if it was immediately preceded by a LoadPlugin statement. LoadPlugin statements are still required for plugins that don't provide any configuration, e.g. the Load plugin.
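As a minimal illustration of this behavior (using plugins from the synopsis above), the df block below implicitly loads the df plugin, while the load plugin still needs an explicit LoadPlugin line because it has no configuration block:

AutoLoadPlugin true

<Plugin df>
  ValuesPercentage true
</Plugin>

LoadPlugin load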
The following metrics are reported:
Include "/etc/collectd.d/*.conf"
Starting with version 5.3, this may also be a block in which further options affecting the behavior of Include may be specified. The following option is currently allowed:
<Include "/etc/collectd.d"> Filter "*.conf" </Include>
If more than one file is included by a single Include option, the files will be included in lexicographical order (as defined by the "strcmp" function). Thus, you can e. g. use numbered prefixes to specify the order in which the files are loaded.
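For example, with hypothetical file names under /etc/collectd.d/, the following include reads 00-logging.conf before 50-plugins.conf, so the logging plugins are loaded first:

Include "/etc/collectd.d/*.conf"
# reads 00-logging.conf, then 50-plugins.conf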
To prevent loops and shooting yourself in the foot in interesting ways the nesting is limited to a depth of 8 levels, which should be sufficient for most uses. Since symlinks are followed it is still possible to crash the daemon by looping symlinks. In our opinion significant stupidity should result in an appropriate amount of pain.
It is no problem to have a block like "<Plugin foo>" in more than one file, but you cannot include files from within blocks.
If this option is not specified, a default file is read. If you need to define custom types in addition to the types defined in the default file, you need to explicitly load both. In other words, if the TypesDB option is encountered the default behavior is disabled and if you need the default types you have to also explicitly load them.
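A minimal sketch of loading both the default types and a custom file (paths vary by installation; the custom file name is made up):

TypesDB "/usr/share/collectd/types.db"
TypesDB "/etc/collectd/my_types.db"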
Warning: You should set this once and then never touch it again. If you do, you will have to delete all your RRD files or know some serious RRDtool magic! (Assuming you're using the RRDtool or RRDCacheD plugin.)
This option limits the maximum value of the interval. The default value is 86400.
By default, there is no limit and memory may grow indefinitely. This is most likely not an issue for clients, i.e. instances that only handle the local metrics. For servers it is recommended to set this to a non-zero value, though.
You can set the limits using WriteQueueLimitHigh and WriteQueueLimitLow. Each of them takes a numerical argument which is the number of metrics in the queue. If there are HighNum metrics in the queue, any new metrics will be dropped. If there are less than LowNum metrics in the queue, all new metrics will be enqueued. If the number of metrics currently in the queue is between LowNum and HighNum, the metric is dropped with a probability that is proportional to the number of metrics in the queue (i.e. it increases linearly until it reaches 100%.)
If WriteQueueLimitHigh is set to non-zero and WriteQueueLimitLow is unset, the latter will default to half of WriteQueueLimitHigh.
If you do not want to randomly drop values when the queue size is between LowNum and HighNum, set WriteQueueLimitHigh and WriteQueueLimitLow to the same value.
Enabling the CollectInternalStats option is of great help to figure out the values to set WriteQueueLimitHigh and WriteQueueLimitLow to.
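A minimal sketch: with the settings below, new metrics are always enqueued while the queue holds fewer than 500000 entries, dropped with increasing probability between 500000 and 1000000 entries, and always dropped beyond that.

WriteQueueLimitHigh 1000000
WriteQueueLimitLow   500000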
Some plugins may register their own options. These options must be enclosed in a "Plugin" section. Which options exist depends on the plugin used. Some plugins require external configuration, too. The "apache" plugin, for example, requires "mod_status" to be configured in the webserver you're going to collect data from. These plugins are listed below as well, even if they don't require any configuration within collectd's configuration file.
A list of all plugins and a short summary for each plugin can be found in the README file shipped with the source code and hopefully binary packages as well.
The Aggregation plugin makes it possible to aggregate several values into one using aggregation functions such as sum, average, min and max. This can be put to a wide variety of uses, e.g. average and total CPU statistics for your entire fleet.
The grouping is powerful but, as with many powerful tools, may be a bit difficult to wrap your head around. The grouping will therefore be demonstrated using an example: The average and sum of the CPU usage across all CPUs of each host is to be calculated.
To select all the affected values for our example, set "Plugin cpu" and "Type cpu". The other values are left unspecified, meaning "all values". The Host, Plugin, PluginInstance, Type and TypeInstance options work as if they were specified in the "WHERE" clause of an SQL "SELECT" statement.
Plugin "cpu" Type "cpu"
Although the Host, PluginInstance (CPU number, i.e. 0, 1, 2, ...) and TypeInstance (idle, user, system, ...) fields are left unspecified in the example, the intention is to have a new value for each host / type instance pair. This is achieved by "grouping" the values using the "GroupBy" option. It can be specified multiple times to group by more than one field.
GroupBy "Host" GroupBy "TypeInstance"
We neither specify nor group by plugin instance (the CPU number), so all metrics that differ only in the CPU number will be aggregated. Each aggregation needs at least one such field, otherwise no aggregation would take place.
The full example configuration looks like this:
<Plugin "aggregation"> <Aggregation> Plugin "cpu" Type "cpu" GroupBy "Host" GroupBy "TypeInstance" CalculateSum true CalculateAverage true </Aggregation> </Plugin>
There are a couple of limitations you should be aware of:
As you can see in the example above, each aggregation has its own Aggregation block. You can have multiple aggregation blocks and aggregation blocks may match the same values, i.e. one value list can update multiple aggregations. The following options are valid inside Aggregation blocks:
If the string starts with and ends with a slash ("/"), the string is interpreted as a regular expression. The regex flavor used are POSIX extended regular expressions as described in regex(7). Example usage:
Host "/^db[0-9]\\.example\\.com$/"
The PluginInstance should include the placeholder "%{aggregation}" which will be replaced with the aggregation function, e.g. "average". Not including the placeholder will result in duplication warnings and/or messed up values if more than one aggregation function is enabled.
The following example calculates the average usage of all "even" CPUs:
<Plugin "aggregation"> <Aggregation> Plugin "cpu" PluginInstance "/[0,2,4,6,8]$/" Type "cpu" SetPlugin "cpu" SetPluginInstance "even-%{aggregation}" GroupBy "Host" GroupBy "TypeInstance" CalculateAverage true </Aggregation> </Plugin>
This will create the files:
The AMQP plugin can be used to communicate with other instances of collectd or third party applications using an AMQP message broker. Values are sent to or received from the broker, which handles routing, queueing and possibly filtering out messages.
Synopsis:
<Plugin "amqp"> # Send values to an AMQP broker <Publish "some_name"> Host "localhost" Port "5672" VHost "/" User "guest" Password "guest" Exchange "amq.fanout" # ExchangeType "fanout" # RoutingKey "collectd" # Persistent false # ConnectionRetryDelay 0 # Format "command" # StoreRates false # GraphitePrefix "collectd." # GraphiteEscapeChar "_" # GraphiteSeparateInstances false # GraphiteAlwaysAppendDS false # GraphitePreserveSeparator false </Publish> # Receive values from an AMQP broker <Subscribe "some_name"> Host "localhost" Port "5672" VHost "/" User "guest" Password "guest" Exchange "amq.fanout" # ExchangeType "fanout" # Queue "queue_name" # QueueDurable false # QueueAutoDelete true # RoutingKey "collectd.#" # ConnectionRetryDelay 0 </Subscribe> </Plugin>
The plugin's configuration consists of a number of Publish and Subscribe blocks, which configure sending and receiving of values respectively. The two blocks are very similar, so unless otherwise noted, an option can be used in either block. The name given in the block's starting tag is only used for reporting messages, but may be used to support flushing of certain Publish blocks in the future.
In Subscribe blocks this option is optional. If given, a binding between the given exchange and the queue is created, using the routing key if configured. See the Queue and RoutingKey options below.
This option should be used in conjunction with the Persistent option on the publish side.
In Subscribe blocks, configures the routing key used when creating a binding between an exchange and the queue. The usual wildcards can be used to filter messages when using a "topic" exchange. If you're only interested in CPU statistics, you could use the routing key "collectd.*.cpu.#" for example.
If set to JSON, the values are encoded in JavaScript Object Notation, an easy and straightforward exchange format. The "Content-Type" header field will be set to "application/json".
If set to Graphite, values are encoded in the Graphite format, which is "<metric> <value> <timestamp>\n". The "Content-Type" header field will be set to "text/graphite".
A subscribing client should use the "Content-Type" header field to determine how to decode the values. Currently, the AMQP plugin itself can only decode the Command format.
Please note that currently this option is only used if the Format option has been set to JSON.
To configure the "apache"-plugin you first need to configure the Apache webserver correctly. The Apache-plugin "mod_status" needs to be loaded and working and the "ExtendedStatus" directive needs to be enabled. You can use the following snipped to base your Apache config upon:
ExtendedStatus on
<IfModule mod_status.c>
  <Location /mod_status>
    SetHandler server-status
  </Location>
</IfModule>
Since its "mod_status" module is very similar to Apache's, lighttpd is also supported. It introduces a new field, called "BusyServers", to count the number of currently connected clients. This field is also supported.
The configuration of the Apache plugin consists of one or more "<Instance />" blocks. Each block requires one string argument as the instance name. For example:
<Plugin "apache"> <Instance "www1"> URL "http://www1.example.com/mod_status?auto" </Instance> <Instance "www2"> URL "http://www2.example.com/mod_status?auto" </Instance> </Plugin>
The instance name will be used as the plugin instance. To emulate the old (version 4) behavior, you can use an empty string (""). In order for the plugin to work correctly, each instance name must be unique. This is not enforced by the plugin and it is your responsibility to ensure it.
The following options are accepted within each Instance block:
You can instruct the plugin to close the connection after each read by setting this option to false or force keeping the connection by setting it to true.
If apcupsd appears to close the connection due to inactivity quite quickly, the plugin will try to detect this problem and switch to an open-read-close mode.
This plugin collects the value of the available sensors in an Aquaero 5 board. Aquaero 5 is a water-cooling controller board, manufactured by Aqua Computer GmbH <http://www.aquacomputer.de/>, with a USB2 connection for monitoring and configuration. The board can handle multiple temperature sensors, fans, water pumps and water level sensors and adjust the output settings such as fan voltage or power used by the water pump based on the available inputs using a configurable controller included in the board. This plugin collects all the available inputs as well as some of the output values chosen by this controller. The plugin is based on the libaquaero5 library provided by aquatools-ng.
This plugin collects information about an Ascent server, a free server for the "World of Warcraft" game. This plugin gathers the information by fetching the XML status page using "libcurl" and parses it using "libxml2".
The configuration options are the same as for the "apache" plugin above:
This plugin reads absolute air pressure using a digital barometer sensor on an I2C bus. Supported sensors are:
The sensor type - one of the above - is detected automatically by the plugin and indicated in the plugin_instance (you will see subdirectory "barometer-mpl115" or "barometer-mpl3115", or "barometer-bmp085"). The order of detection is BMP085 -> MPL3115 -> MPL115A2, the first one found will be used (only one sensor can be used by the plugin).
The plugin provides absolute barometric pressure, air pressure reduced to sea level (several possible approximations) and as an auxiliary value also internal sensor temperature. It uses (expects/provides) typical metric units - pressure in [hPa], temperature in [C], altitude in [m].
It was developed and tested under Linux only. The only platform dependency is the standard Linux i2c-dev interface (the particular bus driver has to support the SM Bus command subset).
The reduction or normalization to mean sea level pressure also requires, depending on the selected method/approximation, the altitude and a reference to temperature sensor(s). When multiple temperature sensors are configured, the minimum of their values is always used (the expectation being that the warmer ones are affected by, e.g., direct sunlight at that moment).
Synopsis:
<Plugin "barometer"> Device "/dev/i2c-0"; Oversampling 512 PressureOffset 0.0 TemperatureOffset 0.0 Normalization 2 Altitude 238.0 TemperatureSensor "myserver/onewire-F10FCA000800/temperature" </Plugin>
Device name of the I2C bus to which the sensor is connected. Note that you typically need to have the i2c-dev module loaded. Using i2c-tools you can check/list the i2c buses available on your system by:
i2cdetect -l
Then you can scan for devices on a given bus. E.g. to scan the whole bus 0 use:
i2cdetect -y -a 0
This way you should be able to verify that the pressure sensor (either type) is connected and detected on address 0x60.
For MPL115 this is the size of the averaging window. To filter out sensor noise, a simple averaging using a floating window of this configurable size is used. The plugin will use the average of the last "value" measurements (a value of 1 means no averaging). The minimum size is 1, the maximum 1024.
For MPL3115 this is the oversampling value. The actual oversampling is performed by the sensor and the higher the value, the higher the accuracy and the longer the conversion time (although nothing to worry about in the collectd context). Supported values are: 1, 2, 4, 8, 16, 32, 64 and 128. Any other value is adjusted by the plugin to the closest supported one.

For BMP085 this is the oversampling value. The actual oversampling is performed by the sensor and the higher the value, the higher the accuracy and the longer the conversion time (although nothing to worry about in the collectd context). Supported values are: 1, 2, 4, 8. Any other value is adjusted by the plugin to the closest supported one.
You can further calibrate the sensor by supplying pressure and/or temperature offsets. This is added to the measured/calculated value (i.e. if the measured value is too high then use a negative offset). In hPa, default is 0.0.

You can further calibrate the sensor by supplying pressure and/or temperature offsets. This is added to the measured/calculated value (i.e. if the measured value is too high then use a negative offset). In C, default is 0.0.
Normalization method - what approximation/model is used to compute the mean sea level pressure from the air absolute pressure.
Supported values of the "method" (an integer from 0 to 2) are:
The battery plugin reports the remaining capacity, power and voltage of laptop batteries.
When this option is set to false, the default, the battery plugin will only report the remaining capacity. If the ValuesPercentage option is enabled, the relative remaining capacity is calculated as the ratio of the "remaining capacity" and the "last full capacity". This is what most tools, such as the status bar of desktop environments, also do.
When set to true, the battery plugin will report three values: charged (remaining capacity), discharged (difference between "last full capacity" and "remaining capacity") and degraded (difference between "design capacity" and "last full capacity").
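A minimal sketch combining the two behaviors described above; the second option name, ReportDegraded, is the one this description matches in recent collectd versions:

<Plugin battery>
  ValuesPercentage true
  ReportDegraded true
</Plugin>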
Starting with BIND 9.5.0, the most widely used DNS server software provides extensive statistics about queries, responses and lots of other information. The bind plugin retrieves this information that's encoded in XML and provided via HTTP and submits the values to collectd.
To use this plugin, you first need to tell BIND to make this information available. This is done with the "statistics-channels" configuration option:
statistics-channels { inet localhost port 8053; };
The configuration follows the grouping that can be seen when looking at the data with an XSLT compatible viewer, such as a modern web browser. It's probably a good idea to make yourself familiar with the provided values, so you can understand what the collected statistics actually mean.
Synopsis:
<Plugin "bind"> URL "http://localhost:8053/" ParseTime false OpCodes true QTypes true ServerStats true ZoneMaintStats true ResolverStats false MemoryStats true <View "_default"> QTypes true ResolverStats true CacheRRSets true Zone "127.in-addr.arpa/IN" </View> </Plugin>
The bind plugin accepts the following configuration options:
This setting is set to true by default for backwards compatibility; setting this to false is recommended to avoid problems with timezones and localization.
Default: Enabled.
Default: Enabled.
Default: Enabled.
Default: Enabled.
Default: Disabled.
Default: Enabled.
Within a <View name> block, you can specify which information you want to collect about a view. If no View block is configured, no detailed view statistics will be collected.
Default: Enabled.
Default: Enabled.
Default: Enabled.
You can repeat this option to collect detailed information about multiple zones.
By default no detailed zone information is collected.
The ceph plugin collects values from JSON data to be parsed by libyajl (<https://lloyd.github.io/yajl/>) retrieved from ceph daemon admin sockets.
A separate Daemon block must be configured for each ceph daemon to be monitored. The following example will read daemon statistics from four separate ceph daemons running on the same device (two OSDs, one MON, one MDS):
<Plugin ceph>
  LongRunAvgLatency false
  ConvertSpecialMetricTypes true

  <Daemon "osd.0">
    SocketPath "/var/run/ceph/ceph-osd.0.asok"
  </Daemon>
  <Daemon "osd.1">
    SocketPath "/var/run/ceph/ceph-osd.1.asok"
  </Daemon>
  <Daemon "mon.a">
    SocketPath "/var/run/ceph/ceph-mon.ceph1.asok"
  </Daemon>
  <Daemon "mds.a">
    SocketPath "/var/run/ceph/ceph-mds.ceph1.asok"
  </Daemon>
</Plugin>
The ceph plugin accepts the following configuration options:
Default: Disabled
Default: Enabled
Each Daemon block must have a string argument for the plugin instance name. A SocketPath is also required for each Daemon block:
This plugin collects the CPU user/system time for each cgroup by reading the cpuacct.stat files in the first cpuacct-mountpoint (typically /sys/fs/cgroup/cpu.cpuacct on machines using systemd).
See /"IGNORELISTS" for details.
The "chrony" plugin collects ntp data from a chronyd server, such as clock skew and per-peer stratum.
For talking to chronyd, it mimics what the chronyc control program does on the wire.
Available configuration options for the "chrony" plugin:
This plugin collects IP conntrack statistics.
The CPU plugin collects CPU usage metrics. By default, CPU usage is reported as Jiffies, using the "cpu" type. Two aggregations are available:
The two aggregations can be combined, leading to collectd only emitting a single "active" metric for the entire system. As soon as one of these aggregations (or both) is enabled, the cpu plugin will report a percentage, rather than Jiffies. In addition, you can request individual, per-state, per-CPU metrics to be reported as percentage.
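A minimal sketch, assuming the two aggregations are controlled by the ReportByCpu and ReportByState options (as in recent collectd versions); this reports a single "active" percentage for the entire system:

<Plugin cpu>
  ReportByCpu false
  ReportByState false
  ValuesPercentage true
</Plugin>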
The following configuration options are available:
This plugin doesn't have any options. It reads /sys/devices/system/cpu/cpu0/cpufreq/scaling_cur_freq (for the first CPU installed) to get the current CPU frequency. If this file does not exist make sure cpufreqd (<http://cpufreqd.sourceforge.net/>) or a similar tool is installed and a "cpu governor" (that's a kernel module) is loaded.
This plugin doesn't have any options. It reads CLOCK_BOOTTIME and CLOCK_MONOTONIC and reports the difference between these clocks. Since the BOOTTIME clock increments while the device is suspended and the MONOTONIC clock does not, the derivative of the difference between these clocks gives the relative amount of time the device has spent in the suspend state. The recorded value is in milliseconds of sleep per second of wall clock.
All cURL-based plugins support collection of generic, request-based statistics. These are disabled by default and can be enabled selectively for each page or URL queried from the curl, curl_json, or curl_xml plugins. See the documentation of those plugins for specific information. This section describes the available metrics that can be configured for each plugin. All options are disabled by default.
See <http://curl.haxx.se/libcurl/c/curl_easy_getinfo.html> for more details.
The curl plugin uses libcurl (<http://curl.haxx.se/>) to read web pages and the match infrastructure (the same code used by the tail plugin) to use regular expressions with the received data.
The following example will read the current value of AMD stock from Google's finance page and dispatch the value to collectd.
<Plugin curl>
  <Page "stock_quotes">
    Plugin "quotes"
    URL "http://finance.google.com/finance?q=NYSE%3AAMD"
    User "foo"
    Password "bar"
    Digest false
    VerifyPeer true
    VerifyHost true
    CACert "/path/to/ca.crt"
    Header "X-Custom-Header: foobar"
    Post "foo=bar"
    MeasureResponseTime false
    MeasureResponseCode false
    <Match>
      Regex "<span +class=\"pr\"[^>]*> *([0-9]*\\.[0-9]+) *</span>"
      DSType "GaugeAverage"
      # Note: `stock_value' is not a standard type.
      Type "stock_value"
      Instance "AMD"
    </Match>
  </Page>
</Plugin>
In the Plugin block, there may be one or more Page blocks, each defining a web page and one or more "matches" to be performed on the returned data. The string argument to the Page block is used as plugin instance.
The following options are valid within Page blocks:
Beware that requests will get aborted if they take too long to complete. Adjust Timeout accordingly if you expect MeasureResponseTime to report such slow requests.
This option is similar to enabling the TotalTime statistic but it's measured by collectd instead of cURL.
If Timeout is 0 or bigger than the Interval, keep in mind that each slow network connection will stall one read thread. Adjust the ReadThreads global setting accordingly to prevent this from blocking other plugins.
The curl_json plugin collects values from JSON data to be parsed by libyajl (<https://lloyd.github.io/yajl/>) retrieved via either libcurl (<http://curl.haxx.se/>) or read directly from a unix socket. The former can be used, for example, to collect values from CouchDB documents (which are stored in JSON notation), and the latter to collect values from a uWSGI stats socket.
The following example will collect several values from the built-in "_stats" runtime statistics module of CouchDB (<http://wiki.apache.org/couchdb/Runtime_Statistics>).
<Plugin curl_json>
  <URL "http://localhost:5984/_stats">
    Instance "httpd"
    <Key "httpd/requests/count">
      Type "http_requests"
    </Key>

    <Key "httpd_request_methods/*/count">
      Type "http_request_methods"
    </Key>

    <Key "httpd_status_codes/*/count">
      Type "http_response_codes"
    </Key>
  </URL>
</Plugin>
This example will collect data directly from a uWSGI "Stats Server" socket.
<Plugin curl_json>
  <Sock "/var/run/uwsgi.stats.sock">
    Instance "uwsgi"
    <Key "workers/*/requests">
      Type "http_requests"
    </Key>

    <Key "workers/*/apps/*/requests">
      Type "http_requests"
    </Key>
  </Sock>
</Plugin>
In the Plugin block, there may be one or more URL blocks, each defining a URL to be fetched via HTTP (using libcurl) or Sock blocks defining a unix socket to read JSON from directly. Each of these blocks may have one or more Key blocks.
The Key string argument must be in a path format. Each component is used to match the key from a JSON map or the index of a JSON array. If a path component of a Key is a * wildcard, the values for all map keys or array indices will be collected.
The following options are valid within URL blocks:
The following options are valid within Key blocks:
The curl_xml plugin uses libcurl (<http://curl.haxx.se/>) and libxml2 (<http://xmlsoft.org/>) to retrieve XML data via cURL.
<Plugin "curl_xml"> <URL "http://localhost/stats.xml"> Host "my_host" #Plugin "curl_xml" Instance "some_instance" User "collectd" Password "thaiNg0I" VerifyPeer true VerifyHost true CACert "/path/to/ca.crt" Header "X-Custom-Header: foobar" Post "foo=bar" <XPath "table[@id=\"magic_level\"]/tr"> Type "magic_level" #InstancePrefix "prefix-" InstanceFrom "td[1]" #PluginInstanceFrom "td[1]" ValuesFrom "td[2]/span[@class=\"level\"]" </XPath> </URL> </Plugin>
In the Plugin block, there may be one or more URL blocks, each defining a URL to be fetched using libcurl. Within each URL block there are options which specify the connection parameters, for example authentication information, and one or more XPath blocks.
Each XPath block specifies how to get one type of information. The string argument must be a valid XPath expression which returns a list of "base elements". One value is dispatched for each "base element". The type instance and values are looked up using further XPath expressions that should be relative to the base element.
Within the URL block the following options are accepted:
Examples:
Namespace "s" "http://schemas.xmlsoap.org/soap/envelope/" Namespace "m" "http://www.w3.org/1998/Math/MathML"
Within the XPath block the following options are accepted:
If the "base XPath expression" (the argument to the XPath block) returns exactly one argument, then InstanceFrom and PluginInstanceFrom may be omitted. Otherwise, at least one of InstanceFrom or PluginInstanceFrom is required.
This plugin uses the dbi library (<http://libdbi.sourceforge.net/>) to connect to various databases, execute SQL statements and read back the results. dbi is an acronym for "database interface" in case you were wondering about the name. You can configure how each column is to be interpreted and the plugin will generate one or more data sets from each row returned according to these rules.
Because the plugin is very generic, the configuration is a little more complex than those of other plugins. It usually looks something like this:
<Plugin dbi>
  <Query "out_of_stock">
    Statement "SELECT category, COUNT(*) AS value FROM products WHERE in_stock = 0 GROUP BY category"
    # Use with MySQL 5.0.0 or later
    MinVersion 50000
    <Result>
      Type "gauge"
      InstancePrefix "out_of_stock"
      InstancesFrom "category"
      ValuesFrom "value"
    </Result>
  </Query>
  <Database "product_information">
    #Plugin "warehouse"
    Driver "mysql"
    Interval 120
    DriverOption "host" "localhost"
    DriverOption "username" "collectd"
    DriverOption "password" "aZo6daiw"
    DriverOption "dbname" "prod_info"
    SelectDB "prod_info"
    Query "out_of_stock"
  </Database>
</Plugin>
The configuration above defines one query with one result and one database. The query is then linked to the database with the Query option within the <Database> block. You can have any number of queries and databases and you can also use the Include statement to split up the configuration file in multiple, smaller files. However, the <Query> block must precede the <Database> blocks, because the file is interpreted from top to bottom!
The following is a complete list of options:
Query blocks
Query blocks define SQL statements and how the returned data should be interpreted. They are identified by the name that is given in the opening line of the block. Thus the name needs to be unique. Other than that, the name is not used in collectd.
In each Query block, there are one or more Result blocks. Result blocks define which column holds which value or instance information. You can use multiple Result blocks to create multiple values from one returned row. This is especially useful when queries take a long time and sending almost the same query again and again is not desirable.
Example:
<Query "environment"> Statement "select station, temperature, humidity from environment" <Result> Type "temperature" # InstancePrefix "foo" InstancesFrom "station" ValuesFrom "temperature" </Result> <Result> Type "humidity" InstancesFrom "station" ValuesFrom "humidity" </Result> </Query>
The following options are accepted:
The query has to return at least two columns, one for the instance and one for the value. You cannot omit the instance, even if the statement is guaranteed to always return exactly one line. In that case, you can usually specify something like this:
Statement "SELECT \"instance\", COUNT(*) AS value FROM table"
(That works with MySQL but may not be valid SQL according to the spec. If you use a more strict database server, you may have to select from a dummy table or something.)
Please note that some databases, for example Oracle, will fail if you include a semicolon at the end of the statement.
The database version is determined by "dbi_conn_get_engine_version", see the libdbi documentation <http://libdbi.sourceforge.net/docs/programmers-guide/reference-conn.html#DBI-CONN-GET-ENGINE-VERSION> for details. Basically, each part of the version is assumed to be in the range from 00 to 99 and all dots are removed. So version "4.1.2" becomes "40102", version "5.0.42" becomes "50042".
Warning: The plugin will use all matching queries, so if you specify multiple queries with the same name and overlapping ranges, weird stuff will happen. Don't do it! A valid example would be something along these lines:
MinVersion 40000
MaxVersion 49999
...
MinVersion 50000
MaxVersion 50099
...
MinVersion 50100
# No maximum
In the above example, there are three ranges that don't overlap. The last one goes from version "5.1.0" to infinity, meaning "all later versions". Versions before "4.0.0" are not specified.
If you specify "temperature" here, you need exactly one gauge column. If you specify "if_octets", you will need two counter columns. See the ValuesFrom setting below.
There must be exactly one Type option inside each Result block.
The plugin itself does not check whether or not all built instances are different. It's your responsibility to assure that each is unique. This is especially true, if you do not specify InstancesFrom: You have to make sure that only one row is returned in this case.
If neither InstancePrefix nor InstancesFrom is given, the type-instance will be empty.
The actual data type in the columns is not that important. The plugin will automatically cast the values to the right type if it knows how to do that. So it should be able to handle integer and floating point types, as well as strings (if they include a number at the beginning).
There must be at least one ValuesFrom option inside each Result block.
The actual data type in the columns is not that important. The plugin will automatically cast the values to the right type if it knows how to do that. So it should be able to handle integer and floating point types, as well as strings (if they include a number at the beginning).
Database blocks
Database blocks define a connection to a database and which queries should be sent to that database. Since the used "dbi" library can handle a wide variety of databases, the configuration is very generic. If in doubt, refer to libdbi's documentation - we stick as closely as possible to the terminology used there.
Each database needs a "name" as string argument in the starting tag of the block. This name will be used as "PluginInstance" in the values submitted to the daemon. Other than that, the name is not used.
You need to give the driver name as expected by the "dbi" library here. You should be able to find that in the documentation for each driver. If you mistype the driver name, the plugin will dump a list of all known driver names to the log.
DBDs can register two types of options: String options and numeric options. The plugin will use the "dbi_conn_set_option" function when the configuration provides a string and the "dbi_conn_require_option_numeric" function when the configuration provides a number. So these two lines will actually result in different calls being used:
DriverOption "Port" 1234 # numeric DriverOption "Port" "1234" # string
Unfortunately, drivers are not too keen to report errors when an unknown option is passed to them, so invalid settings here may go unnoticed. This is not the plugin's fault, it will report errors if it gets them from the library / the driver. If a driver complains about an option, the plugin will dump a complete list of all options understood by that driver to the log. There is no way to programmatically find out if an option expects a string or a numeric argument, so you will have to refer to the appropriate DBD's documentation to find this out. Sorry.
See /"IGNORELISTS" for details.
See /"IGNORELISTS" for details.
See /"IGNORELISTS" for details.
Enable this option if inodes are a scarce resource for you, usually because many small files are stored on the disk. This is a usual scenario for mail transfer agents and web caches.
This is useful for deploying collectd on the cloud, where machines with different disk size may exist. Then it is more practical to configure thresholds based on relative disk size.
The "disk" plugin collects information about the usage of physical disks and logical disks (partitions). Values collected are the number of octets written to and read from a disk or partition, the number of read/write operations issued to the disk and a rather complex "time" it took for these commands to be issued.
Using the following two options you can ignore some disks or configure the collection only of specific disks.
Disk "sdd" Disk "/hda[34]/"
See /"IGNORELISTS" for details.
UdevNameAttr "DM_NAME"
The dpdkevents plugin collects events from DPDK such as link status of network ports and Keep Alive status of DPDK logical cores. In order to get Keep Alive events, the following requirements must be met:

- DPDK >= 16.07
- support for Keep Alive implemented in the DPDK application

More details can be found here: http://dpdk.org/doc/guides/sample_app_ug/keep_alive.html
Synopsis:
<Plugin "dpdkevents"> <EAL> Coremask "0x1" MemoryChannels "4" FilePrefix "rte" </EAL> <Event "link_status"> SendEventsOnUpdate true EnabledPortMask 0xffff PortName "interface1" PortName "interface2" SendNotification false </Event> <Event "keep_alive"> SendEventsOnUpdate true LCoreMask "0xf" KeepAliveShmName "/dpdk_keepalive_shm_name" SendNotification false </Event> </Plugin>
Options:
The EAL block
The Event block
The Event block defines the configuration for a specific event. It accepts a single argument which specifies the name of the event.
Link Status event
Keep Alive event
The dpdkstat plugin collects information about DPDK interfaces using the extended NIC stats API in DPDK.
Synopsis:
<Plugin "dpdkstat"> <EAL> Coremask "0x4" MemoryChannels "4" FilePrefix "rte" SocketMemory "1024" LogLevel "7" RteDriverLibPath "/usr/lib/dpdk-pmd" </EAL> SharedMemObj "dpdk_collectd_stats_0" EnabledPortMask 0xffff PortName "interface1" PortName "interface2" </Plugin>
Options:
The EAL block
The ethstat plugin collects information about network interface cards (NICs) by talking directly with the underlying kernel driver using ioctl(2).
Synopsis:
<Plugin "ethstat"> Interface "eth0" Map "rx_csum_offload_errors" "if_rx_errors" "checksum_offload" Map "multicast" "if_multicast" </Plugin>
Options:
Please make sure to read collectd-exec(5) before using this plugin. It contains valuable information on when the executable is executed and the output that is expected from it.
Please note that in order to change the user and/or group the daemon needs superuser privileges. If the daemon is run as an unprivileged user you must specify the same user/group here. If the daemon is run with superuser privileges, you must supply a non-root user here.
The executable may be followed by optional arguments that are passed to the program. Please note that due to the configuration parsing, numbers and boolean values may be changed. If you want to be absolutely sure that something is passed as-is please enclose it in quotes.
The Exec and NotificationExec statements change the semantics of the programs executed, i. e. the data passed to them and the response expected from them. This is documented in great detail in collectd-exec(5).
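A minimal sketch (user, group and script paths are made up; see collectd-exec(5) for the behavior expected from the programs):

LoadPlugin exec
<Plugin exec>
  Exec "nobody:nogroup" "/usr/local/bin/my_stats.sh" "--verbose"
  NotificationExec "nobody" "/usr/local/bin/my_notify.sh"
</Plugin>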
The "fhcount" plugin provides statistics about used, unused and total number of file handles on Linux.
The fhcount plugin provides the following configuration options:
The "filecount" plugin counts the number of files in a certain directory (and its subdirectories) and their combined size. The configuration is very straight forward:
<Plugin "filecount"> <Directory "/var/qmail/queue/mess"> Instance "qmail-message" </Directory> <Directory "/var/qmail/queue/todo"> Instance "qmail-todo" </Directory> <Directory "/var/lib/php5"> Instance "php5-sessions" Name "sess_*" </Directory> </Plugin>
The example above counts the number of files in QMail's queue directories and the number of PHP5 sessions. Just for your information: the "todo" queue holds the messages that QMail has not yet looked at, the "message" queue holds the messages that were classified into "local" and "remote".
As you can see, the configuration consists of one or more "Directory" blocks, each of which specifies a directory in which to count the files. Within those blocks, the following options are recognized:
The number can also be followed by a "multiplier" to easily specify a larger timespan. When given in this notation, the argument must be quoted, i. e. must be passed as a string. So the -60 could also be written as "-1m" (one minute). Valid multipliers are "s" (second), "m" (minute), "h" (hour), "d" (day), "w" (week), and "y" (year). There is no "month" multiplier. You can also specify fractional numbers, e. g. "0.5d" is identical to "12h".
As with the MTime option, a "multiplier" may be added. For a detailed description see above. Valid multipliers here are "b" (byte), "k" (kilobyte), "m" (megabyte), "g" (gigabyte), "t" (terabyte), and "p" (petabyte). Please note that there are 1000 bytes in a kilobyte, not 1024.
The GenericJMX plugin is written in Java and therefore documented in collectd-java(5).
The gmond plugin receives the multicast traffic sent by gmond, the statistics collection daemon of Ganglia. Mappings for the standard "metrics" are built-in; custom mappings may be added via Metric blocks, see below.
Synopsis:
<Plugin "gmond"> MCReceiveFrom "239.2.11.71" "8649" <Metric "swap_total"> Type "swap" TypeInstance "total" DataSource "value" </Metric> <Metric "swap_free"> Type "swap" TypeInstance "free" DataSource "value" </Metric> </Plugin>
The following metrics are built-in:
Available configuration options:
Default: 239.2.11.71 / 8649
The "gps plugin" connects to gpsd on the host machine. The host, port, timeout and pause are configurable.
This is useful if you run an NTP server using a GPS for source and you want to monitor it.
Note that your GPS must send $--GSA sentences for the data to be reported!
The following elements are collected:
Synopsis:
LoadPlugin gps
<Plugin "gps">
  # Connect to localhost on gpsd regular port:
  Host "127.0.0.1"
  Port "2947"
  # 15 ms timeout
  Timeout 0.015
  # PauseConnect of 5 sec. between connection attempts.
  PauseConnect 5
</Plugin>
Available configuration options:
The GPS data stream is fetched by the plugin from the daemon. It waits for data to be available; if none arrives, it times out and loops for another reading. Keep this value low: gpsd expects a value in the microsecond range (500 us is recommended), since the waiting function is blocking. The value must be between 500 us and 5 sec.; if it is outside that range, the default value is applied.
This only applies from gpsd release-2.95.
The grpc plugin provides an RPC interface to submit values to or query values from collectd based on the open source gRPC framework. It exposes an end-point for dispatching values to the daemon.
The gRPC homepage can be found at <https://grpc.io/>.
The argument Host may be a hostname, an IPv4 address, or an IPv6 address.
Optionally, Server may be specified as a configuration block which supports the following options:
The argument Host may be a hostname, an IPv4 address, or an IPv6 address.
Optionally, Listen may be specified as a configuration block which supports the following options:
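A minimal sketch of a Listen block as described above (address and port are examples; EnableSSL is one of the block options in recent collectd versions):

<Plugin grpc>
  <Listen "0.0.0.0" "50051">
    EnableSSL false
  </Listen>
</Plugin>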
To get values from hddtemp, collectd connects to localhost (127.0.0.1), port 7634/tcp. The Host and Port options can be used to change these default values, see below. "hddtemp" has to be running to work correctly. If "hddtemp" is not running, timeouts may appear which may interfere with other statistics.
The hddtemp homepage can be found at <http://www.guzu.net/linux/hddtemp.php>.
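A minimal sketch restating the defaults mentioned above:

<Plugin hddtemp>
  Host "127.0.0.1"
  Port "7634"
</Plugin>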
To collect hugepages information, collectd reads directories "/sys/devices/system/node/*/hugepages" and "/sys/kernel/mm/hugepages". Reading of these directories can be disabled by the following options (default is enabled).
The intel_pmu plugin collects performance counter data on Intel CPUs using the Linux perf interface. All events are reported on a per-core basis.
Synopsis:
<Plugin intel_pmu>
  ReportHardwareCacheEvents true
  ReportKernelPMUEvents true
  ReportSoftwareEvents true
  EventList "/var/cache/pmu/GenuineIntel-6-2D-core.json"
  HardwareEvents "L2_RQSTS.CODE_RD_HIT,L2_RQSTS.CODE_RD_MISS" "L2_RQSTS.ALL_CODE_RD"
  Cores "0-3" "4,6" "[12-15]"
</Plugin>
Options:
If an empty string is provided as the value for this field, the default cores configuration is applied - that is, a separate group is created for each core.
The intel_rdt plugin collects information provided by monitoring features of Intel Resource Director Technology (Intel(R) RDT) like Cache Monitoring Technology (CMT) and Memory Bandwidth Monitoring (MBM). These features provide information about the utilization of shared resources. CMT monitors last level cache (LLC) occupancy. MBM supports two types of events, reporting local and remote memory bandwidth. Local memory bandwidth (MBL) reports the bandwidth of accessing memory associated with the local socket. Remote memory bandwidth (MBR) reports the bandwidth of accessing the remote socket. This technology also allows monitoring of instructions per clock (IPC). Monitor events are hardware dependent. Monitoring capabilities are detected on plugin initialization and only supported events are monitored.
Note: intel_rdt plugin is using model-specific registers (MSRs), which require an additional capability to be enabled if collectd is run as a service. Please refer to contrib/systemd.collectd.service file for more details.
Synopsis:
<Plugin "intel_rdt"> Cores "0-2" "3,4,6" "8-10,15" </Plugin>
Options:
If an empty string is provided as the value for this field, the default cores configuration is applied - a separate group is created for each core.
Note: By default the global interval is used to retrieve statistics on monitored events. To configure a plugin-specific interval, use the Interval option of the intel_rdt <LoadPlugin> block; see the sketch below. For milliseconds, divide the time by 1000; for example, if the desired interval is 50 ms, set Interval to 0.05. Due to the limited capacity of counters it is not recommended to set the interval higher than 1 sec.
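A minimal sketch of such a block, polling every 50 ms as in the example above:

<LoadPlugin intel_rdt>
  Interval 0.05
</LoadPlugin>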
See /"IGNORELISTS" for details.
It is possible to use regular expressions to match interface names, if the name is surrounded by /.../ and collectd was compiled with support for regexps. This is useful if there's a need to collect (or ignore) data for a group of interfaces that are similarly named, without the need to explicitly list all of them (especially useful if the list is dynamic). Example:
Interface "lo" Interface "/^veth/" Interface "/^tun[0-9]+/" IgnoreSelected "true"
This will ignore the loopback interface, all interfaces with names starting with veth and all interfaces with names starting with tun followed by at least one digit.
The default value is true and results in collection of the data from all interfaces that are selected by Interface and IgnoreSelected options.
This option is only available on Solaris.
The ipmi plugin allows monitoring of server platform status using the Intelligent Platform Management Interface (IPMI). Local and remote interfaces are supported.
The plugin configuration consists of one or more Instance blocks which specify one ipmi connection each. Each block requires one unique string argument as the instance name. If instances are not configured, an instance with the default option values will be created.
For backwards compatibility, any option other than an Instance block will trigger legacy config handling and will be treated as an option within an Instance block. This support will go away in the next major version of collectd.
Within the Instance blocks, the following options are allowed:
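A minimal sketch (instance names, sensor name, host and credentials are made up; the option names follow the ipmi plugin's Instance options in recent collectd versions):

<Plugin ipmi>
  <Instance "local">
    Sensor "ambient_temp"
    IgnoreSelected false
  </Instance>
  <Instance "remote">
    Host "server.example.com"
    Address "192.0.2.10"
    Username "monitor"
    Password "secret"
  </Instance>
</Plugin>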
See /"IGNORELISTS" for details.
If only Table and Chain are given, this plugin will collect the counters of all rules which have a comment-match. The comment is then used as type-instance.
If Comment or Number is given, only the rule with the matching comment or the nth rule will be collected. Again, the comment (or the number) will be used as the type-instance.
If Name is supplied, it will be used as the type-instance instead of the comment or the number.
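A minimal sketch of the resulting configuration (table, chain, comment and name are examples):

<Plugin iptables>
  # Collect all rules in filter/INPUT that have a comment match:
  Chain "filter" "INPUT"
  # Collect one specific rule and override the type instance:
  Chain "nat" "PREROUTING" "incoming" "pre-in"
</Plugin>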
See /"IGNORELISTS" for details.
The Java plugin makes it possible to write extensions for collectd in Java. This section only discusses the syntax and semantic of the configuration options. For more in-depth information on the Java plugin, please read collectd-java(5).
Synopsis:
<Plugin "java"> JVMArg "-verbose:jni" JVMArg "-Djava.class.path=/opt/collectd/lib/collectd/bindings/java" LoadPlugin "org.collectd.java.Foobar" <Plugin "org.collectd.java.Foobar"> # To be parsed by the plugin </Plugin> </Plugin>
Available configuration options:
Please note that all these options must appear before (i. e. above) any other options! When another option is found, the JVM will be started and later options will have to be ignored!
See collectd-java(5) for details.
When the first such option is found, the virtual machine (JVM) is created. This means that all JVMArg options must appear before (i. e. above) all LoadPlugin options!
For this to work, the plugin has to register a configuration callback first, see "config callback" in collectd-java(5). This means, that the Plugin block must appear after the appropriate LoadPlugin block. Also note, that Name depends on the (Java) plugin registering the callback and is completely independent from the JavaClass argument passed to LoadPlugin.
The Load plugin collects the system load. These numbers give a rough overview over the utilization of a machine. The system load is defined as the number of runnable tasks in the run-queue and is provided by many operating systems as a one, five or fifteen minute average.
The following configuration options are available:
Please note that debug is only available if collectd has been compiled with debugging support.
Note: There is no need to notify the daemon after moving or removing the log file (e. g. when rotating the logs). The plugin reopens the file for each line it writes.
The log logstash plugin behaves like the logfile plugin but formats messages as JSON events for logstash to parse and input.
Please note that debug is only available if collectd has been compiled with debugging support.
Note: There is no need to notify the daemon after moving or removing the log file (e. g. when rotating the logs). The plugin reopens the file for each line it writes.
The LPAR plugin reads CPU statistics of Logical Partitions, a virtualization technique for IBM POWER processors. It takes into account CPU time stolen from or donated to a partition, in addition to the usual user, system, I/O statistics.
The following configuration options are available:
This plugin embeds a Lua interpreter into collectd and provides an interface to collectd's plugin system. See collectd-lua(5) for its documentation.
The "mbmon plugin" uses mbmon to retrieve temperature, voltage, etc.
By default collectd connects to localhost (127.0.0.1), port 411/tcp. The Host and Port options can be used to change these values, see below. "mbmon" has to be running to work correctly. If "mbmon" is not running, timeouts may appear which may interfere with other statistics.
"mbmon" must be run with the -r option ("print TAG and Value format"); Debian's /etc/init.d/mbmon script already does this, other people will need to ensure that this is the case.
The "mcelog plugin" uses mcelog to retrieve machine check exceptions.
By default the plugin connects to "/var/run/mcelog-client" to check if the mcelog server is running. When the server is running, the plugin will tail the specified logfile to retrieve machine check exception information and send a notification with the details from the logfile. The plugin will use the mcelog client protocol to retrieve memory related machine check exceptions. Note that for memory exceptions, notifications are only sent when there is a change in the number of corrected/uncorrected memory errors.
The Memory block
Note: these options cannot be used in conjunction with the logfile options, they are mutually exclusive.
The "md plugin" collects information from Linux Software-RAID devices (md).
All reported values are of the type "md_disks". Reported type instances are active, failed (present but not operational), spare (hot stand-by) and missing (physically absent) disks.
See /"IGNORELISTS" for details.
The "memcachec plugin" connects to a memcached server, queries one or more given pages and parses the returned data according to user specification. The matches used are the same as the matches used in the "curl" and "tail" plugins.
In order to talk to the memcached server, this plugin uses the libmemcached library. Please note that there is another library with a very similar name, libmemcache (notice the missing `d'), which is not applicable.
Synopsis of the configuration:
<Plugin "memcachec"> <Page "plugin_instance"> Server "localhost" Key "page_key" Plugin "plugin_name" <Match> Regex "(\\d+) bytes sent" DSType CounterAdd Type "ipt_octets" Instance "type_instance" </Match> </Page> </Plugin>
The configuration options are:
The memcached plugin connects to a memcached server and queries statistics about cache utilization, memory and bandwidth used. <http://memcached.org/>
<Plugin "memcached"> <Instance "name"> #Host "memcache.example.com" Address "127.0.0.1" Port 11211 </Instance> </Plugin>
The plugin configuration consists of one or more Instance blocks which specify one memcached connection each. Within the Instance blocks, the following options are allowed:
The mic plugin gathers CPU statistics, memory usage and temperatures from Intel's Many Integrated Core (MIC) systems.
Synopsis:
<Plugin mic>
  ShowCPU true
  ShowCPUCores true
  ShowMemory true

  ShowTemperatures true
  Temperature vddg
  Temperature vddq
  IgnoreSelectedTemperature true

  ShowPower true
  Power total0
  Power total1
  IgnoreSelectedPower true
</Plugin>
The following options are valid inside the Plugin mic block:
Known temperature names are:
Known power names are:
The memory plugin provides the following configuration options:
This is useful for deploying collectd in a heterogeneous environment in which the sizes of physical memory vary.
The modbus plugin connects to a Modbus "slave" via Modbus/TCP or Modbus/RTU and reads register values. It supports reading single registers (unsigned 16 bit values), large integer values (unsigned 32 bit values) and floating point values (two registers interpreted as IEEE floats in big endian notation).
Synopsis:
<Data "voltage-input-1"> RegisterBase 0 RegisterType float RegisterCmd ReadHolding Type voltage Instance "input-1" </Data> <Data "voltage-input-2"> RegisterBase 2 RegisterType float RegisterCmd ReadHolding Type voltage Instance "input-2" </Data> <Data "supply-temperature-1"> RegisterBase 0 RegisterType Int16 RegisterCmd ReadHolding Type temperature Instance "temp-1" </Data> <Host "modbus.example.com"> Address "192.168.0.42" Port "502" Interval 60 <Slave 1> Instance "power-supply" Collect "voltage-input-1" Collect "voltage-input-2" </Slave> </Host> <Host "localhost"> Device "/dev/ttyUSB0" Baudrate 38400 Interval 20 <Slave 1> Instance "temperature" Collect "supply-temperature-1" </Slave> </Host>
Within <Data /> blocks, the following options are allowed:
Within <Host /> blocks, the following options are allowed:
Within <Slave /> blocks, the following options are allowed:
The MQTT plugin can send metrics to MQTT (Publish blocks) and receive values from MQTT (Subscribe blocks).
Synopsis:
<Plugin mqtt>
  <Publish "name">
    Host "mqtt.example.com"
    Prefix "collectd"
  </Publish>
  <Subscribe "name">
    Host "mqtt.example.com"
    Topic "collectd/#"
  </Subscribe>
</Plugin>
The plugin's configuration is in Publish and/or Subscribe blocks, configuring the sending and receiving direction respectively. The plugin will register a write callback named "mqtt/name" where name is the string argument given to the Publish block. Both types of blocks share many but not all of the following options. If an option is valid in only one of the blocks, it will be mentioned explicitly.
Options:
In Publish blocks, this option determines the QoS flag set on outgoing messages and defaults to 0. In Subscribe blocks, determines the maximum QoS setting the client is going to accept and defaults to 2. If the QoS flag on a message is larger than the maximum accepted QoS of a subscriber, the message's QoS will be downgraded.
An example topic name would be:
collectd/cpu-0/cpu-user
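Putting the QoS option described above into context, a Publish block requesting at-least-once delivery might look like this (a sketch based on the synopsis above; the host name is an example):

  <Plugin mqtt>
    <Publish "name">
      Host "mqtt.example.com"
      QoS 1
      Prefix "collectd"
    </Publish>
  </Plugin>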
The "mysql plugin" requires mysqlclient to be installed. It connects to one or more databases when started and keeps the connection up as long as possible. When the connection is interrupted for whatever reason it will try to re-connect. The plugin will complain loudly in case anything goes wrong.
This plugin issues the MySQL "SHOW STATUS" / "SHOW GLOBAL STATUS" command and collects information about MySQL network traffic, executed statements, requests, the query cache and threads by evaluating the "Bytes_{received,sent}", "Com_*", "Handler_*", "Qcache_*" and "Threads_*" return values. Please refer to the MySQL reference manual, 5.1.6. Server Status Variables for an explanation of these values.
Optionally, master and slave statistics may be collected in a MySQL replication setup. In that case, information about the synchronization state of the nodes is collected by evaluating the "Position" return value of the "SHOW MASTER STATUS" command and the "Seconds_Behind_Master", "Read_Master_Log_Pos" and "Exec_Master_Log_Pos" return values of the "SHOW SLAVE STATUS" command. See the MySQL reference manual, 12.5.5.21 SHOW MASTER STATUS Syntax and 12.5.5.31 SHOW SLAVE STATUS Syntax for details.
Synopsis:
  <Plugin mysql>
    <Database foo>
      Host "hostname"
      User "username"
      Password "password"
      Port "3306"
      MasterStats true
      ConnectTimeout 10
      SSLKey "/path/to/key.pem"
      SSLCert "/path/to/cert.pem"
      SSLCA "/path/to/ca.pem"
      SSLCAPath "/path/to/cas/"
      SSLCipher "DHE-RSA-AES256-SHA"
    </Database>

    <Database bar>
      Alias "squeeze"
      Host "localhost"
      Socket "/var/run/mysql/mysqld.sock"
      SlaveStats true
      SlaveNotifications true
    </Database>

    <Database galera>
      Alias "galera"
      Host "localhost"
      Socket "/var/run/mysql/mysqld.sock"
      WsrepStats true
    </Database>
  </Plugin>
A Database block defines one connection to a MySQL database. It accepts a single argument which specifies the name of the database. None of the other options are required. MySQL will use default values as documented in the "mysql_real_connect()" and "mysql_ssl_set()" sections in the MySQL reference manual.
Port "3306"
If Host is set to localhost (the default), this setting has no effect. See the documentation for the "mysql_real_connect" function for details.
Enable the collection of wsrep plugin statistics, used in Master-Master replication setups like MySQL Galera/Percona XtraDB Cluster. The user only needs privileges to execute 'SHOW GLOBAL STATUS'.
The netapp plugin can collect various performance and capacity information from a NetApp filer using the NetApp API.
Please note that NetApp has a wide line of products and a lot of different software versions for each of these products. This plugin was developed for a NetApp FAS3040 running OnTap 7.2.3P8 and tested on FAS2050 7.3.1.1L1, FAS3140 7.2.5.1 and FAS3020 7.2.4P9. It should work for most combinations of model and software version but it is very hard to test this. If you have used this plugin with other models and/or software versions, feel free to send us a mail to tell us about the results, even if it's just a short "It works".
To collect this data, collectd will log in to the NetApp via HTTP(S) using HTTP basic authentication.
Do not use a regular user for this! Create a special collectd user with just the minimum of capabilities needed. The user only needs the "login-http-admin" capability as well as a few more depending on which data will be collected. Required capabilities are documented below.
Synopsis
<Plugin "netapp"> <Host "netapp1.example.com"> Protocol "https" Address "10.0.0.1" Port 443 User "username" Password "aef4Aebe" Interval 30 <WAFL> Interval 30 GetNameCache true GetDirCache true GetBufferCache true GetInodeCache true </WAFL> <Disks> Interval 30 GetBusy true </Disks> <VolumePerf> Interval 30 GetIO "volume0" IgnoreSelectedIO false GetOps "volume0" IgnoreSelectedOps false GetLatency "volume0" IgnoreSelectedLatency false </VolumePerf> <VolumeUsage> Interval 30 GetCapacity "vol0" GetCapacity "vol1" IgnoreSelectedCapacity false GetSnapshot "vol1" GetSnapshot "vol3" IgnoreSelectedSnapshot false </VolumeUsage> <Quota> Interval 60 </Quota> <Snapvault> Interval 30 </Snapvault> <System> Interval 30 GetCPULoad true GetInterfaces true GetDiskOps true GetDiskIO true </System> <VFiler vfilerA> Interval 60 SnapVault true # ... </VFiler> </Host> </Plugin>
The netapp plugin accepts the following configuration options:
The VFiler block inherits all connection related settings from the surrounding Host block (which appear before the VFiler block) but they may be overwritten inside the VFiler block.
This feature is useful, for example, when using a VFiler as SnapVault target (supported since OnTap 8.1). In that case, the SnapVault statistics are not available in the host filer (vfiler0) but only in the respective VFiler context.
Optional
Type: string
Default: https
Valid options: http, https
Optional
Type: string
Default: The "host" block's name.
Optional
Type: integer
Default: 80 for protocol "http", 443 for protocol "https"
Mandatory
Type: string
Optional
Type: string
Default: name of the VFiler block
Note: This option may only be used inside VFiler blocks.
The following options decide what kind of data will be collected. You can either use them as a block and fine-tune various parameters inside this block, use them as a single statement to just accept all default values, or omit them to not collect any data.
The following options are valid inside all blocks:
The System block
This will collect various performance data about the whole system.
Note: To get this data the collectd user needs the "api-perf-object-get-instances" capability.
Note: These are the same values that the NetApp CLI command "sysstat" returns in the "CPU" field.
Optional
Type: boolean
Default: true
Result: Two value lists of type "cpu", and type instances "idle" and "system".
Note: These are the same values that the NetApp CLI command "sysstat" returns in the "Net kB/s" field.
Optional
Type: boolean
Default: true
Result: One value list of type "if_octects".
Note: These are the same values that the NetApp CLI command "sysstat" returns in the "Disk kB/s" field.
Optional
Type: boolean
Default: true
Result: One value list of type "disk_octets".
Note: These are the same values that the NetApp CLI command "sysstat" returns in the "NFS", "CIFS", "HTTP", "FCP" and "iSCSI" fields.
Optional
Type: boolean
Default: true
Result: A variable number of value lists of type "disk_ops_complex". Each type of operation will result in one value list with the name of the operation as type instance.
The WAFL block
This will collect various performance data about the WAFL file system. At the moment this just means cache performance.
Note: To get this data the collectd user needs the "api-perf-object-get-instances" capability.
Note: The interface to get these values is classified as "Diagnostics" by NetApp. This means that it is not guaranteed to be stable even between minor releases.
Type: boolean
Default: true
Result: One value list of type "cache_ratio" and type instance "name_cache_hit".
Type: boolean
Default: true
Result: One value list of type "cache_ratio" and type instance "find_dir_hit".
Type: boolean
Default: true
Result: One value list of type "cache_ratio" and type instance "inode_cache_hit".
Optional
Type: boolean
Default: true
Result: One value list of type "cache_ratio" and type instance "buf_hash_hit".
The Disks block
This will collect performance data about the individual disks in the NetApp.
Note: To get this data the collectd user needs the "api-perf-object-get-instances" capability.
Note: These are the same values that the NetApp CLI command "sysstat" returns in the "Disk util" field. Probably.
Optional
Type: boolean
Default: true
Result: One value list of type "percent" and type instance "disk_busy".
The VolumePerf block
This will collect various performance data about the individual volumes.
You can select which data to collect about which volume using the following options. They follow the standard ignorelist semantics.
Note: To get this data the collectd user needs the api-perf-object-get-instances capability.
Since the standard ignorelist functionality is used here, you can use a string starting and ending with a slash to specify regular expression matching: To match the volumes "vol0", "vol2" and "vol7", you can use this regular expression:
GetIO "/^vol[027]$/"
If no regular expression is specified, an exact match is required. Both regular and exact matching are case sensitive.
If no volume was specified at all for any of the three options, that data will be collected for all available volumes.
See /"IGNORELISTS" for details.
When set to false, data will only be collected for the specified volumes and all other volumes will be ignored.
If no volumes have been specified with the above Get* options, all volumes will be collected regardless of the IgnoreSelected* option.
Defaults to false.
The VolumeUsage block
This will collect capacity data about the individual volumes.
Note: To get this data the collectd user needs the api-volume-list-info capability.
There will be type_instances "used" and "free" for the number of used and available bytes on the volume. If the volume has some space reserved for snapshots, a type_instance "snap_reserved" will be available. If the volume has SIS enabled, a type_instance "sis_saved" will be available. This is the number of bytes saved by the SIS feature.
Note: The current NetApp API has a bug that results in this value being reported as a 32 bit number. This plugin tries to guess the correct number which works most of the time. If you see strange values here, bug NetApp support to fix this.
Repeat this option to specify multiple volumes.
Usually, the space used for snapshots is included in the space reported as "used". If snapshot information is collected as well, the space used for snapshots is subtracted from the used space.
To make things even more interesting, it is possible to reserve space to be used for snapshots. If the space required for snapshots is less than that reserved space, there is "reserved free" and "reserved used" space in addition to "free" and "used". If the space required for snapshots exceeds the reserved space, that part allocated in the normal space is subtracted from the "used" space again.
Repeat this option to specify multiple volumes.
The Quota block
This will collect (tree) quota statistics (used disk space and number of used files). This mechanism is useful to get usage information for single qtrees. In case the quotas are not used for any other purpose, an entry similar to the following in "/etc/quotas" would be sufficient:
/vol/volA/some_qtree tree - - - - -
After adding the entry, issue "quota on -w volA" on the NetApp filer.
The SnapVault block
This will collect statistics about the time and traffic of SnapVault(R) transfers.
The "netlink" plugin uses a netlink socket to query the Linux kernel about statistics of various interface and routing aspects.
When configured with Interface, only the basic statistics will be collected, namely octets, packets, and errors. These statistics are collected by the "interface" plugin, too, so using both at the same time provides no benefit.
When configured with VerboseInterface all counters except the basic ones will be collected, so that no data needs to be collected twice if you use the "interface" plugin. This includes dropped packets, received multicast packets, collisions and a whole zoo of differentiated RX and TX errors. You can try the following command to get an idea of what awaits you:
ip -s -s link list
If Interface is All, all interfaces will be selected.
QDiscs and classes are identified by their type and handle (or classid). Filters don't necessarily have a handle, therefore the parent's handle is used. The notation used in collectd differs from that used in tc(1) in that it doesn't skip the major or minor number if it's zero and doesn't print special ids by their name. So, for example, a qdisc may be identified by "pfifo_fast-1:0" even though the minor number of all qdiscs is zero and thus not displayed by tc(1).
If QDisc, Class, or Filter is given without the second argument, i.e. without an identifier, all qdiscs, classes, or filters that are associated with that interface will be collected.
Since a filter itself doesn't necessarily have a handle, the parent's handle is used. This may lead to problems when more than one filter is attached to a qdisc or class. This isn't nice, but we don't know how this could be done any better. If you have an idea, please don't hesitate to tell us.
As with the Interface option you can specify All as the interface, meaning all interfaces.
Here are some examples to help you understand the above text more easily:
  <Plugin netlink>
    VerboseInterface "All"
    QDisc "eth0" "pfifo_fast-1:0"
    QDisc "ppp0"
    Class "ppp0" "htb-1:10"
    Filter "ppp0" "u32-1:0"
  </Plugin>
See /"IGNORELISTS" for details.
The Network plugin sends data to a remote instance of collectd, receives data from a remote instance, or both at the same time. Data which has been received from the network is usually not transmitted again, but this can be activated, see the Forward option below.
The default IPv6 multicast group is "ff18::efc0:4a42". The default IPv4 multicast group is 239.192.74.66. The default UDP port is 25826.
Both Server and Listen can be used as a single option or as a block. When used as a block, the given options are valid for this socket only. The following example will export the metrics twice: once to an "internal" server (without encryption and signing) and once to an external server (with cryptographic signature):
<Plugin "network"> # Export to an internal server # (demonstrates usage without additional options) Server "collectd.internal.tld" # Export to an external server # (demonstrates usage with signature options) <Server "collectd.external.tld"> SecurityLevel "sign" Username "myhostname" Password "ohl0eQue" </Server> </Plugin>
The argument Host may be a hostname, an IPv4 address or an IPv6 address. The optional second argument specifies a port number or a service name. If not given, the default, 25826, is used.
The following options are recognized within Server blocks:
This feature is only available if the network plugin was linked with libgcrypt.
This feature is only available if the network plugin was linked with libgcrypt.
This feature is only available if the network plugin was linked with libgcrypt.
The argument Host may be a hostname, an IPv4 address or an IPv6 address. If the argument is a multicast address the daemon will join that multicast group. The optional second argument specifies a port number or a service name. If not given, the default, 25826, is used.
The following options are recognized within "<Listen>" blocks:
This feature is only available if the network plugin was linked with libgcrypt.
The file format is very simple: Each line consists of a username followed by a colon and any number of spaces followed by the password. To demonstrate, an example file could look like this:
  user0: foo
  user1: bar
Each time a packet is received, the modification time of the file is checked using stat(2). If the file has been changed, the contents are re-read. While the file is being read, it is locked using fcntl(2).
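A receiving socket that verifies signatures against such a file might be configured like this (a sketch; SecurityLevel and AuthFile are the options described in this section, and the listen address and file path are examples):

  <Plugin "network">
    <Listen "0.0.0.0" "25826">
      SecurityLevel "sign"
      AuthFile "/etc/collectd/auth_file"
    </Listen>
  </Plugin>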
On the server side, this limit should be set to the largest value used on any client. Likewise, the value on the client must not be larger than the value on the server, or data will be lost.
Compatibility: Versions prior to version 4.8 used a fixed sized buffer of 1024 bytes. Versions 4.8, 4.9 and 4.10 used a default value of 1024 bytes to avoid problems when sending data to an older server.
The nfs plugin collects information about the usage of the Network File System (NFS). It counts the number of procedure calls for each procedure, grouped by version and whether the system runs as server or client.
It is possible to omit metrics for a specific NFS version by setting one or more of the following options to false (all of them default to true).
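For example, to skip the usually unused NFSv2 counters (a sketch; the option name ReportV2 is an assumption for one of the per-version options mentioned above):

  LoadPlugin nfs
  <Plugin nfs>
    ReportV2 false
  </Plugin>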
This plugin collects the number of connections and requests handled by the "nginx daemon" (speak: engine X), an HTTP and mail server/proxy. It queries the page provided by the "ngx_http_stub_status_module" module, which isn't compiled by default. Please refer to <http://wiki.codemongers.com/NginxStubStatusModule> for more information on how to compile and configure nginx and this module.
The following options are accepted by the "nginx plugin":
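For reference, a minimal configuration pointing the plugin at the stub status page might look like this (a sketch; URL is the plugin's option for the status page location, and the address is an example):

  LoadPlugin nginx
  <Plugin "nginx">
    URL "http://localhost/nginx_status"
  </Plugin>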
This plugin sends a desktop notification to a notification daemon, as defined in the Desktop Notification Specification. To actually display the notifications, notification-daemon is required and collectd has to be able to access the X server (i. e., the "DISPLAY" and "XAUTHORITY" environment variables have to be set correctly) and the D-Bus message bus.
The Desktop Notification Specification can be found at <http://www.galago-project.org/specs/notification/>.
The notify_email plugin uses the ESMTP library to send notifications to a configured email address.
libESMTP is available from <http://www.stafford.uklinux.net/libesmtp/>.
Available configuration options:
Default: "root@localhost"
At least one Recipient must be present for the plugin to work correctly.
Default: "localhost"
Default: 25
Default: "Collectd notify: %s@%s"
The notify_nagios plugin writes notifications to Nagios' command file as a passive service check result.
Available configuration options:
The "ntpd" plugin collects per-peer ntp data such as time offset and time dispersion.
For talking to ntpd, it mimics what the ntpdc control program does on the wire - using mode 7 specific requests. This mode is deprecated with newer ntpd releases (4.2.7p230 and later). For the "ntpd" plugin to work correctly with them, the ntp daemon must be explicitly configured to enable mode 7 (which is disabled by default). Refer to the ntp.conf(5) manual page for details.
Available configuration options for the "ntpd" plugin:
If two refclock peers use the same driver and this is false, the plugin will try to write simultaneous measurements from both to the same type instance. This will result in error messages in the log and only one set of measurements making it through.
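A typical configuration might therefore look like this (a sketch; Host, Port, ReverseLookups and IncludeUnitID are assumptions for the plugin's option names, since the full option list is not reproduced above):

  LoadPlugin ntpd
  <Plugin ntpd>
    Host "localhost"
    Port 123
    ReverseLookups false
    IncludeUnitID true
  </Plugin>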
"ln -s some.crt ./$(openssl x509 -hash -noout -in some.crt).0"
Alternatively, the package openssl-perl provides a command "c_rehash" that will generate links like the one described above for ALL certs in a given folder. Example usage: "c_rehash /path/to/certs/folder"
The olsrd plugin connects to the TCP port opened by the txtinfo plugin of the Optimized Link State Routing daemon and reads information about the current state of the meshed network.
The following configuration options are understood:
Defaults to Detail.
Defaults to Summary.
Defaults to Summary.
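Combining the defaults above, a configuration that collects detailed link information but skips routes and topology might look like this (a sketch; CollectLinks, CollectRoutes and CollectTopology are assumptions for the options whose defaults are listed above, with "No", "Summary" and "Detail" as their accepted values):

  LoadPlugin olsrd
  <Plugin olsrd>
    Host "127.0.0.1"
    Port "2006"
    CollectLinks "Detail"
    CollectRoutes "No"
    CollectTopology "No"
  </Plugin>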
EXPERIMENTAL! See notes below.
The "onewire" plugin uses the owcapi library from the owfs project <http://owfs.org/> to read sensors connected via the onewire bus.
It can be used in two possible modes - standard or advanced.
In the standard mode only temperature sensors (sensors with the family codes 10, 22 and 28 - e.g. DS1820, DS18S20, DS1920) can be read. If you have other sensors you would like to have included, please send a short request to the mailing list. You can select sensors to be read or to be ignored using the IgnoreSelected option. When no list is provided the whole bus is walked and all sensors are read.
Hubs (the DS2409 chips) are working, but see the note below on why this plugin is experimental.
In the advanced mode you can configure any sensor to be read (only numerical values) using the full OWFS path (e.g. "/uncached/10.F10FCA000800/temperature"). In this mode you have to list all the sensors. Neither the default bus walk nor IgnoreSelected are used here. Address and type (file) are extracted from the path automatically and should produce a structure compatible with the "standard" mode (from a path such as "/uncached/10.F10FCA000800/temperature" the address part "F10FCA000800" is extracted, and the rest after the slash is considered the type - here "temperature"). There are two advantages to this mode: you can access virtually any sensor (not just temperature), you can select whether to use cached or directly read values, and it is slightly faster. The downside is a more complex configuration.
The two modes are distinguished automatically by the format of the address. It is not possible to mix the two modes. Once a full path is detected in any Sensor then the whole addressing (all sensors) is considered to be this way (and as standard addresses will fail parsing they will be ignored).
Though the documentation claims to automatically recognize the given address format, with version 2.7p4 we had to specify the type explicitly. So with that version, the following configuration worked for us:
  <Plugin onewire>
    Device "-s localhost:4304"
  </Plugin>
This directive is required and does not have a default value.
In the advanced mode the Sensor specifies full OWFS path - e.g. "/uncached/10.F10FCA000800/temperature" (or when cached values are OK "/10.F10FCA000800/temperature"). IgnoreSelected is not used.
As there can be multiple devices on the bus you can list multiple sensors (use multiple Sensor elements).
See /"IGNORELISTS" for details.
Used only in the standard mode - see above.
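An advanced-mode configuration therefore simply lists full OWFS paths (a sketch; the first path is the example used above, the second is a hypothetical additional sensor):

  <Plugin onewire>
    Device "-s localhost:4304"
    Sensor "/uncached/10.F10FCA000800/temperature"
    Sensor "/26.AF9C32000000/humidity"
  </Plugin>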
EXPERIMENTAL! The "onewire" plugin is experimental, because it doesn't yet work with big setups. It works with one sensor being attached to one controller, but as soon as you throw in a couple more senors and maybe a hub or two, reading all values will take more than ten seconds (the default interval). We will probably add some separate thread for reading the sensors and some cache or something like that, but it's not done yet. We will try to maintain backwards compatibility in the future, but we can't promise. So in short: If it works for you: Great! But keep in mind that the config might change, though this is unlikely. Oh, and if you want to help improving this plugin, just send a short notice to the mailing list. Thanks :)
To use the "openldap" plugin you first need to configure the OpenLDAP server correctly. The backend database "monitor" needs to be loaded and working. See slapd-monitor(5) for the details.
The configuration of the "openldap" plugin consists of one or more Instance blocks. Each block requires one string argument as the instance name. For example:
<Plugin "openldap"> <Instance "foo"> URL "ldap://localhost/" </Instance> <Instance "bar"> URL "ldaps://localhost/" </Instance> </Plugin>
The instance name will be used as the plugin instance. To emulate the old (version 4) behavior, you can use an empty string (""). In order for the plugin to work correctly, each instance name must be unique. This is not enforced by the plugin and it is your responsibility to ensure it is.
The following options are accepted within each Instance block:
The OpenVPN plugin reads a status file maintained by OpenVPN and gathers traffic statistics about connected clients.
To set up OpenVPN to write to the status file periodically, use the --status option of OpenVPN.
So, in a nutshell you need:
  openvpn $OTHER_OPTIONS \
    --status "/var/run/openvpn-status" 10
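On the collectd side, the plugin then only needs to be pointed at that file (a sketch; StatusFile is the option documented below):

  LoadPlugin openvpn
  <Plugin openvpn>
    StatusFile "/var/run/openvpn-status"
  </Plugin>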
Available options:
The "oracle" plugin uses the OracleX Call Interface (OCI) to connect to an OracleX Database and lets you execute SQL statements there. It is very similar to the "dbi" plugin, because it was written around the same time. See the "dbi" plugin's documentation above for details.
  <Plugin oracle>
    <Query "out_of_stock">
      Statement "SELECT category, COUNT(*) AS value FROM products WHERE in_stock = 0 GROUP BY category"
      <Result>
        Type "gauge"
        # InstancePrefix "foo"
        InstancesFrom "category"
        ValuesFrom "value"
      </Result>
    </Query>
    <Database "product_information">
      #Plugin "warehouse"
      ConnectID "db01"
      Username "oracle"
      Password "secret"
      Query "out_of_stock"
    </Database>
  </Plugin>
Query blocks
The Query blocks are handled identically to the Query blocks of the "dbi" plugin. Please see its documentation above for details on how to specify queries.
Database blocks
Database blocks define a connection to a database and which queries should be sent to that database. Each database needs a "name" as string argument in the starting tag of the block. This name will be used as "PluginInstance" in the values submitted to the daemon. Other than that, that name is not used.
The ovs_events plugin monitors the link status of Open vSwitch (OVS) connected interfaces, dispatches the values to collectd and sends a notification whenever a link state change occurs. This plugin uses the OVS database to get link state change notifications.
Synopsis:
<Plugin "ovs_events"> Port 6640 Address "127.0.0.1" Socket "/var/run/openvswitch/db.sock" Interfaces "br0" "veth0" SendNotification true DispatchValues false </Plugin>
The plugin provides the following configuration options:
Default: empty (all interfaces on all bridges are monitored)
Note: By default, the global interval setting is used within which to retrieve the OVS link status. To configure a plugin-specific interval, please use the Interval option of the ovs_events LoadPlugin block. For milliseconds, simply divide the time by 1000; for example, if the desired interval is 50ms, set Interval to 0.05.
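For example, the 50ms interval just described would be configured like this (a sketch using the LoadPlugin-block Interval option):

  <LoadPlugin ovs_events>
    Interval 0.05
  </LoadPlugin>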
The ovs_stats plugin collects statistics of OVS connected interfaces. This plugin uses the OVSDB management protocol (RFC 7047) monitor mechanism to get statistics from OVSDB.
Synopsis:
<Plugin "ovs_stats"> Port 6640 Address "127.0.0.1" Socket "/var/run/openvswitch/db.sock" Bridges "br0" "br_ext" </Plugin>
The plugin provides the following configuration options:
Default: empty (monitor all bridges)
This plugin embeds a Perl-interpreter into collectd and provides an interface to collectd's plugin system. See collectd-perl(5) for its documentation.
The Pinba plugin receives profiling information from Pinba, an extension for the PHP interpreter. At the end of executing a script, i.e. after a PHP-based webpage has been delivered, the extension will send a UDP packet containing timing information, peak memory usage and so on. The plugin will wait for such packets, parse them and account the provided information, which is then dispatched to the daemon once per interval.
Synopsis:
  <Plugin pinba>
    Address "::0"
    Port "30002"
    # Overall statistics for the website.
    <View "www-total">
      Server "www.example.com"
    </View>
    # Statistics for www-a only
    <View "www-a">
      Host "www-a.example.com"
      Server "www.example.com"
    </View>
    # Statistics for www-b only
    <View "www-b">
      Host "www-b.example.com"
      Server "www.example.com"
    </View>
  </Plugin>
The plugin provides the following configuration options:
The Ping plugin starts a new thread which sends ICMP "ping" packets to the configured hosts periodically and measures the network latency. Whenever the "read" function of the plugin is called, it submits the average latency, the standard deviation and the drop rate for each host.
Available configuration options:
Default: 1.0
Default: 0.9
Default: -1 (disabled)
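A configuration spelling out the defaults listed above might look like this (a sketch; Host is the repeatable host option, while Interval, Timeout and MaxMissed are assumptions for the option names behind those defaults; the host names are examples):

  LoadPlugin ping
  <Plugin ping>
    Host "host1.example.com"
    Host "host2.example.com"
    Interval 1.0
    Timeout 0.9
    MaxMissed -1
  </Plugin>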
The "postgresql" plugin queries statistics from PostgreSQL databases. It keeps a persistent connection to all configured databases and tries to reconnect if the connection has been interrupted. A database is configured by specifying a Database block as described below. The default statistics are collected from PostgreSQL's statistics collector which thus has to be enabled for this plugin to work correctly. This should usually be the case by default. See the section "The Statistics Collector" of the PostgreSQL Documentation for details.
By specifying custom database queries using a Query block as described below, you may collect any data that is available from some PostgreSQL database. This way, you are able to access statistics of external daemons which are available in a PostgreSQL database or use future or special statistics provided by PostgreSQL without the need to upgrade your collectd installation.
Starting with version 5.2, the "postgresql" plugin supports writing data to PostgreSQL databases as well. This has been implemented in a generic way. You need to specify an SQL statement which will then be executed by collectd in order to write the data (see below for details). The benefit of that approach is that there is no fixed database layout. Rather, the layout may be optimized for the current setup.
The PostgreSQL Documentation manual can be found at <http://www.postgresql.org/docs/manuals/>.
  <Plugin postgresql>
    <Query magic>
      Statement "SELECT magic FROM wizard WHERE host = $1;"
      Param hostname
      <Result>
        Type gauge
        InstancePrefix "magic"
        ValuesFrom magic
      </Result>
    </Query>

    <Query rt36_tickets>
      Statement "SELECT COUNT(type) AS count, type \
                 FROM (SELECT CASE \
                       WHEN resolved = 'epoch' THEN 'open' \
                       ELSE 'resolved' END AS type \
                       FROM tickets) type \
                 GROUP BY type;"
      <Result>
        Type counter
        InstancePrefix "rt36_tickets"
        InstancesFrom "type"
        ValuesFrom "count"
      </Result>
    </Query>

    <Writer sqlstore>
      Statement "SELECT collectd_insert($1, $2, $3, $4, $5, $6, $7, $8, $9);"
      StoreRates true
    </Writer>

    <Database foo>
      Plugin "kingdom"
      Host "hostname"
      Port "5432"
      User "username"
      Password "secret"
      SSLMode "prefer"
      KRBSrvName "kerberos_service_name"
      Query magic
    </Database>

    <Database bar>
      Interval 300
      Service "service_name"
      Query backends # predefined
      Query rt36_tickets
    </Database>

    <Database qux>
      # ...
      Writer sqlstore
      CommitInterval 10
    </Database>
  </Plugin>
The Query block defines one database query which may later be used by a database definition. It accepts a single mandatory argument which specifies the name of the query. The names of all queries have to be unique (see the MinVersion and MaxVersion options below for an exception to this rule).
In each Query block, there is one or more Result blocks. Multiple Result blocks may be used to extract multiple values from a single query.
The following configuration options are available to define the query:
Any SQL command which may return data (such as "SELECT" or "SHOW") is allowed. Note, however, that only a single command may be used. Semicolons are allowed as long as only a single non-empty command is specified.
The returned lines will be handled separately one after another.
Please note that parameters are only supported by PostgreSQL's protocol version 3 and above which was introduced in version 7.4 of PostgreSQL.
The version has to be specified as the concatenation of the major, minor and patch-level versions, each represented as two-decimal-digit numbers. For example, version 8.2.3 will become 80203.
The Result block defines how to handle the values returned from the query. It defines which column holds which value and how to dispatch that value to the daemon.
This option is mandatory.
The plugin itself does not check whether or not all built instances are different. It is your responsibility to assure that each is unique.
Both options are optional. If none is specified, the type instance will be empty.
The actual data type, as seen by PostgreSQL, is not that important as long as it represents numbers. The plugin will automatically cast the values to the right type if it knows how to do that. For that, it uses the strtoll(3) and strtod(3) functions, so anything supported by those functions is supported by the plugin as well.
This option is required inside a Result block and may be specified multiple times. If multiple ValuesFrom options are specified, the columns are read in the given order.
The following predefined queries are available (the definitions can be found in the postgresql_default.conf file which, by default, is available at "prefix/share/collectd/"):
In addition, the following detailed queries are available by default. Please note that each of those queries collects information by table, thus, potentially producing a lot of data. For details see the description of the non-by_table queries above.
The Writer block defines a PostgreSQL writer backend. It accepts a single mandatory argument specifying the name of the writer. This will then be used in the Database specification in order to activate the writer instance. The names of all writers have to be unique. The following options may be specified:
Nine parameters will be passed to the statement and should be specified as tokens $1, $2, through $9 in the statement string. The following values are made available through those parameters:
In general, it is advisable to create and call a custom function in the PostgreSQL database for this purpose. Any procedural language supported by PostgreSQL will do (see chapter "Server Programming" in the PostgreSQL manual for details).
The Database block defines one PostgreSQL database for which to collect statistics. It accepts a single mandatory argument which specifies the database name. None of the other options are required. PostgreSQL will use default values as documented in the section "CONNECTING TO A DATABASE" in the psql(1) manpage. However, be aware that those defaults may be influenced by the user collectd is run as and special environment variables. See the manpage for details.
This option is also used to determine the hostname that is associated with a collected data set. If it has been omitted, or if it either begins with a slash or equals localhost, it will be replaced with the global hostname definition of collectd. Any other value will be passed literally to collectd when dispatching values. Also see the global Hostname and FQDNLookup options.
Each writer will register a flush callback which may be used when having long transactions enabled (see the CommitInterval option above). When issuing the FLUSH command (see collectd-unixsock(5) for details) the current transaction will be committed right away. Two different kinds of flush callbacks are available with the "postgresql" plugin:
The "powerdns" plugin queries statistics from an authoritative PowerDNS nameserver and/or a PowerDNS recursor. Since both offer a wide variety of values, many of which are probably meaningless to most users, but may be useful for some. So you may chose which values to collect, but if you don't, some reasonable defaults will be collected.
<Plugin "powerdns"> <Server "server_name"> Collect "latency" Collect "udp-answers" "udp-queries" Socket "/var/run/pdns.controlsocket" </Server> <Recursor "recursor_name"> Collect "questions" Collect "cache-hits" "cache-misses" Socket "/var/run/pdns_recursor.controlsocket" </Recursor> LocalSocket "/opt/collectd/var/run/collectd-powerdns" </Plugin>
The method of getting the values differs for Server and Recursor blocks: When querying the server a "SHOW *" command is issued in any case, because that's the only way of getting multiple values out of the server at once. collectd then picks out the values you have selected. When querying the recursor, a command is generated to query exactly these values. So if you specify invalid fields when querying the recursor, a syntax error may be returned by the daemon and collectd may not collect any values at all.
If no Collect statement is given, the following Server values will be collected:
The following Recursor values will be collected by default:
Please note that up to that point collectd doesn't need to know which values are available on the server, so newly added values do not require a change to this mechanism. However, the values must be mapped to collectd's naming scheme, which is done using a lookup table that lists all known values. If values are added in the future and collectd does not know about them, you will get an error much like this:
powerdns plugin: submit: Not found in lookup table: foobar = 42
In this case please file a bug report with the collectd team.
Collects information about processes of the local system.
By default, with no process matches configured, only general statistics are collected: the number of processes in each state and the fork rate.
Process matches can be configured by Process and ProcessMatch options. These may also be a block in which further options may be specified.
The statistics collected for matched processes are:
- size of the resident segment size (RSS)
- user- and system-time used
- number of processes
- number of threads
- number of open files (under Linux)
- number of memory mapped files (under Linux)
- I/O data (where available)
- context switches (under Linux)
- minor and major pagefaults.
Synopsis:
  <Plugin processes>
    CollectFileDescriptor true
    CollectContextSwitch true
    Process "name"
    ProcessMatch "name" "regex"
    <Process "collectd">
      CollectFileDescriptor false
      CollectContextSwitch false
    </Process>
    <ProcessMatch "name" "regex">
      CollectFileDescriptor false
      CollectContextSwitch true
    </ProcessMatch>
  </Plugin>
Some platforms have a limit on the length of process names. Name must stay below this limit.
Options CollectContextSwitch and CollectFileDescriptor may be used inside Process and ProcessMatch blocks - then they affect corresponding match only. Otherwise they set the default value for subsequent matches.
Collects a lot of information about various network protocols, such as IP, TCP, UDP, etc.
Available configuration options:
You can use regular expressions to match a large number of values with just one configuration option. To select all "extended" TCP values, you could use the following statement:
Value "/^TcpExt:/"
Whether only matched values are selected or all matched values are ignored depends on the IgnoreSelected option. By default, only matched values are selected. If no value is configured at all, all values will be selected.
See /"IGNORELISTS" for details.
This plugin embeds a Python-interpreter into collectd and provides an interface to collectd's plugin system. See collectd-python(5) for its documentation.
The "routeros" plugin connects to a device running RouterOS, the Linux-based operating system for routers by MikroTik. The plugin uses librouteros to connect and reads information about the interfaces and wireless connections of the device. The configuration supports querying multiple routers:
<Plugin "routeros"> <Router> Host "router0.example.com" User "collectd" Password "secr3t" CollectInterface true CollectCPULoad true CollectMemory true </Router> <Router> Host "router1.example.com" User "collectd" Password "5ecret" CollectInterface true CollectRegistrationTable true CollectDF true CollectDisk true </Router> </Plugin>
As you can see above, the configuration of the routeros plugin consists of one or more <Router> blocks. Within each block, the following options are understood:
The Redis plugin connects to one or more Redis servers and gathers information about each server's state. For each server there is a Node block which configures the connection parameters for this node.
  <Plugin redis>
    <Node "example">
      Host "localhost"
      Port "6379"
      Timeout 2000
      <Query "LLEN myqueue">
        Type "queue_length"
        Instance "myqueue"
      </Query>
    </Node>
  </Plugin>
The information shown in the synopsis above is the default configuration which is used by the plugin if no configuration is present.
The "rrdcached" plugin uses the RRDtool accelerator daemon, rrdcached(1), to store values to RRD files in an efficient manner. The combination of the "rrdcached" plugin and the "rrdcached" daemon is very similar to the way the "rrdtool" plugin works (see below). The added abstraction layer provides a number of benefits, though: Because the cache is not within "collectd" anymore, it does not need to be flushed when "collectd" is to be restarted. This results in much shorter (if any) gaps in graphs, especially under heavy load. Also, the "rrdtool" command line utility is aware of the daemon so that it can flush values to disk automatically when needed. This allows one to integrate automated flushing of values into graphing solutions much more easily.
There are disadvantages, though: The daemon may reside on a different host, so it may not be possible for "collectd" to create the appropriate RRD files anymore. And even if "rrdcached" runs on the same host, it may run in a different base directory, so relative paths may do weird stuff if you're not careful.
So the recommended configuration is to let "collectd" and "rrdcached" run on the same host, communicating via a UNIX domain socket. The DataDir setting should be set to an absolute path, so that a changed base directory does not result in RRD files being created / expected in the wrong place.
<Plugin "rrdcached"> DaemonAddress "unix:/var/run/rrdcached.sock" </Plugin>
So for each timespan, it calculates how many PDPs need to be consolidated into one CDP by calculating:

  number of PDPs = timespan / (stepsize * rrarows)
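For instance, with a timespan of one year (31,557,600 seconds), a step size of 10 seconds and 1200 rows, each CDP consolidates 31557600 / (10 * 1200) ≈ 2630 PDPs; at 10 seconds per PDP, one CDP then covers roughly 7.3 hours.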
Bottom line is, set this no smaller than the width of your graphs in pixels. The default is 1200.
For more information on how RRA-sizes are calculated see RRARows above.
Statistics are read via rrdcached's socket using the STATS command. See rrdcached(1) for details.
You can use the settings StepSize, HeartBeat, RRARows, and XFF to fine-tune your RRD-files. Please read rrdcreate(1) if you encounter problems using these settings. If you don't want to dive into the depths of RRDtool, you can safely ignore these settings.
So for each timespan, it calculates how many PDPs need to be consolidated into one CDP by calculating:

  number of PDPs = timespan / (stepsize * rrarows)
Bottom line is, set this no smaller than the width of your graphs in pixels. The default is 1200.
For more information on how RRA-sizes are calculated see RRARows above.
Defaults to 10x CacheTimeout. CacheFlush must be larger than or equal to CacheTimeout, otherwise the above default is used.
This setting is designed for very large setups. Setting this option to a value between 25 and 80 updates per second, depending on your hardware, will leave the server responsive enough to draw graphs even while all the cached values are written to disk. Flushed values, i. e. values that are forced to disk by the FLUSH command, are not affected by this limit. They are still written as fast as possible, so that web frontends have up to date data when generating graphs.
For example: If you have 100,000 RRD files and set WritesPerSecond to 30 updates per second, writing all values to disk will take approximately 56 minutes. Together with the flushing ability that's integrated into "collection3" you'll end up with a responsive and fast system, up to date graphs and basically a "backup" of your values every hour.
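In configuration form, that setup corresponds to (a sketch; WritesPerSecond as described above, DataDir as an example path):

  <Plugin rrdtool>
    DataDir "/var/lib/collectd/rrd"
    WritesPerSecond 30
  </Plugin>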
The Sensors plugin uses lm_sensors to retrieve sensor values. This means that all the needed modules have to be loaded and lm_sensors has to be configured (most likely by editing /etc/sensors.conf). Read sensors.conf(5) for details.
The lm_sensors homepage can be found at <http://secure.netroedge.com/~lm78/>.
See /"IGNORELISTS" for details.
The sigrok plugin uses libsigrok to retrieve measurements from any device supported by the sigrok <http://sigrok.org/> project.
Synopsis
  <Plugin sigrok>
    LogLevel 3
    <Device "AC Voltage">
      Driver "fluke-dmm"
      MinimumInterval 10
      Conn "/dev/ttyUSB2"
    </Device>
    <Device "Sound Level">
      Driver "cem-dt-885x"
      Conn "/dev/ttyUSB1"
    </Device>
  </Plugin>
The default MinimumInterval is 0, meaning measurements received from the device are always dispatched to collectd. When throttled, unused measurements are discarded.
The "smart" plugin collects SMART information from physical disks. Values collectd include temperature, power cycle count, poweron time and bad sectors. Also, all SMART attributes are collected along with the normalized current value, the worst value, the threshold and a human readable value.
Using the following two options you can ignore some disks or configure the collection only of specific disks.
Disk "sdd" Disk "/hda[34]/"
See /"IGNORELISTS" for details.
Since the configuration of the "snmp plugin" is a little more complicated than that of other plugins, its documentation has been moved to its own manpage, collectd-snmp(5). Please see there for details.
The snmp_agent plugin is an AgentX subagent that receives and handles queries from an SNMP master agent and returns the data collected by read plugins. The snmp_agent plugin handles requests only for OIDs specified in the configuration file. To handle SNMP queries the plugin gets data from collectd and translates requested values from collectd's internal format to SNMP format. This plugin is a generic plugin and cannot work without configuration. For more details on AgentX subagents see <http://www.net-snmp.org/tutorial/tutorial-5/toolkit/demon/>.
Synopsis:
  <Plugin snmp_agent>
    <Data "memAvailReal">
      Plugin "memory"
      #PluginInstance "some"
      Type "memory"
      TypeInstance "free"
      OIDs "1.3.6.1.4.1.2021.4.6.0"
    </Data>
    <Table "ifTable">
      IndexOID "IF-MIB::ifIndex"
      SizeOID "IF-MIB::ifNumber"
      <Data "ifDescr">
        Instance true
        Plugin "interface"
        OIDs "IF-MIB::ifDescr"
      </Data>
      <Data "ifOctets">
        Plugin "interface"
        Type "if_octets"
        TypeInstance ""
        OIDs "IF-MIB::ifInOctets" "IF-MIB::ifOutOctets"
      </Data>
    </Table>
  </Plugin>
There are two types of blocks that can be contained in the "<Plugin snmp_agent>" block: Data and Table:
The Data block
The Data block defines a list of OIDs that are to be handled. This block can define scalar or table OIDs. If a Data block is defined inside of a Table block it represents table OIDs. The following options can be set:
The Table block
The Table block defines a collection of Data blocks that belong to one SNMP table. In addition to multiple Data blocks the following options can be set:
The statsd plugin listens to a UDP socket, reads "events" in the statsd protocol and dispatches rates or other aggregates of these numbers periodically.
The plugin implements the Counter, Timer, Gauge and Set types which are dispatched as the collectd types "derive", "latency", "gauge" and "objects" respectively.
The following configuration options are valid:
Different percentiles can be calculated by setting this option several times. If none are specified, no percentiles are calculated / dispatched.
Please note that reported timer values less than 0.001 are ignored in all Timer* reports.
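A configuration listening on the default statsd port and reporting two percentiles might look like this (a sketch; TimerPercentile is the repeatable percentile option described above, while Host and Port are assumptions for the socket options):

  LoadPlugin statsd
  <Plugin statsd>
    Host "::"
    Port "8125"
    TimerPercentile 50.0
    TimerPercentile 95.0
  </Plugin>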
The Swap plugin collects information about used and available swap space. On Linux and Solaris, the following options are available:
This option is only available if the Swap plugin can read "/proc/swaps" (under Linux) or use the swapctl(2) mechanism (under Solaris).
This is useful for deploying collectd in a heterogeneous environment, where swap sizes differ and you want to specify generic thresholds or similar.
This is useful for the cases when swap I/O is not necessary, is not available, or is not reliable.
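Putting these options together, a percentage-only setup without swap I/O might look like this (a sketch; ValuesAbsolute, ValuesPercentage and ReportIO are assumptions for the option names behind the descriptions above):

  LoadPlugin swap
  <Plugin swap>
    ValuesAbsolute false
    ValuesPercentage true
    ReportIO false
  </Plugin>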
Please note that debug is only available if collectd has been compiled with debugging support.
The "table plugin" provides generic means to parse tabular data and dispatch user specified values. Values are selected based on column numbers. For example, this plugin may be used to get values from the Linux proc(5) filesystem or CSV (comma separated values) files.
  <Plugin table>
    <Table "/proc/slabinfo">
      #Plugin "slab"
      Instance "slabinfo"
      Separator " "
      <Result>
        Type gauge
        InstancePrefix "active_objs"
        InstancesFrom 0
        ValuesFrom 1
      </Result>
      <Result>
        Type gauge
        InstancePrefix "objperslab"
        InstancesFrom 0
        ValuesFrom 4
      </Result>
    </Table>
  </Plugin>
The configuration consists of one or more Table blocks, each of which configures one file to parse. Within each Table block, there are one or more Result blocks, which configure which data to select and how to interpret it.
The following options are available inside a Table block:
A horizontal tab, newline and carriage return may be specified by "\\t", "\\n" and "\\r" respectively. Please note that the double backslashes are required because of collectd's config parsing.
The following options are available inside a Result block:
The plugin itself does not check whether or not all built instances are different. It is your responsibility to assure that each is unique. This is especially true, if you do not specify InstancesFrom: You have to make sure that the table only contains one row.
If neither InstancePrefix nor InstancesFrom is given, the type instance will be empty.
The "tail plugin" follows logfiles, just like tail(1) does, parses each line and dispatches found values. What is matched can be configured by the user using (extended) regular expressions, as described in regex(7).
<Plugin "tail"> <File "/var/log/exim4/mainlog"> Plugin "mail" Instance "exim" Interval 60 <Match> Regex "S=([1-9][0-9]*)" DSType "CounterAdd" Type "ipt_bytes" Instance "total" </Match> <Match> Regex "\\<R=local_user\\>" ExcludeRegex "\\<R=local_user\\>.*mail_spool defer" DSType "CounterInc" Type "counter" Instance "local_user" </Match> <Match> Regex "l=([0-9]*\\.[0-9]*)" <DSType "Distribution"> Percentile 99 Bucket 0 100 #BucketType "bucket" </DSType> Type "latency" Instance "foo" </Match> </File> </Plugin>
The config consists of one or more File blocks, each of which configures one logfile to parse. Within each File block, there are one or more Match blocks, which configure a regular expression to search for.
The Plugin and Instance options in the File block may be used to set the plugin name and instance respectively. So in the above example the plugin name "mail-exim" would be used.
These options are applied for all Match blocks that follow it, until the next Plugin or Instance option. This way you can extract several plugin instances from one logfile, handy when parsing syslog and the like.
The Interval option allows you to define the length of time between reads. If this is not set, the default Interval will be used.
Each Match block has the following options to describe how the match should be performed:
Regex "SPAM \\(Score: (-?[0-9]+\\.[0-9]+)\\)"
ExcludeRegex "127\\.0\\.0\\.1"
This option must be used together with the Percentile and/or Bucket options.
Synopsis:
<DSType "Distribution"> Percentile 99 Bucket 0 100 BucketType "bucket" </DSType>
Metrics are reported with the type Type (the value of the above option) and the type instance "[<Instance>-]<Percent>".
This option may be repeated to calculate more than one percentile.
To export the entire (0-inf) range without overlap, use the upper bound of the previous range as the lower bound of the following range. In other words, use the following schema:
  Bucket 0 1
  Bucket 1 2
  Bucket 2 5
  Bucket 5 10
  Bucket 10 20
  Bucket 20 50
  Bucket 50 0
Metrics are reported with the type set by BucketType option ("bucket" by default) and the type instance "<Type>[-<Instance>]-<lower_bound>_<upper_bound>".
This option may be repeated to calculate more than one rate.
The Gauge* and Distribution types interpret the submatch as a floating point number, using strtod(3). The Counter* and AbsoluteSet types interpret the submatch as an unsigned integer using strtoull(3). The Derive* types interpret the submatch as a signed integer using strtoll(3). CounterInc and DeriveInc do not use the submatch at all and it may be omitted in this case.
The tail_csv plugin reads files in the CSV format, e.g. the statistics file written by Snort.
Synopsis:
<Plugin "tail_csv"> <Metric "snort-dropped"> Type "percent" Instance "dropped" Index 1 </Metric> <File "/var/log/snort/snort.stats"> Plugin "snortstats" Instance "eth0" Interval 600 Collect "snort-dropped" </File> </Plugin>
The configuration consists of one or more Metric blocks that define an index into the line of the CSV file and how this value is mapped to collectd's internal representation. These are followed by one or more File blocks which configure which file to read, in which interval and which metrics to extract.
The "teamspeak2 plugin" connects to the query port of a teamspeak2 server and polls interesting global and virtual server data. The plugin can query only one physical server but unlimited virtual servers. You can use the following options to configure it:
Server "8767"
This option, although numeric, needs to be a string, i. e. you must use quotes around it! If no such statement is given only global information will be collected.
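A complete configuration querying one virtual server might look like this (a sketch; Host and Port are assumptions for the query-port connection options, Server is the option just described):

  LoadPlugin teamspeak2
  <Plugin teamspeak2>
    Host "127.0.0.1"
    Port "51234"
    Server "8767"
  </Plugin>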
The TED plugin connects to a device of "The Energy Detective", a device to measure power consumption. These devices are usually connected to a serial (RS232) or USB port. The plugin opens a configured device and tries to read the current energy readings. For more information on TED, visit <http://www.theenergydetective.com/>.
Available configuration options:
Default: /dev/ttyUSB0
Default: 0
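Spelled out, the defaults above correspond to (a sketch; Device is the serial device option, Retries is an assumption for the option behind the second default):

  LoadPlugin ted
  <Plugin ted>
    Device "/dev/ttyUSB0"
    Retries 0
  </Plugin>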
The "tcpconns plugin" counts the number of currently established TCP connections based on the local port and/or the remote port. Since there may be a lot of connections the default if to count all connections with a local port, for which a listening socket is opened. You can use the following options to fine-tune the ports you are interested in:
See /"IGNORELISTS" for details.
The Threshold plugin checks values collected or received by collectd against a configurable threshold and issues notifications if values are out of bounds.
Documentation for this plugin is available in the collectd-threshold(5) manual page.
The TokyoTyrant plugin connects to a TokyoTyrant server and collects a couple of metrics: the number of records and the database size on disk.
The Turbostat plugin reads CPU frequency and C-state residency on modern Intel processors by using Model Specific Registers.
Currently supported C-states (by this plugin): 3, 6, 7
Example:
All states (3, 6 and 7): (1<<3) + (1<<6) + (1<<7) = 200
Currently supported C-states (by this plugin): 2, 3, 6, 7, 8, 9, 10
Example:
States 2, 3, 6 and 7: (1<<2) + (1<<3) + (1<<6) + (1<<7) = 204
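Expressed as configuration, the two bit masks computed above might be used like this (a sketch; CoreCstates and PackageCstates are assumptions for the option names carrying these masks):

  LoadPlugin turbostat
  <Plugin turbostat>
    CoreCstates 200
    PackageCstates 204
  </Plugin>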
This plugin, if loaded, causes the Hostname to be taken from the machine's UUID. The UUID is a universally unique designation for the machine, usually taken from the machine's BIOS. This is most useful if the machine is running in a virtual environment such as Xen, in which case the UUID is preserved across shutdowns and migration.
The following methods are used to find the machine's UUID, in order:
If no UUID can be found then the hostname is not modified.
The varnish plugin collects information about Varnish, an HTTP accelerator. It collects a subset of the values displayed by varnishstat(1), and organizes them in categories which can be enabled or disabled. Currently only metrics shown in varnishstat(1)'s MAIN section are collected. The exact meaning of each metric can be found in varnish-counters(7).
Synopsis:
<Plugin "varnish"> <Instance "example"> CollectBackend true CollectBan false CollectCache true CollectConnections true CollectDirectorDNS false CollectESI false CollectFetch false CollectHCB false CollectObjects false CollectPurge false CollectSession false CollectSHM true CollectSMA false CollectSMS false CollectSM false CollectStruct false CollectTotals false CollectUptime false CollectVCL false CollectVSM false CollectWorkers false CollectLock false CollectMempool false CollectManagement false CollectSMF false CollectVBE false CollectMSE false </Instance> </Plugin>
The configuration consists of one or more <Instance Name> blocks. Name is the parameter passed to "varnishd -n". If left empty, it will collect statistics from the default "varnishd" instance (this should work fine in most cases).
Inside each <Instance> block, the following options are recognized:
This plugin allows CPU, disk, network load and other metrics to be collected for virtualized guests on the machine. The statistics are collected through the libvirt API (<http://libvirt.org/>). The majority of metrics can be gathered without installing any additional software on the guests; in particular, collectd itself runs only on the host system.
Only Connection is required.
Connection "xen:///"
Details on which URIs are allowed are given at <http://libvirt.org/uri.html>.
Refreshing the devices in particular is quite a costly operation, so if your virtualization setup is static you might consider increasing this. If this option is set to 0, refreshing is disabled completely.
If IgnoreSelected is not given or false then only the listed domains and disk/network devices are collected.
If IgnoreSelected is true then the test is reversed and the listed domains and disk/network devices are ignored, while the rest are collected.
The domain name and device names may use a regular expression, if the name is surrounded by /.../ and collectd was compiled with support for regexps.
The default is to collect statistics for all domains and all their devices.
Example:
BlockDevice "/:hdb/" IgnoreSelected "true"
This will ignore all hdb devices on any domain; other block devices (e.g. hda) will still be collected.
If BlockDeviceFormat is set to source, then metrics will be reported using the path of the source, e.g. an image file. This corresponds to the "<source>" node in the XML definition of the domain.
Example:
If the domain XML has the following device defined:
<disk type='block' device='disk'> <driver name='qemu' type='raw' cache='none' io='native' discard='unmap'/> <source dev='/var/lib/libvirt/images/image1.qcow2'/> <target dev='sda' bus='scsi'/> <boot order='2'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk>
Setting "BlockDeviceFormat target" will cause the type instance to be set to "sda". Setting "BlockDeviceFormat source" will cause the type instance to be set to "var_lib_libvirt_images_image1.qcow2".
Example:
Assume the device path (source tag) is "/var/lib/libvirt/images/image1.qcow2". Setting "BlockDeviceFormatBasename false" will cause the type instance to be set to "var_lib_libvirt_images_image1.qcow2". Setting "BlockDeviceFormatBasename true" will cause the type instance to be set to "image1.qcow2".
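A sketch combining the two options described above (values are illustrative):

  <Plugin virt>
    BlockDeviceFormat source
    BlockDeviceFormatBasename true
  </Plugin>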
uuid means use the guest's UUID. This is useful if you want to track the same guest across migrations.
hostname means to use the global Hostname setting, which is probably not useful on its own because all guests will appear to have the same name.
You can also specify combinations of these fields. For example name uuid means to concatenate the guest name and UUID (with a literal colon character between, thus "foo:1234-1234-1234-1234").
At the time of writing (collectd-5.5), the hostname string is limited to 62 characters. If the combination of fields exceeds 62 characters, the hostname will be truncated without a warning.
address means use the interface's MAC address. This is useful since the interface path might change between reboots of a guest or across migrations.
name means use the guest's name as provided by the hypervisor. uuid means use the guest's UUID.
You can also specify combinations of the name and uuid fields. For example name uuid means to concatenate the guest name and UUID (with a literal colon character between, thus "foo:1234-1234-1234-1234").
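As a sketch, combining these naming options might look like this (assuming the HostnameFormat and InterfaceFormat options described above; values are illustrative):

  <Plugin virt>
    HostnameFormat name uuid
    InterfaceFormat address
  </Plugin>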
Currently supported selectors are:
The "vmem" plugin collects information about the usage of virtual memory. Since the statistics provided by the Linux kernel are very detailed, they are collected very detailed. However, to get all the details, you have to switch them on manually. Most people just want an overview over, such as the number of pages read from swap space.
This plugin doesn't have any options. VServer support is only available for Linux. It cannot yet be found in a vanilla kernel, though. To make use of this plugin you need a kernel that has VServer support built in, i. e. you need to apply the patches and compile your own kernel, which will then provide the /proc/virtual filesystem that is required by this plugin.
The VServer homepage can be found at <http://linux-vserver.org/>.
Note: The traffic collected by this plugin accounts for the amount of traffic passing a socket which might be a lot less than the actual on-wire traffic (e. g. due to headers and retransmission). If you want to collect on-wire traffic you could, for example, use the logging facilities of iptables to feed data for the guest IPs into the iptables plugin.
The "write_graphite" plugin writes data to Graphite, an open-source metrics storage and graphing project. The plugin connects to Carbon, the data layer of Graphite, via TCP or UDP and sends data via the "line based" protocol (per default using port 2003). The data will be sent in blocks of at most 1428 bytes to minimize the number of network packets.
Synopsis:
<Plugin write_graphite> <Node "example"> Host "localhost" Port "2003" Protocol "tcp" LogSendErrors true Prefix "collectd" </Node> </Plugin>
The configuration consists of one or more <Node Name> blocks. Inside the Node blocks, the following options are recognized:
The "write_log" plugin writes metrics as INFO log messages.
This plugin supports two output formats: Graphite and JSON.
Synopsis:
<Plugin write_log> Format Graphite </Plugin>
The "write_tsdb" plugin writes data to OpenTSDB, a scalable open-source time series database. The plugin connects to a TSD, a masterless, no shared state daemon that ingests metrics and stores them in HBase. The plugin uses TCP over the "line based" protocol with a default port 4242. The data will be sent in blocks of at most 1428 bytes to minimize the number of network packets.
Synopsis:
<Plugin write_tsdb> ResolveInterval 60 ResolveJitter 60 <Node "example"> Host "tsd-1.my.domain" Port "4242" HostTags "status=production" </Node> </Plugin>
The configuration consists of one or more <Node Name> blocks and global directives.
Global directives are:
You can also define a jitter, a random interval to wait in addition to ResolveInterval. This prevents all your collectd servers from resolving the hostname at the same time when the connection fails. Defaults to the Interval of the write_tsdb plugin, e.g. 10 seconds.
Note: If the DNS resolution has already been successful when the socket closes, the plugin will try to reconnect immediately with the cached information. DNS is queried only when the socket has been closed for longer than ResolveInterval + ResolveJitter seconds.
Inside the Node blocks, the following options are recognized:
The write_mongodb plugin will send values to MongoDB, a schema-less NoSQL database.
Synopsis:
<Plugin "write_mongodb"> <Node "default"> Host "localhost" Port "27017" Timeout 1000 StoreRates true </Node> </Plugin>
The plugin can send values to multiple instances of MongoDB by specifying one Node block for each instance. Within the Node blocks, the following options are available:
The write_prometheus plugin implements a tiny webserver that can be scraped using Prometheus.
Options:
Background:
Prometheus has a global setting, "StalenessDelta", which controls after which time a metric without updates is considered "stale". This setting effectively puts an upper limit on the interval in which metrics are reported.
When the write_prometheus plugin encounters a metric with an interval exceeding this limit, it will inform you, the user, and provide the metric to Prometheus without a timestamp. That causes Prometheus to consider the metric "fresh" each time it is scraped, with the time of the scrape being considered the time of the update. The result is that more data points appear in Prometheus than were actually created, but at least the metric doesn't disappear periodically.
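A minimal configuration sketch (the Port option and the default port 9103 are assumptions here, not stated in this section):

  LoadPlugin write_prometheus
  <Plugin write_prometheus>
    Port "9103"
  </Plugin>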
This output plugin submits values to an HTTP server using POST requests and encoding metrics with JSON or using the "PUTVAL" command described in collectd-unixsock(5).
Synopsis:
<Plugin "write_http"> <Node "example"> URL "http://example.com/post-collectd" User "collectd" Password "weCh3ik0" Format JSON </Node> </Plugin>
The plugin can send values to multiple HTTP servers by specifying one <Node Name> block for each server. Within each Node block, the following options are available:
Header "X-Custom-Header: custom_value"
Defaults to Command.
Consider the two given strings to be the key and value of an additional tag for each metric being sent out.
You can add multiple Attribute options.
Sets the Cassandra TTL for the data points. Please refer to <http://kairosdb.github.io/docs/build/html/restapi/AddDataPoints.html?highlight=ttl> for details.
Sets the metrics prefix string. Defaults to collectd.
The "write_http" plugin regularly submits the collected values to the HTTP server. How frequently this happens depends on how much data you are collecting and the size of BufferSize. The optimal value to set Timeout to is slightly below this interval, which you can estimate by monitoring the network traffic between collectd and the HTTP server.
The write_kafka plugin will send values to a Kafka topic, a distributed queue.
Synopsis:
<Plugin "write_kafka"> Property "metadata.broker.list" "broker1:9092,broker2:9092" <Topic "collectd"> Format JSON </Topic> </Plugin>
The following options are understood by the write_kafka plugin:
If set to JSON, the values are encoded in the JavaScript Object Notation, an easy and straightforward exchange format.
If set to Graphite, values are encoded in the Graphite format, which is "<metric> <value> <timestamp>\n".
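For illustration, a single value encoded in this format might look like the following line (hostname, value and timestamp are hypothetical):

  collectd.example_org.cpu-0.cpu-idle 98.3 1426585808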
Please note that currently this option is only used if the Format option has been set to JSON.
This will be reflected in the "ds_type" tag: If StoreRates is enabled, converted values will have "rate" appended to the data source type, e.g. "ds_type:derive:rate".
The write_redis plugin submits values to Redis, a data structure server.
Synopsis:
<Plugin "write_redis"> <Node "example"> Host "localhost" Port "6379" Timeout 1000 Prefix "collectd/" Database 1 MaxSetSize -1 MaxSetDuration -1 StoreRates true </Node> </Plugin>
Values are submitted to Sorted Sets, using the metric name as the key, and the timestamp as the score. Retrieving a date range can then be done using the "ZRANGEBYSCORE" Redis command. Additionally, all the identifiers of these Sorted Sets are kept in a Set called "collectd/values" (or "${prefix}/values" if the Prefix option was specified) and can be retrieved using the "SMEMBERS" Redis command. You can specify the database to use with the Database parameter (default is 0). See <http://redis.io/commands#sorted_set> and <http://redis.io/commands#set> for details.
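For example, assuming the default Prefix and a hypothetical metric identifier "collectd/example.org/cpu-0/cpu-idle", the stored data could be inspected with the Redis commands named above (the score range is an illustrative pair of epoch timestamps):

  SMEMBERS "collectd/values"
  ZRANGEBYSCORE "collectd/example.org/cpu-0/cpu-idle" 1426585000 1426586000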
The information shown in the synopsis above is the default configuration which is used by the plugin if no configuration is present.
The plugin can send values to multiple instances of Redis by specifying one Node block for each instance. Within the Node blocks, the following options are available:
The write_riemann plugin will send values to Riemann, a powerful stream aggregation and monitoring system. The plugin sends Protobuf encoded data to Riemann using UDP packets.
Synopsis:
<Plugin "write_riemann"> <Node "example"> Host "localhost" Port "5555" Protocol UDP StoreRates true AlwaysAppendDS false TTLFactor 2.0 </Node> Tag "foobar" Attribute "foo" "bar" </Plugin>
The following options are understood by the write_riemann plugin:
Notifications are not batched; they are sent as soon as possible.
When enabled, it can occur that events get processed by the Riemann server close to or after their expiration time. Tune the TTLFactor and BatchMaxSize settings according to the amount of values collected, if this is an issue.
Defaults to true.
This will be reflected in the "ds_type" tag: If StoreRates is enabled, converted values will have "rate" appended to the data source type, e.g. "ds_type:derive:rate".
The write_sensu plugin will send values to Sensu, a powerful stream aggregation and monitoring system. The plugin sends JSON encoded data to a local Sensu client using a TCP socket.
At the moment, the write_sensu plugin does not send over a collectd_host parameter so it is not possible to use one collectd instance as a gateway for others. Each collectd host must pair with one Sensu client.
Synopsis:
<Plugin "write_sensu"> <Node "example"> Host "localhost" Port "3030" StoreRates true AlwaysAppendDS false MetricHandler "influx" MetricHandler "default" NotificationHandler "flapjack" NotificationHandler "howling_monkey" Notifications true </Node> Tag "foobar" Attribute "foo" "bar" </Plugin>
The following options are understood by the write_sensu plugin:
This will be reflected in the "collectd_data_source_type" tag: If StoreRates is enabled, converted values will have "rate" appended to the data source type, e.g. "collectd_data_source_type:derive:rate".
This plugin collects hardware CPU load metrics for a machine running the Xen hypervisor. Load is calculated from the 'idle time' value provided by Xen. The result is reported using the "percent" type, for each CPU (core).
This plugin doesn't have any options (yet).
The zookeeper plugin will collect statistics from a Zookeeper server using the mntr command. It requires Zookeeper 3.4.0+ and access to the client port.
Synopsis:
<Plugin "zookeeper"> Host "127.0.0.1" Port "2181" </Plugin>
Starting with version 4.3.0 collectd has support for monitoring. By that we mean that the values are not only stored or sent somewhere, but that they are judged and, if a problem is recognized, acted upon. The only action collectd takes itself is to generate and dispatch a "notification". Plugins can register to receive notifications and perform appropriate further actions.
Since systems and what you expect them to do differ a lot, you can configure thresholds for your values freely. This gives you a lot of flexibility but also a lot of responsibility.
Every time a value is out of range a notification is dispatched. This means that the idle percentage of your CPU needs to be less than the configured threshold only once for a notification to be generated. There's no such thing as a moving average or similar - at least not now.
Also, all values that match a threshold are considered to be relevant or "interesting". As a consequence collectd will issue a notification if they are not received for Timeout iterations. The Timeout configuration option is explained in section "GLOBAL OPTIONS". If, for example, Timeout is set to "2" (the default) and a host sends its CPU statistics to the server every 60 seconds, a notification will be dispatched after about 120 seconds. It may take a little longer because the timeout is checked only once per Interval on the server.
When a value comes within range again or is received after it was missing, an "OKAY-notification" is dispatched.
Here is a configuration example to get you started. Read below for more information.
<Plugin threshold> <Type "foo"> WarningMin 0.00 WarningMax 1000.00 FailureMin 0.00 FailureMax 1200.00 Invert false Instance "bar" </Type> <Plugin "interface"> Instance "eth0" <Type "if_octets"> FailureMax 10000000 DataSource "rx" </Type> </Plugin> <Host "hostname"> <Type "cpu"> Instance "idle" FailureMin 10 </Type> <Plugin "memory"> <Type "memory"> Instance "cached" WarningMin 100000000 </Type> </Plugin> </Host> </Plugin>
There are basically two types of configuration statements: The "Host", "Plugin", and "Type" blocks select the value for which a threshold should be configured. The "Plugin" and "Type" blocks may be specified further using the "Instance" option. You can combine the blocks by nesting them, though they must be nested in the above order, i. e. "Host" may contain both "Plugin" and "Type" blocks, "Plugin" may only contain "Type" blocks and "Type" may not contain other blocks. If multiple blocks apply to the same value the most specific block is used.
The other statements specify the threshold to configure. They must be included in a "Type" block. Currently the following statements are recognized:
Normally, all data sources are checked against a configured threshold. If this is undesirable, or if you want to specify different limits for each data source, you can use the DataSource option to have a threshold apply only to one data source.
This applies to missing values, too: If set to true a notification about a missing value is generated once every Interval seconds. If set to false only one such notification is generated until the value appears again.
This is useful when short bursts are not a problem. If, for example, 100% CPU usage for up to a minute is normal (and data is collected every 10 seconds), you could set Hits to 6 to account for this.
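A sketch of such a threshold, using the CPU example above (the instance and limit values are illustrative):

  <Plugin threshold>
    <Type "cpu">
      Instance "user"
      WarningMax 90
      Hits 6
    </Type>
  </Plugin>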
If, for example, the threshold is configured as
WarningMax 100.0
Hysteresis 1.0
then a Warning notification is created when the value exceeds 101 and the corresponding Okay notification is only created once the value falls below 99, thus avoiding the "flapping".
Starting with collectd 4.6 there is a powerful filtering infrastructure implemented in the daemon. The concept has mostly been copied from ip_tables, the packet filter infrastructure for Linux. We'll use a similar terminology, so that users that are familiar with iptables feel right at home.
The following are the terms used in the remainder of the filter configuration documentation. For an ASCII-art schema of the mechanism, see "General structure" below.
Matches are implemented in plugins which you have to load prior to using the match. The name of such plugins starts with the "match_" prefix.
Some of these targets are built into the daemon, see "Built-in targets" below. Other targets are implemented in plugins which you have to load prior to using the target. The name of such plugins starts with the "target_" prefix.
The following shows the resulting structure:
 +---------+
 ! Chain   !
 +---------+
      !
      V
 +---------+  +---------+  +---------+  +---------+
 ! Rule    !->! Match   !->! Match   !->! Target  !
 +---------+  +---------+  +---------+  +---------+
      !
      V
 +---------+  +---------+  +---------+
 ! Rule    !->! Target  !->! Target  !
 +---------+  +---------+  +---------+
      !
      V
      :
      :
      !
      V
 +---------+  +---------+  +---------+
 ! Rule    !->! Match   !->! Target  !
 +---------+  +---------+  +---------+
      !
      V
 +---------+
 ! Default !
 ! Target  !
 +---------+
There are four ways to control which way a value takes through the filter mechanism:
The configuration reflects this structure directly:
PostCacheChain "PostCache" <Chain "PostCache"> <Rule "ignore_mysql_show"> <Match "regex"> Plugin "^mysql$" Type "^mysql_command$" TypeInstance "^show_" </Match> <Target "stop"> </Target> </Rule> <Target "write"> Plugin "rrdtool" </Target> </Chain>
The above configuration example will ignore all values where the plugin field is "mysql", the type is "mysql_command" and the type instance begins with "show_". All other values will be sent to the "rrdtool" write plugin via the default target of the chain. Since this chain is run after the value has been added to the cache, the MySQL "show_*" command statistics will be available via the "unixsock" plugin.
To understand the implications, it's important you know what is going on inside collectd. The following diagram shows how values are passed from the read-plugins to the write-plugins:
 +---------------+
 !  Read-Plugin  !
 +-------+-------+
         !
 + - - - - V - - - - +
 : +---------------+ :
 : !   Pre-Cache   ! :
 : !     Chain     ! :
 : +-------+-------+ :
 :         !         :
 :         V         :
 : +-------+-------+ :  +---------------+
 : !     Cache     !--->!  Value Cache  !
 : !     insert    ! :  +---+---+-------+
 : +-------+-------+ :      !   !
 :         !   ,------------'   !
 :         V   V     :          V
 : +-------+---+---+ :  +-------+-------+
 : !  Post-Cache   +--->! Write-Plugins !
 : !     Chain     ! :  +---------------+
 : +---------------+ :
 :                   :
 :  dispatch values  :
 + - - - - - - - - - +
After the values are passed from the "read" plugins to the dispatch functions, the pre-cache chain is run first. The values are added to the internal cache afterwards. The post-cache chain is run after the values have been added to the cache. So why is it such a huge deal if chains are run before or after the values have been added to this cache?
Targets that change the identifier of a value list should be executed before the values are added to the cache, so that the name in the cache matches the name that is used in the "write" plugins. The "unixsock" plugin, too, uses this cache to receive a list of all available values. If you change the identifier after the value list has been added to the cache, this may easily lead to confusion, but it's not forbidden of course.
The cache is also used to convert counter values to rates. These rates are, for example, used by the "value" match (see below). If you use the rate stored in the cache before the new value is added, you will use the old, previous rate. Write plugins may use this rate, too, see the "csv" plugin, for example. The "unixsock" plugin uses these rates too, to implement the "GETVAL" command.
Last but not least, the stop target makes a difference: If the pre-cache chain returns the stop condition, the value will not be added to the cache and the post-cache chain will not be run.
Within the Chain block, there can be Rule blocks and Target blocks.
Within the Rule block, there may be any number of Match blocks and there must be at least one Target block.
The arguments inside the Match block are passed to the plugin implementing the match, so which arguments are valid here depends on the plugin being used. If you do not need to pass any arguments to a match, you can use the shorter syntax:
Match "foobar"
Which is equivalent to:
<Match "foobar"> </Match>
The arguments inside the Target block are passed to the plugin implementing the target, so which arguments are valid here depends on the plugin being used. If you do not need to pass any arguments to a target, you can use the shorter syntax:
Target "stop"
This is the same as writing:
<Target "stop"> </Target>
The following targets are built into the core daemon and therefore need no plugins to be loaded:
This target does not have any options.
Example:
Target "return"
This target does not have any options.
Example:
Target "stop"
Available options:
If no plugin is explicitly specified, the values will be sent to all available write plugins.
Single-instance plugin example:
<Target "write"> Plugin "rrdtool" </Target>
Multi-instance plugin example:
<Plugin "write_graphite"> <Node "foo"> ... </Node> <Node "bar"> ... </Node> </Plugin> ... <Target "write"> Plugin "write_graphite/foo" </Target>
Available options:
Example:
<Target "jump"> Chain "foobar" </Target>
Available options:
Example:
<Match "regex"> Host "customer[0-9]+" Plugin "^foobar$" </Match>
This match is mainly intended for servers that receive values over the "network" plugin and write them to disk using the "rrdtool" plugin. RRDtool is very sensitive to the timestamp used when updating the RRD files. In particular, the time must be ever increasing. If a misbehaving client sends one packet with a timestamp far in the future, all further packets with a correct time will be ignored because of that one packet. What's worse, such corrupted RRD files are hard to fix.
This match lets one match all values outside a specified time range (relative to the server's time), so you can use the stop target (see below) to ignore the value, for example.
Available options:
Example:
<Match "timediff"> Future 300 Past 3600 </Match>
This example matches all values that are five minutes or more ahead of the server or one hour (or more) lagging behind.
Available options:
Usually All is used for positive matches, Any is used for negative matches. This means that with All you usually check that all values are in a "good" range, while with Any you check if any value is within a "bad" range (or outside the "good" range).
Either Min or Max, but not both, may be unset.
Example:
# Match all values smaller than or equal to 100. Matches only if all data
# sources are below 100.
<Match "value">
  Max 100
  Satisfy "All"
</Match>

# Match if the value of any data source is outside the range of 0 - 100.
<Match "value">
  Min 0
  Max 100
  Invert true
  Satisfy "Any"
</Match>
Please keep in mind that ignoring such counters can result in confusing behavior: Counters which hardly ever increase will be zero for long periods of time. If the counter is reset for some reason (machine or service restarted, usually), the graph will be empty (NAN) for a long time. People may not understand why.
The hashing function used tries to distribute the hosts evenly. First, it calculates a 32 bit hash value using the characters of the hostname:
hash_value = 0;
for (i = 0; host[i] != 0; i++)
  hash_value = (hash_value * 251) + host[i];
The constant 251 is a prime number which is supposed to make this hash value more random. The code then checks the group for this host according to the Total and Match arguments:
if ((hash_value % Total) == Match)
  matches;
else
  does not match;
Please note that when you set Total to two (i. e. you have only two groups), then the least significant bit of the hash value will be the XOR of all least significant bits in the host name. One consequence is that when you have two hosts, "server0.example.com" and "server1.example.com", where the host name differs in one digit only and the digits differ by one, those hosts will never end up in the same group.
Available options:
You can repeat this option to match multiple groups, for example:
Match 3 7
Match 5 7
The above config will divide the data into seven groups and match groups three and five. One use would be to keep every value on two hosts so that if one fails the missing data can later be reconstructed from the second host.
Example:
# Operate on the pre-cache chain, so that ignored values are not even in the
# global cache.
<Chain "PreCache">
  <Rule>
    <Match "hashed">
      # Divide all received hosts in seven groups and accept all hosts in
      # group three.
      Match 3 7
    </Match>
    # If matched: Return and continue.
    Target "return"
  </Rule>
  # If not matched: Return and stop.
  Target "stop"
</Chain>
Available options:
Please note that these placeholders are case sensitive!
Example:
<Target "notification"> Message "Oops, the %{type_instance} temperature is currently %{ds:value}!" Severity "WARNING" </Target>
Available options:
You can specify each option multiple times to use multiple regular expressions one after another.
Example:
<Target "replace"> # Replace "example.net" with "example.com" Host "\\<example.net\\>" "example.com" # Strip "www." from hostnames Host "\\<www\\." "" </Target>
Available options:
The following placeholders will be replaced by an appropriate value:
Please note that these placeholders are case sensitive!
Example:
<Target "set"> PluginInstance "coretemp" TypeInstance "core3" </Target>
If you use collectd with an old configuration, i. e. one without a Chain block, it will behave as it used to. This is equivalent to the following configuration:
<Chain "PostCache"> Target "write" </Chain>
If you specify a PostCacheChain, the write target will not be added anywhere and you will have to make sure that it is called where appropriate. We suggest adding the above snippet as the default target to your "PostCache" chain.
Ignore all values where the hostname does not contain a dot, i. e. cannot be an FQDN:
<Chain "PreCache"> <Rule "no_fqdn"> <Match "regex"> Host "^[^\.]*$" </Match> Target "stop" </Rule> Target "write" </Chain>
Ignorelists are a generic framework to either ignore some metrics or to report specific metrics only. Plugins usually provide one or more options to specify the items (mount points, devices, ...) and the boolean option "IgnoreSelected".
By default, this option does a case-sensitive, full-string match. The following config will match "foo", but not "Foo":
Select "foo"
If String starts and ends with "/" (a slash), the string is compiled as a regular expression. For example, to match all items starting with "foo", you could use the following syntax:
Select "/^foo/"
The regular expression is not anchored, i.e. the following config will match "foobar", "barfoo" and "AfooZ":
Select "/foo/"
The Select option may be repeated to select multiple items.
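Putting these pieces together, a selection that reports only items starting with "foo" plus the literal item "bar" might look like this (a sketch; the exact option names an individual plugin uses may differ):

  Select "/^foo/"
  Select "bar"
  IgnoreSelected false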
collectd(1), collectd-exec(5), collectd-perl(5), collectd-unixsock(5), types.db(5), hddtemp(8), iptables(8), kstat(3KSTAT), mbmon(1), psql(1), regex(7), rrdtool(1), sensors(1)
Florian Forster <octo@collectd.org>
2019-04-06 | 5.8.1.git |