LTTNG_HEALTH_CHECK(3) | LTTng Developer Manual | LTTNG_HEALTH_CHECK(3) |
lttng_health_check - Monitor health of the session daemon
#include <lttng/lttng.h>

int lttng_health_check(enum lttng_health_component c);
Link with -llttng-ctl.
lttng_health_check() checks the health of the session daemon, either for a specific component c or for all of them. Each component represents a subsystem of the session daemon. Health counters are placed throughout these components and are atomically incremented when execution reaches them. An even value indicates progress in the execution of the component. An odd value means that the code has entered a blocking state which is not a poll(7) wait period.
Health is considered bad when a fatal error code path has been reached or when any IPC used by the session daemon has been blocked for more than 20 seconds (the default timeout). Bad health is detected when one or more of the counters are odd.
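The even/odd convention described above can be illustrated with a minimal sketch. This is not the session daemon's code: struct health_counter and the health_*() helpers are hypothetical names chosen for illustration, and the 20 second timeout is not modeled. The sketch only shows how a single atomic increment on entering and on leaving a blocking section makes the counter's parity reflect the component's state.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-component counter; not the session daemon's real type. */
struct health_counter {
	atomic_ulong value;
};

/* Entering a blocking section: an even counter becomes odd. */
static void health_enter_blocking(struct health_counter *hc)
{
	atomic_fetch_add(&hc->value, 1);
}

/* Leaving the blocking section: the odd counter becomes even again. */
static void health_exit_blocking(struct health_counter *hc)
{
	atomic_fetch_add(&hc->value, 1);
}

/* A counter observed as odd means the component is in a blocking state. */
static bool health_counter_is_odd(struct health_counter *hc)
{
	return (atomic_load(&hc->value) & 1UL) != 0;
}

/* Returns true if one or more of the n counters is odd. */
static bool health_any_bad(struct health_counter *counters, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (health_counter_is_odd(&counters[i])) {
			return true;
		}
	}
	return false;
}
```

In this sketch, a checker thread would call health_any_bad() over all component counters. A real implementation also has to decide how long a counter may stay odd before the state counts as bad, which is the role of the timeout mentioned above.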
The health check mechanism of the session daemon can only be reached through the health socket, which is separate from the command and application sockets. An isolated thread serves this socket and only computes the health counters across the code when asked by the LTTng control library (using this call). This subsystem is highly unlikely to fail due to its simplicity.
The c argument can be one of the following values:
Returns 0 if the health is OK, or 1 if it is in a bad state. A return code of -1 indicates that the control library was not able to connect to the session daemon health socket.
For LTTNG_HEALTH_CONSUMER, you cannot know which consumer daemon has failed, only that either the consumer subsystem has failed or that an lttng-consumerd died.
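The three return codes can be handled as in the following sketch. The enum health_status type and the classify_health() and health_status_str() helpers are hypothetical names introduced here for illustration; they are not part of the LTTng control library. The helpers take a plain int so that the block compiles without the library.

```c
#include <stdio.h>

/* Hypothetical classification of the documented return codes. */
enum health_status {
	HEALTH_OK,          /* 0: health is OK */
	HEALTH_BAD,         /* 1: bad state */
	HEALTH_UNREACHABLE, /* -1: could not connect to the health socket */
	HEALTH_UNKNOWN      /* any value not documented in this page */
};

/* Map the return value of lttng_health_check() to a status. */
static enum health_status classify_health(int ret)
{
	switch (ret) {
	case 0:
		return HEALTH_OK;
	case 1:
		return HEALTH_BAD;
	case -1:
		return HEALTH_UNREACHABLE;
	default:
		return HEALTH_UNKNOWN;
	}
}

/* Human-readable description of a status, suitable for logging. */
static const char *health_status_str(enum health_status status)
{
	switch (status) {
	case HEALTH_OK:
		return "session daemon is healthy";
	case HEALTH_BAD:
		return "session daemon is in a bad state";
	case HEALTH_UNREACHABLE:
		return "cannot connect to the health socket";
	default:
		return "unexpected return value";
	}
}

/* Print the outcome of a health check on standard error. */
static void report_health(int ret)
{
	fprintf(stderr, "health check: %s\n",
		health_status_str(classify_health(ret)));
}
```

In a program that includes <lttng/lttng.h> and links with -llttng-ctl, the call would look like report_health(lttng_health_check(LTTNG_HEALTH_CONSUMER)). Treating -1 separately from 1 matters because a failed connection says nothing about the state of the components themselves.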
lttng-health-check was originally written by David Goulet and is currently maintained by Jérémie Galarneau <jeremie.galarneau@efficios.com>.
2012-09-19 | LTTng |