HBOOT(1) | LAM TOOLS | HBOOT(1) |
hboot - Start LAM on the local node.
hboot [-dhstvNV] [-c conf] [-I inet_topo] [-R rtr_topo]
Most MPI users will probably not need to use the hboot command; see lamboot(1).
The hboot tool is a generic utility that starts multiple processes on the local node, based on information in a process schema. It is not restricted to starting LAM. It is part of the startup sequence performed by lamboot(1).
A process schema is a description of the processes which constitute the operating system on a given node. Naturally, the process schema used by hboot should be the one that describes LAM on a node. The grammar of the process schema is described in conf(5).
When starting LAM on a remote machine using rsh(1), the open file descriptors of the processes started by hboot must be closed in order for rsh(1) to exit. This is done by using the -s option. The -t option can be used to force a tkill(1) on the machine before attempting to start LAM. This feature is used by lamboot(1) to handle the case where a user starts LAM on a machine a second time without first using lamwipe(1) to terminate the previous LAM session.
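The remote start described above might be sketched as follows. The helper name remote_hboot_cmd and the host name node1.example.com are assumptions made for illustration; only the -s and -t options come from this page. The helper prints the command instead of running it, so the command line can be checked before it is used.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a command that starts LAM on a
# remote host through rsh(1).
#   -t  force a tkill(1) first, in case an earlier LAM session remains
#   -s  close the open file descriptors so that rsh(1) can exit
remote_hboot_cmd() {
    printf 'rsh %s hboot -t -s\n' "$1"
}

# Example host name, an assumption for illustration only.
remote_hboot_cmd node1.example.com
```

In normal use lamboot(1) carries out this step itself, so a helper like this is mainly useful for understanding or debugging the startup sequence.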
The -I and -R options set their respective variables to the given values. The $inet_topo variable is typically used by the LAM Internet datalinks that communicate with other nodes. The $rtr_topo variable is passed to the LAM router that handles network and topology information. The variables can also be set in the process schema file (see conf(5)) but their values are overridden by the command line options.
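As a rough sketch of the override behavior, the helper below assembles an hboot command line that names a process schema with -c and sets both variables with -I and -R. The helper name hboot_topo_cmd, the file name my.conf, and the placeholder values INET_VALUE and RTR_VALUE are all assumptions; the real values depend on the installation and are not specified on this page.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) an hboot command line that selects
# a process schema and sets both topology variables on the command line.
# Values given this way take precedence over any set in the schema file.
# Arguments that contain spaces would need extra quoting.
hboot_topo_cmd() {
    conf=$1        # process schema file, see conf(5)
    inet_topo=$2   # value for the $inet_topo variable
    rtr_topo=$3    # value for the $rtr_topo variable
    printf 'hboot -c %s -I %s -R %s\n' "$conf" "$inet_topo" "$rtr_topo"
}

# Placeholder arguments, assumptions for illustration only.
hboot_topo_cmd my.conf INET_VALUE RTR_VALUE
```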
When LAM is started, the kernel records all processes that attach to it, including all the processes in the process schema. It is the job of tkill(1) to use this information to remove these processes from the node.
Using ps(1) after hboot will display, among others, the LAM processes that have been started. They may be killed one by one with kill(1), or all at once by sending a HUP signal to the LAM kernel process. The preferred method is to use the LAM tool tkill(1), which should kill them all at once and also remove the kill file. New users should make liberal use of ps(1) to gain confidence that the system is working properly. In a disaster, ps(1) and kill(1) are your only hope of recovery.
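The cleanup order described above could be wrapped in a small helper such as the one below. The name lam_cleanup is an assumption, and the sketch assumes that tkill(1) can be run without arguments to clean up the local node. When tkill(1) is not available, the helper only lists the user's processes so that leftovers can be removed by hand with kill(1).

```shell
#!/bin/sh
# Hypothetical helper: prefer tkill(1); otherwise show the current user's
# processes so that stray LAM processes can be identified manually.
lam_cleanup() {
    if command -v tkill >/dev/null 2>&1; then
        # Assumption: tkill with no arguments cleans up the local node.
        tkill
    else
        echo "tkill not found; inspect the list below and use kill(1)" >&2
        ps -u "$(id -un)"
    fi
}
```

Calling lam_cleanup after a failed start is one way to return the node to a known state before running hboot again.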
July, 2007 | LAM 7.1.4 |