gres.conf(5)                    Slurm Configuration File                    gres.conf(5)
NAME
gres.conf - Slurm configuration file for generic resource management
DESCRIPTION
gres.conf is an ASCII file which describes the configuration of generic resources on each compute node. Each node must contain a gres.conf file if generic resources are to be scheduled by Slurm. The file location can be modified at system build time using the DEFAULT_SLURM_CONF parameter or at execution time by setting the SLURM_CONF environment variable; in either case, gres.conf will always be located in the same directory as the slurm.conf file. If generic resource counts are set by the gres plugin function node_config_load(), this file may be optional.
Parameter names are case insensitive. Any text following a "#" in the configuration file is treated as a comment through the end of that line. Changes to the configuration file take effect upon restart of Slurm daemons, daemon receipt of the SIGHUP signal, or execution of the command "scontrol reconfigure" unless otherwise noted.
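For example, after modifying gres.conf on the affected nodes, the new configuration can be applied without restarting the daemons:

# Re-read the Slurm configuration files on all daemons
scontrol reconfigure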
The overall configuration parameters available include:

Count=<number>
Number of resources of this type available on this node. The default value is set to the number of File values specified (if any), otherwise the default value is one. A suffix of "K", "M" or "G" may be used to multiply the number by 1024, 1048576 or 1073741824 respectively.

Cores=<core_index_list>
Comma-delimited list of the abstract core index numbers of the cores which may use this resource (see the NOTE below regarding core indexing).

File=<path>
Fully qualified pathname of the device files associated with a resource (e.g. /dev/nvidia0).

Name=<name>
Name of the generic resource. The name must match a value in the GresTypes parameter of slurm.conf.

NodeName=<node_range>
Optionally specify the node or nodes to which a line applies, allowing a single gres.conf file to be used for all compute nodes.

Type=<type>
An arbitrary string identifying the type of device (e.g. a specific GPU model).
NOTE: If your cores contain multiple threads, only list the first thread of each core; the scheduling logic allocates GRES per core rather than per thread. Also note that, because Slurm must be able to perform resource management on heterogeneous clusters with various core ID numbering schemes, an abstract index is used instead of the physical core index, and that abstract ID may not correspond to your physical core number. Slurm numbers processing units sequentially starting from 0: ID 0 is the first processing unit (a core, or a thread if hyper-threading is enabled) on the first core of the first socket, and numbering then continues sequentially through each thread, core, and socket. This numbering generally coincides with the processing unit logical number (PU L#) shown in lstopo output.
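As a purely hypothetical illustration, consider a node with two sockets of four cores each and hyper-threading disabled: abstract core IDs 0-3 then cover the first socket and IDs 4-7 the second, so a GPU attached to the first socket could be configured as:

# Hypothetical node: 2 sockets x 4 cores each, hyper-threading disabled
# Abstract core IDs 0-3 map to socket 0, IDs 4-7 to socket 1
Name=gpu Type=tesla File=/dev/nvidia0 COREs=0,1,2,3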
EXAMPLES
##################################################################
# Slurm's Generic Resource (GRES) configuration file
##################################################################
# Configure support for our four GPUs
Name=gpu Type=gtx560 File=/dev/nvidia0 COREs=0,1
Name=gpu Type=gtx560 File=/dev/nvidia1 COREs=0,1
Name=gpu Type=tesla File=/dev/nvidia2 COREs=2,3
Name=gpu Type=tesla File=/dev/nvidia3 COREs=2,3
Name=bandwidth Count=20M
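For these resources to be schedulable, the matching GRES must also be declared in slurm.conf. A minimal sketch follows; the node name tux0 and the Gres counts are illustrative assumptions, not part of this file:

# Corresponding slurm.conf entries (illustrative)
GresTypes=gpu,bandwidth
NodeName=tux0 Gres=gpu:gtx560:2,gpu:tesla:2,bandwidth:20M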
##################################################################
# Slurm's Generic Resource (GRES) configuration file
# Use a single gres.conf file for all compute nodes
##################################################################
NodeName=tux[0-15] Name=gpu File=/dev/nvidia[0-3]
NodeName=tux[16-31] Name=gpu File=/dev/nvidia[0-7]
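The node definitions in slurm.conf would then carry the matching GRES counts. A minimal sketch, with all other node parameters omitted for brevity:

# Corresponding slurm.conf entries (illustrative)
GresTypes=gpu
NodeName=tux[0-15] Gres=gpu:4
NodeName=tux[16-31] Gres=gpu:8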
COPYING
Copyright (C) 2010 The Regents of the University of California.
Produced at Lawrence Livermore National Laboratory (cf, DISCLAIMER).
Copyright (C) 2010-2014 SchedMD LLC.
This file is part of Slurm, a resource management program. For details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
Slurm Configuration File                    July 2018                    gres.conf(5)