SLURM: A Highly Scalable Resource Manager

SLURM is an open-source resource manager designed for Linux clusters of all sizes. It provides three key functions. First, it allocates exclusive and/or non-exclusive access to resources (compute nodes) to users for some duration of time so they can perform work. Second, it provides a framework for starting, executing, and monitoring work (typically a parallel job) on a set of allocated nodes. Finally, it arbitrates contention for resources by managing a queue of pending work.
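The three functions above can be illustrated with a small toy model. This is a minimal sketch under stated assumptions, not SLURM's implementation or API: the names `Job` and `ToyResourceManager` are hypothetical, allocation is exclusive only, and the queue is strict first-in, first-out, whereas a real configuration may use more sophisticated prioritization.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    name: str
    nodes_needed: int
    allocated: list = field(default_factory=list)  # node names held by this job


class ToyResourceManager:
    """Illustrative only: a FIFO toy model, not SLURM's actual scheduler."""

    def __init__(self, node_names):
        self.free_nodes = list(node_names)  # nodes not allocated to any job
        self.pending = deque()              # queue of jobs waiting for nodes
        self.running = {}                   # job name -> Job currently holding nodes

    def submit(self, job):
        """Queue a job, then start whatever fits (function 3: arbitrate contention)."""
        self.pending.append(job)
        self._schedule()

    def finish(self, name):
        """Release a finished job's nodes, then start pending work that now fits."""
        job = self.running.pop(name)
        self.free_nodes.extend(job.allocated)
        job.allocated = []
        self._schedule()

    def _schedule(self):
        # Strict FIFO: stop at the first pending job that does not fit.
        while self.pending and self.pending[0].nodes_needed <= len(self.free_nodes):
            job = self.pending.popleft()
            # Function 1: grant exclusive access to a set of nodes.
            job.allocated = [self.free_nodes.pop(0) for _ in range(job.nodes_needed)]
            # Function 2: track the job as running on its allocated nodes.
            self.running[job.name] = job


rm = ToyResourceManager(["n1", "n2", "n3", "n4"])
rm.submit(Job("a", 3))
rm.submit(Job("b", 2))  # only one node is free, so "b" waits in the queue
print(sorted(rm.running), [j.name for j in rm.pending])
rm.finish("a")          # releasing "a" frees enough nodes for "b" to start
print(sorted(rm.running), [j.name for j in rm.pending])
```

In this sketch, job "b" stays pending until job "a" releases its nodes, which is the queueing behavior described above reduced to its simplest form.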

SLURM's design is very modular, with dozens of optional plugins. In its simplest configuration, it can be installed and configured in a couple of minutes (see Caos NSA and Perceus: All-in-one Cluster Software Stack by Jeffrey B. Layton). More complex configurations rely upon a MySQL database for archiving accounting records, managing resource limits by user or bank account, or supporting sophisticated job prioritization algorithms. SLURM also provides an Application Programming Interface (API) for integration with external schedulers such as The Maui Scheduler or Moab Cluster Suite.

While other resource managers do exist, SLURM is unique in several respects:

SLURM provides resource management on about 1000 computers worldwide, including many of the most powerful computers in the world.

SLURM is actively developed, distributed, and supported by Lawrence Livermore National Laboratory, Hewlett-Packard, and Bull. It is also distributed and supported by Adaptive Computing, Infiscale, IBM, and Sun Microsystems.

Last modified 25 March 2009

Lawrence Livermore National Laboratory
7000 East Avenue • Livermore, CA 94550
Operated by Lawrence Livermore National Security, LLC, for the Department of Energy's
National Nuclear Security Administration