TMPDIR_XXX, where XXX = LUSTRE, LOCAL or SHM
For the purposes of calculations, temporary disk space is provided for storing files that are actively used by user programs. Each compute job is assigned at least one directory for temporary files, with its own unique path stored in the TMPDIR_XXX variable, where XXX depends on the type of temporary disk space (XXX = LUSTRE, LOCAL or SHM). A user has access only to the directories of their own jobs.
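For example, inside a running job the assigned path can be printed directly; which TMPDIR_XXX variables exist depends on the storage types requested, as described below (a minimal sketch):
echo "Job ${SLURM_JOB_ID}: Lustre temporary directory is ${TMPDIR_LUSTRE}"   # e.g. /lustre/tmp/slurm/<job id>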
Current list of partitions and their available TMPDIR
The current list of TMPDIR spaces for each SLURM partition, along with the default types of temporary disk space, can be found in the SLURM partitions tables, in the "Available TMPDIR" column.
WARNING! Automatic removal of the TMPDIR directory upon job finish
TMPDIR directories are automatically removed when the job finishes. The exception is the TMPDIR_LUSTRE=/lustre/tmp/slurm/${SLURM_JOB_ID} directory on the temporary Lustre file system, which remains available to its owner for another 14 days.
WARNING! Backup copies of the temporary directories are not created
It is not possible to recover data from $TMPDIR. The contents of TMPDIR are not protected by WCSS and may be deleted or lost without warning. Users should secure their important results on their own.
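Because of this, it is good practice to copy important results out of the temporary directory before the job ends. A minimal sketch of such a job script, assuming Lustre temporary space on the bem2-cpu partition (my_program and the results path are placeholders, not site defaults):
#!/bin/bash
#SBATCH -p bem2-cpu
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -t 01:00:00
#SBATCH --mem=1G
#SBATCH --gres=storage:lustre
cd "${TMPDIR_LUSTRE}"                                 # work inside the temporary directory
my_program > results.out                              # placeholder for the actual computation
mkdir -p "${HOME}/results/${SLURM_JOB_ID}"
cp results.out "${HOME}/results/${SLURM_JOB_ID}/"     # secure results before TMPDIR is removed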
Type of disk space | Description and purpose | Sharing | GRES option | TMPDIR_XXX path | Data retention after job completion | Quota | Maximum capacity |
---|---|---|---|---|---|---|---|
Lustre | A shared file system simultaneously available on all compute servers in a given partition. It is primarily used for multi-node compute jobs where different nodes operate on the same set of files. | multiple nodes | --gres=storage:lustre | TMPDIR_LUSTRE=/lustre/tmp/slurm/${SLURM_JOB_ID} | 14 days | none | 750TB (partitions bem2-cpu-short and bem2-cpu-normal) |
Local | ZFS file system created on local NVMe disks, accessible within a single compute node. It is primarily used for single-node compute jobs with large disk space requirements (over 50GB) that perform many I/O (input/output) operations. | single node | --gres=storage:local:<QUOTA> | TMPDIR_LOCAL=/mnt/lscratch/slurm/${SLURM_JOB_ID} | no | yes, set by the user via <QUOTA> (e.g. 100M, 1G, 100G, ...) | 7000GB (partitions lem-gpu-short and lem-gpu-normal) |
SHM | Local file space located in the RAM cache (path /dev/shm), available within a single compute node. It is primarily used for single-node compute jobs with low disk space requirements (below 50GB) and many I/O (input/output) operations. | single node | --gres=storage:shm:<QUOTA> | TMPDIR_SHM=/dev/shm/slurm/${SLURM_JOB_ID} | no | yes, set by the user via <QUOTA> (e.g. 100M, 1G, 100G, ...) | equal to the amount of RAM of the given compute node |
Access to TMPDIR_LUSTRE
The /lustre/tmp/slurm/${SLURM_JOB_ID} directories are available only within SLURM jobs and are not directly visible after logging in to ui.wcss.pl. To browse files under TMPDIR, first start an interactive session, for example using the sub-interactive command.
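A possible workflow is sketched below; 123456 stands for the ID of one of your jobs and is only a placeholder:
$ sub-interactive                  # start an interactive session on a compute node
$ ls /lustre/tmp/slurm/123456      # browse the Lustre TMPDIR of your (possibly finished) job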
Using SHM storage and memory allocation
When using SHM storage (option --gres=storage:shm:<QUOTA>), the amount of <QUOTA> space is automatically added to the job's memory requirement (the one declared with the --mem option). For example, if a job requires 5GB of memory for running programs (declared via --mem=5G) and additionally needs 50GB of space in SHM (declared via --gres=storage:shm:50G), then the total RAM requirement for this job will be 5G + 50G = 55G, which is automatically taken into account when the job is queued (a new value --mem=55G will be set).
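A minimal sbatch header illustrating this case (the partition and program name are example placeholders):
#!/bin/bash
#SBATCH -p bem2-cpu
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -t 01:00:00
#SBATCH --mem=5G                   # memory for the programs themselves
#SBATCH --gres=storage:shm:50G     # 50G of SHM space; the effective memory request becomes 55G
cd "${TMPDIR_SHM}"
my_program                         # placeholder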
ATTENTION! Separate Lustre TMP file systems
The Bem2 supercomputer (partitions bem2-cpu-short and bem2-cpu-normal) and the LEM supercomputer (partitions lem-gpu-short and lem-gpu-normal) have their own temporary Lustre file systems: directories under the path /lustre/tmp on the Bem2 supercomputer are not available on the LEM supercomputer and vice versa. When using the Lustre file system, please read the terms of use on the Lustre file system page.
The user declares the TMPDIR type when submitting a SLURM job via the so-called GRES (Generic Resource Scheduling) mechanism, using the --gres=storage:<XXX>:<QUANTITY> option, where <XXX> is the type of temporary disk space and <QUANTITY> declares the maximum allowed occupancy of the TMPDIR directory (the so-called quota), for which the prefixes M (MiB) and G (GiB) can be used. When using a batch file for sbatch, provide the #SBATCH --gres=storage:<XXX>:<QUANTITY> option.
It is possible to specify several types of temporary disk space (depending on their availability on a given SLURM partition), separating them with a comma, e.g. --gres=storage:lustre,storage:local:10G. Additionally, if the user uses other types of GRES, e.g. GPU cards, they should also be specified after a comma, e.g. --gres=storage:lustre,storage:local:10G,gpu:hopper:2.
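For instance, a single-node GPU job that needs 10G of local scratch and two GPUs could be submitted as follows (a sketch; the partition, GPU type and resource values are example assumptions):
$ srun -p lem-gpu -N 1 -c 8 -t 01:00:00 --mem=16G --gres=storage:local:10G,gpu:hopper:2 {...}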
Automatic creation of TMPDIR directories
Only those temporary disk space directories are created that were declared at job submission (or the default directories for a given partition). For example, if the user specifies only --gres=storage:lustre, only the directory under the path stored in the $TMPDIR_LUSTRE variable will be created; the $TMPDIR_SHM and $TMPDIR_LOCAL directories will not be available!
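Job scripts that may run under different storage declarations can therefore check which variable is actually set (a sketch):
# Fall back gracefully depending on which TMPDIR type was requested for this job
if [ -n "${TMPDIR_LOCAL:-}" ]; then
    cd "${TMPDIR_LOCAL}"
elif [ -n "${TMPDIR_LUSTRE:-}" ]; then
    cd "${TMPDIR_LUSTRE}"
else
    cd "${TMPDIR_SHM}"
fi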
If the user does not specify the TMPDIR type when submitting a SLURM job, the default TMPDIR_XXX directory type will be assigned to the job.
Default TMPDIR directories
The default TMPDIR_XXX directory types are assigned depending on the partition and the type of job (single-node or multi-node). The default directory types for each partition are marked in the SLURM partitions tables, in the "Available TMPDIR" column, using the character "*" for single-node jobs and "**" for multi-node jobs.
Examples:
- When submitting a job to the bem2-cpu-short partition without specifying the --gres=storage option, the job will be allocated Lustre temporary storage by default, i.e. TMPDIR_LUSTRE=/lustre/tmp/slurm/${SLURM_JOB_ID}.
- When submitting a single-node job to the lem-gpu-normal partition, the job will be allocated local temporary storage by default, i.e. TMPDIR_LOCAL=/mnt/lscratch/slurm/${SLURM_JOB_ID}.
- When submitting a multi-node job to the lem-gpu-normal partition, temporary Lustre disk space will be allocated by default, i.e. TMPDIR_LUSTRE=/lustre/tmp/slurm/${SLURM_JOB_ID}.
For the duration of the job, the appropriate TMPDIR directories are created and the corresponding TMPDIR_XXX variables are exported; if several TMPDIR types are selected, several directories are created and several TMPDIR_XXX variables are exported.
Global variable TMPDIR
For the duration of the job, a global variable TMPDIR is created, set to one of the available variables TMPDIR_LUSTRE, TMPDIR_SHM or TMPDIR_LOCAL. When the user has declared more than one TMPDIR space, the $TMPDIR variable is set to the one with the highest preference according to the order TMPDIR_LUSTRE < TMPDIR_SHM < TMPDIR_LOCAL.
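For example, for a job submitted with --gres=storage:lustre,storage:local:10G, $TMPDIR points to the local space, since TMPDIR_LOCAL has the highest preference (a sketch of what the job would see):
echo "${TMPDIR_LUSTRE}"    # /lustre/tmp/slurm/<job id>
echo "${TMPDIR_LOCAL}"     # /mnt/lscratch/slurm/<job id>
echo "${TMPDIR}"           # same path as ${TMPDIR_LOCAL} (highest preference)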
To use the default TMPDIR type for a selected partition, simply do not declare the --gres=storage:<XXX> option. For example:
$ srun -p bem2-cpu -N 1 -c 1 -t 1 --mem=1G {...}
For the duration of such a job, the following variables and directories will be created:
- TMPDIR_LUSTRE = /lustre/tmp/slurm/${SLURM_JOB_ID}, with no limit on TMPDIR directory occupancy
- TMPDIR = ${TMPDIR_LUSTRE}
The same Lustre temporary space can also be requested explicitly:
$ srun -p bem2-cpu -N 1 -c 1 -t 1 --mem=1G --gres=storage:lustre {...}
or in the sbatch script:
#!/bin/bash
#SBATCH -p bem2-cpu
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -t 1
#SBATCH --mem=1G
#SBATCH --gres=storage:lustre
{...}
For the duration of such a job, the following variables and directories will be created:
- TMPDIR_LUSTRE = /lustre/tmp/slurm/${SLURM_JOB_ID}, with no limit on TMPDIR directory occupancy
- TMPDIR = ${TMPDIR_LUSTRE}
To request local NVMe temporary space with a 10G quota on the lem-gpu partition:
$ srun -p lem-gpu -N 1 -c 1 -t 01:00:00 --mem=1G --gres=storage:local:10G {...}
or in the sbatch script:
#!/bin/bash
#SBATCH -p lem-gpu
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -t 01:00:00
#SBATCH --mem=1G
#SBATCH --gres=storage:local:10G
{...}
For the duration of such a job, the following variables and directories will be created:
- TMPDIR_LOCAL = /mnt/lscratch/slurm/${SLURM_JOB_ID}, with a 10GB limit on the TMPDIR occupancy
- TMPDIR = ${TMPDIR_LOCAL}
To request SHM temporary space with a 100G quota:
$ srun -p bem2-cpu -N 1 -c 1 -t 01:00:00 --mem=1G --gres=storage:shm:100G {...}
or in the sbatch script:
#!/bin/bash
#SBATCH -p bem2-cpu
#SBATCH -N 1
#SBATCH -c 1
#SBATCH -t 01:00:00
#SBATCH --mem=1G
#SBATCH --gres=storage:shm:100G
{...}
For the duration of such a job, the following variables and directories will be created:
- TMPDIR_SHM = /dev/shm/slurm/${SLURM_JOB_ID}, with a 100GB limit on the TMPDIR occupancy
- TMPDIR = ${TMPDIR_SHM}
In the case of such a job, the total memory requirement will be 1G + 100G = 101G, which will be taken into account when queuing the job and allocating resources on the compute nodes.
To request both Lustre and local temporary space for a multi-node job on the lem-gpu partition:
$ srun -p lem-gpu -N 2 -c 1 -t 01:00:00 --mem=1G --gres=storage:local:10G,storage:lustre {...}
or in the sbatch script:
#!/bin/bash
#SBATCH -N 2
#SBATCH -p lem-gpu
#SBATCH -c 1
#SBATCH -t 01:00:00
#SBATCH --mem=1G
#SBATCH --gres=storage:local:10G,storage:lustre
{...}
For the duration of such a job, the following variables and directories will be created:
- TMPDIR_LUSTRE = /lustre/tmp/slurm/${SLURM_JOB_ID}, with no limit on TMPDIR directory occupancy
- TMPDIR_LOCAL = /mnt/lscratch/slurm/${SLURM_JOB_ID}, with a 10GB limit on the TMPDIR occupancy
- TMPDIR = ${TMPDIR_LOCAL}
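In such a multi-node job, a common pattern (a sketch; program names and result paths are placeholders) is to keep shared files in the Lustre TMPDIR, which is visible on all nodes, and to treat the local TMPDIR as node-local scratch:
cp "${HOME}/input.dat" "${TMPDIR_LUSTRE}/"             # shared input, visible on both nodes
srun my_parallel_program "${TMPDIR_LUSTRE}/input.dat"  # placeholder program working from the shared directory
cp "${TMPDIR_LUSTRE}"/output* "${HOME}/results/"       # secure results; Lustre TMPDIR is kept for 14 days but is not backed up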