Create a miniconda environment as described in First Steps. Activate the conda environment and install some Python packages:
> conda activate
> pip install torch rich
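To verify that pip installed a CUDA-enabled build of torch rather than a CPU-only one, a quick sanity check like the following can help; torch.version.cuda prints the CUDA version the wheel was built against (or None for a CPU-only build). Note that torch.cuda.is_available() may still be False on the login node, since no GPU is reserved there.
> python -c "import torch; print(torch.__version__, torch.version.cuda)"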
Save the following sbatch script as job.sbatch:
#!/bin/bash
#SBATCH --partition=study
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00
CUDA_DEVICE=$(echo "$CUDA_VISIBLE_DEVICES," | cut -d',' -f $((SLURM_LOCALID + 1)) );
T_REGEX='^[0-9]$';
if ! [[ "$CUDA_DEVICE" =~ $T_REGEX ]]; then
    echo "error: no reserved gpu provided"
    exit 1
fi
echo "Process $SLURM_PROCID of Job $SLURM_JOBID with the local id $SLURM_LOCALID using gpu id $CUDA_DEVICE (we may use gpu: $CUDA_VISIBLE_DEVICES on $(hostname))"
echo "computing on $(nvidia-smi --query-gpu=gpu_name --format=csv -i $CUDA_DEVICE | tail -n 1)"
python -c "import torch, os,rich;\
print('Device count ', torch.cuda.device_count());\
print('Is cuda available? ',torch.cuda.is_available());\
rich.print(vars(os.environ))"
sleep 30
echo "done"
Submit the job with sbatch job.sbatch and print the output file:
> sbatch job.sbatch
> cat slurm-*.out
Process 0 of Job 10910 with the local id 0 using gpu id 0 (we may use gpu: 0 on servant-3.GPU.CIT-EC.NET)
computing on NVIDIA GeForce GTX 1080 Ti
Device count 1
Is cuda available? True
{
'encodekey': <function _createenviron.<locals>.encode at 0x7feb3e84c3a0>,
'decodekey': <function _createenviron.<locals>.decode at 0x7feb3e84c430>,
'encodevalue': <function _createenviron.<locals>.encode at 0x7feb3e84c3a0>,
'decodevalue': <function _createenviron.<locals>.decode at 0x7feb3e84c430>,
'_data': {
b'SHELL': b'/bin/bash',
b'SLURM_JOB_USER': b'juser',
b'SLURM_TASKS_PER_NODE': b'1',
b'SLURM_JOB_UID': b'4711',
b'SLURM_TASK_PID': b'3151035',
...
}
The graphics card used as well as the compute node are printed to the output file. A full list of all environment variables follows.
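Instead of dumping the whole environment with rich, individual Slurm variables can also be read directly via os.environ, for example with a one-liner like the following (run inside a job; outside of a job these variables are not set and None is printed):
> python -c "import os; print(os.environ.get('SLURM_JOBID'), os.environ.get('CUDA_VISIBLE_DEVICES'))"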