Testing torch

  1. Set up a miniconda environment as described in First Steps.
  2. Activate the conda environment and install some Python packages:
> conda activate
> pip install torch rich
  3. Save the following sbatch script as job.sbatch:
#!/bin/bash
#SBATCH --partition=study
#SBATCH --gres=gpu:1
#SBATCH --cpus-per-task=1
#SBATCH --time=01:00:00

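# Pick the GPU id assigned to this task from the comma-separated list.
CUDA_DEVICE=$(echo "$CUDA_VISIBLE_DEVICES," | cut -d',' -f $((SLURM_LOCALID + 1)))
T_REGEX='^[0-9]$'
# Abort if no valid GPU id was found, since the commands below need one.
if ! [[ "$CUDA_DEVICE" =~ $T_REGEX ]]; then
        echo "error: no reserved gpu provided" >&2
        exit 1
fi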
echo "Process $SLURM_PROCID of Job $SLURM_JOBID with the local id $SLURM_LOCALID using gpu id $CUDA_DEVICE (we may use gpu: $CUDA_VISIBLE_DEVICES on $(hostname))"
echo "computing on $(nvidia-smi --query-gpu=gpu_name --format=csv -i $CUDA_DEVICE | tail -n 1)"


python -c "import torch, os,rich;\
    print('Device count ', torch.cuda.device_count());\
    print('Is cuda available? ',torch.cuda.is_available());\
    rich.print(vars(os.environ))"
sleep 30

echo "done"
  4. Start the job with sbatch job.sbatch and print the output file:
> sbatch job.sbatch
> cat slurm-*.out
Process 0 of Job 10910 with the local id 0 using gpu id 0 (we may use gpu: 0 on servant-3.GPU.CIT-EC.NET)
computing on NVIDIA GeForce GTX 1080 Ti
Device count  1
Is cuda available?  True
{
    'encodekey': <function _createenviron.<locals>.encode at 0x7feb3e84c3a0>,
    'decodekey': <function _createenviron.<locals>.decode at 0x7feb3e84c430>,
    'encodevalue': <function _createenviron.<locals>.encode at 0x7feb3e84c3a0>,
    'decodevalue': <function _createenviron.<locals>.decode at 0x7feb3e84c430>,
    '_data': {
        b'SHELL': b'/bin/bash',
        b'SLURM_JOB_USER': b'juser',
        b'SLURM_TASKS_PER_NODE': b'1',
        b'SLURM_JOB_UID': b'4711',
        b'SLURM_TASK_PID': b'3151035',
...
}

The output file shows the graphics card used and the compute node the job ran on, followed by a full list of all environment variables.
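
The GPU lookup at the top of job.sbatch can also be done in Python, for example inside a training script. The following is only a minimal sketch that mirrors the script's logic: the function name pick_cuda_device is hypothetical and not part of Slurm or torch, and it assumes that CUDA_VISIBLE_DEVICES holds a comma-separated list of single-digit GPU ids, as the regex in the script does.

```python
import os
import re


def pick_cuda_device(visible_devices: str, local_id: int) -> str:
    """Return the GPU id for a task (hypothetical helper, mirrors job.sbatch)."""
    # Same idea as:
    #   echo "$CUDA_VISIBLE_DEVICES," | cut -d',' -f $((SLURM_LOCALID + 1))
    fields = (visible_devices + ",").split(",")
    device = fields[local_id] if 0 <= local_id < len(fields) else ""
    # Like the script's ^[0-9]$ check, accept a single digit only.
    if not re.fullmatch(r"[0-9]", device):
        raise ValueError("no reserved gpu provided")
    return device


if __name__ == "__main__":
    # Outside of a Slurm job these variables are usually unset.
    visible = os.environ.get("CUDA_VISIBLE_DEVICES", "")
    local_id = int(os.environ.get("SLURM_LOCALID", "0"))
    try:
        print("using gpu id", pick_cuda_device(visible, local_id))
    except ValueError as err:
        print("error:", err)
```

Raising an exception instead of only printing a message lets the calling code decide whether to stop or to fall back to the CPU.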