Small fix in Perlmutter GPU sbatch script (#5683)
Change in the Perlmutter GPU job script:
from `#SBATCH --cpus-per-task=16` to `#SBATCH --cpus-per-task=32`, with `OMP_NUM_THREADS` set explicitly to `16` (one thread per physical core).

This requests (v)cores in consecutive blocks, matching the GPU-to-CPU affinity on Perlmutter GPU nodes:

- GPU 3 is closest to CPU cores 0-15, 64-79
- GPU 2 to CPU cores 16-31, 80-95
- ...

With `--cpus-per-task=16`, MPI ranks 0 and 1 are mapped to cores 0 and 8.
With `--cpus-per-task=32`, MPI ranks 0 and 1 are mapped to cores 0 and 16, so each rank's block of (v)cores lines up with the cores closest to its GPU.
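
One way to double-check the resulting layout from an interactive allocation is sketched below. This is illustrative only and not part of the job script; it uses the standard Slurm `--cpu-bind=verbose,cores` option and `taskset` to print each rank's CPU affinity, which should start at (v)cores 0, 16, 32, 48 on a node.

```bash
# Sketch only, not part of this commit: print the CPU affinity of each of the
# four ranks on one node to see how they are laid out across the cores.
srun --ntasks-per-node=4 --cpus-per-task=32 --cpu-bind=verbose,cores \
     bash -c 'echo "rank ${SLURM_PROCID}: $(taskset -cp $$)"'
```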

Visual representation

![pm_gpu_vcores_mpi](https://github.com/user-attachments/assets/edf0721f-7321-49ab-bf37-4b55a7c422cc)

---------

Co-authored-by: Axel Huebl <[email protected]>
aeriforme and ax3l authored Feb 19, 2025
1 parent d38ebc7 commit 686ef38
Showing 1 changed file with 2 additions and 2 deletions.
Tools/machines/perlmutter-nersc/perlmutter_gpu.sbatch (2 additions, 2 deletions)

@@ -17,7 +17,7 @@
 # A100 80GB (256 nodes)
 #S BATCH -C gpu&hbm80g
 #SBATCH --exclusive
-#SBATCH --cpus-per-task=16
+#SBATCH --cpus-per-task=32
 # ideally single:1, but NERSC cgroups issue
 #SBATCH --gpu-bind=none
 #SBATCH --ntasks-per-node=4
@@ -34,7 +34,7 @@ export MPICH_OFI_NIC_POLICY=GPU
 
 # threads for OpenMP and threaded compressors per MPI rank
 # note: 16 avoids hyperthreading (32 virtual cores, 16 physical)
-export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK}
+export OMP_NUM_THREADS=16
 
 # GPU-aware MPI optimizations
 GPU_AWARE_MPI="amrex.use_gpu_aware_mpi=1"
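
For reference, the hard-coded `OMP_NUM_THREADS=16` keeps one OpenMP thread per physical core, since Perlmutter's AMD EPYC 7763 CPUs expose 2 hardware threads per core. A hypothetical alternative, not what this commit does, would be to derive the value from the allocation instead:

```bash
# Hypothetical alternative (not in the script): derive the OpenMP thread count
# from the Slurm allocation, skipping hyperthreads (2 hardware threads/core).
export OMP_NUM_THREADS=$(( ${SLURM_CPUS_PER_TASK:-32} / 2 ))   # 32 vcores -> 16 threads
```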
