Greetings,

While playing around with NVBit, I tried a tool I was working on with some multi-GPU code. NVBit seems to hang when multiple CUDA contexts are created in parallel.
Here is a sample application that seems to hang when instrumented by any NVBit tool on a multi-GPU system:

    #include <iostream>
    #include <omp.h>
    #include <cuda.h>

    int main() {
        std::cout << "Started multi context in parallel" << std::endl;
        int num_devices = 0;
        cudaGetDeviceCount(&num_devices);
        #pragma omp parallel num_threads(num_devices)
        {
            int dev_id = omp_get_thread_num();
            std::cout << "Switching to device: " << dev_id << std::endl;
            #pragma omp barrier
            cudaSetDevice(dev_id);
            CUcontext ctx;
            cuDevicePrimaryCtxRetain(&ctx, dev_id); // or `cudaFree(0)` just to force the instantiation of the CUDA context
            #pragma omp barrier
            std::cout << "Switched to device: " << dev_id << std::endl;
        }
        std::cout << "Ended multi context in parallel" << std::endl;
        return 0;
    }

Build and run (reproducing the hang might require multiple runs):

    $ nvcc -lcuda -lcudart -Xcompiler=-fopenmp -O3 -arch=sm_75 pmcs.cu -o pmcs
    $ LD_PRELOAD=./tools/noop/noop.so ./pmcs # or any other tool within ./tools could be used

Note: noop.so is a tool which does nothing. The instrumentation functions simply return when entered.
I tried to investigate where the hang occurs. The only thing I was able to trace it to is a `cudaDeviceSynchronize` call in `./core/nvbit_tool.h` (line 82), but I think it is a symptom rather than the cause.
If this problem is known, are there any known workarounds for it?