I have a strange issue. I have a Matlab MEX function that uses OpenMP directives and functions. Before the start of a parallel section (parallel for...), I set the number of threads and print it:
nP = omp_get_num_procs();
omp_set_num_threads(nP);
mexPrintf("\n Num of threads= %d\n",nP);
.
.
.
#pragma omp parallel for shared(...)
The problem is that the output shows 'Num of threads= 12', but the parallel section that follows runs on only 1 core instead of all 12 cores of my machine. I wrote this program a long time ago and had no such issue before (it ran on all 12 cores). Recently the system got corrupted, so the OS (Win 7 Pro) was reinstalled along with a newer version of Matlab, 2011b (previously 2010b). I also installed Visual Studio 2010 Pro.
Is there anything I am missing or overlooking?
Are you calling mex functions inside the omp parallel for block? I've had the best luck extracting pointers first sequentially, then processing in parallel, and then loading results into Matlab variables sequentially at the end. That way the parallel code is pure C++, with no mex functions called (that could wait for a shared lock).
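A minimal sketch of that layout, under some assumptions: scale_kernel is a hypothetical name, the doubling operation is only a placeholder for the real work, and the input is assumed to be a single real double array. The MEX glue is wrapped in MATLAB_MEX_FILE (which the mex build is expected to define) so that the kernel can also be compiled and tested on its own.

```cpp
#include <cassert>
#include <cstddef>

// Pure C++ kernel: nothing from the mex API is called inside the parallel loop.
// scale_kernel is a hypothetical name; doubling each element is a placeholder.
void scale_kernel(const double* in, double* out, std::ptrdiff_t n) {
    assert(n == 0 || (in != nullptr && out != nullptr));
    // A signed loop index keeps the loop valid for OpenMP 2.0 compilers.
    #pragma omp parallel for
    for (std::ptrdiff_t i = 0; i < n; ++i) {
        out[i] = 2.0 * in[i];
    }
}

#ifdef MATLAB_MEX_FILE  // assumed to be defined by the mex build; skipped otherwise
#include "mex.h"

void mexFunction(int nlhs, mxArray* plhs[], int nrhs, const mxArray* prhs[]) {
    // 1. Sequential: validate the input and extract raw pointers.
    if (nrhs != 1 || !mxIsDouble(prhs[0]) || mxIsComplex(prhs[0])) {
        mexErrMsgTxt("Expected one real double array.");
    }
    const std::ptrdiff_t n =
        static_cast<std::ptrdiff_t>(mxGetNumberOfElements(prhs[0]));
    const double* in = mxGetPr(prhs[0]);

    // Allocate the output before the parallel part, still sequentially.
    plhs[0] = mxCreateDoubleMatrix(mxGetM(prhs[0]), mxGetN(prhs[0]), mxREAL);
    double* out = mxGetPr(plhs[0]);

    // 2. Parallel: only plain pointers cross into the OpenMP region.
    scale_kernel(in, out, n);

    // 3. Sequential: the results already sit in plhs[0]; any further mex
    //    calls (printing, extra outputs) would go here, after the loop.
}
#endif
```

In this sketch the output buffer is allocated up front, so the final sequential step has nothing left to copy. If the kernel wrote into its own C++ buffers instead, that step would be where they get copied into Matlab arrays.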
Of course, make sure you're actually compiling with OpenMP enabled... otherwise the directives get ignored and you end up with sequential code.
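One way to check this from inside the code is sketched below. The value printed in the question comes from omp_get_num_procs(), which reports processors, not the threads a parallel region actually gets. The helper names observed_thread_count and openmp_enabled are hypothetical. The sketch relies on the standard _OPENMP macro, which a compiler defines only when OpenMP support is switched on.

```cpp
#include <cassert>
#ifdef _OPENMP
#include <omp.h>
#endif

// True only if the compiler was invoked with OpenMP support enabled.
bool openmp_enabled() {
#ifdef _OPENMP
    return true;
#else
    return false;
#endif
}

// Number of threads in a real parallel region (1 if the pragmas are ignored).
int observed_thread_count() {
    int count = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        // Only the master thread writes, so there is no race on count.
        #pragma omp master
        count = omp_get_num_threads();
    }
#endif
    assert(count >= 1);
    return count;
}
```

In a MEX file, both values could be printed with mexPrintf after the region has finished, outside the parallel block. If openmp_enabled() returns false, the build is missing the OpenMP switch: /openmp for Visual C++, -fopenmp for GCC. How that flag is passed through mex depends on the local compiler setup.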