Wednesday, April 01, 2009

using OpenMP in gcc

I'm testing out parallelization using OpenMP on an 8-core Linux box. To build xspec with OpenMP support, hmakerc has to be edited to add -fopenmp to CFLAGS, FFLAGS, and CXXFLAGS, and to add -lgomp to F77LIBS4C (I added it immediately before -lgfortran).
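For reference, the edited hmakerc lines end up looking something like the sketch below, where <existing flags> stands for whatever values are already there on your system:

CFLAGS="<existing flags> -fopenmp"
FFLAGS="<existing flags> -fopenmp"
CXXFLAGS="<existing flags> -fopenmp"
F77LIBS4C="<existing libs> -lgomp -lgfortran"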

The first test is in sumape.f, parallelizing over the individual elements when interpolating the continuum and pseudo-continuum...

C$OMP PARALLEL PRIVATE(ien,limdown,limup,ihigh,energy)
C$OMP DO SCHEDULE(DYNAMIC)
      DO ielt = 1, nmelt
         ...
      ENDDO
C$OMP END DO
C$OMP END PARALLEL

The variables declared PRIVATE give each thread its own copy, so threads don't overwrite values being set and used by other threads.
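As a standalone illustration of the race that PRIVATE prevents (hypothetical arrays and values, not the actual sumape.f code; build with gfortran -fopenmp):

      PROGRAM pdemo
C     Minimal sketch, not the sumape.f code: energy is scratch in
C     every iteration, so each thread needs its own copy of it.
      IMPLICIT NONE
      INTEGER ielt
      INTEGER nmelt
      PARAMETER (nmelt = 30)
      REAL energy
      REAL flux(nmelt)
C$OMP PARALLEL PRIVATE(energy)
C$OMP DO SCHEDULE(DYNAMIC)
      DO ielt = 1, nmelt
C        without PRIVATE(energy) a second thread could overwrite
C        energy between these two statements
         energy = 0.1 * REAL(ielt)
         flux(ielt) = energy * energy
      ENDDO
C$OMP END DO
C$OMP END PARALLEL
      WRITE(*,*) flux(1), flux(nmelt)
      END

The loop index ielt is made private automatically by the DO directive; only the extra scratch variables need to be listed.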

Timing tests show that parallelization doesn't win enough in this case: the per-element work is too small to amortize the threading overhead. Including the parallel directives slows a newpar down from 0.02 seconds to 0.4 seconds.
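The standard OpenMP way to avoid paying that overhead on small loops is the IF clause, which keeps the region serial unless the trip count is large enough to be worth the thread start-up cost. A sketch with an arbitrary threshold, not a measured xspec value, timed with omp_get_wtime:

      PROGRAM ifdemo
C     Sketch: the IF clause keeps the region serial when the loop
C     is too small to amortize thread start-up. The threshold of
C     1000 is an arbitrary illustration.
      IMPLICIT NONE
      INTEGER ielt
      INTEGER nmelt
      PARAMETER (nmelt = 30)
      REAL flux(nmelt)
      DOUBLE PRECISION t0, t1
      DOUBLE PRECISION omp_get_wtime
      EXTERNAL omp_get_wtime
      t0 = omp_get_wtime()
C$OMP PARALLEL IF(nmelt .GT. 1000)
C$OMP DO SCHEDULE(DYNAMIC)
      DO ielt = 1, nmelt
         flux(ielt) = 0.1 * REAL(ielt)
      ENDDO
C$OMP END DO
C$OMP END PARALLEL
      t1 = omp_get_wtime()
      WRITE(*,*) flux(nmelt)
      WRITE(*,*) 'elapsed (s): ', t1 - t0
      END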

3 comments:

Jeremy Sanders said...

Would it be hard to parallelize the different components in a model? This could be a big win for many models, e.g.

gsmooth(apec)+gsmooth(apec)...

Jeremy

Keith Arnaud said...

We considered that. The problem is that OpenMP uses a shared-memory model, so in your example the calculation of one apec would interfere with the calculation of the other apec. That's why I was looking at specific cases of models with loops where I can use the PRIVATE modifier to mark a small number of variables as private (and hence duplicated) across threads.
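To make the interference concrete, here is a toy sketch (hypothetical routine, not xspec code) in which the scratch COMMON block stands in for the saved state that model routines keep; two components evaluated in parallel write the same copy:

      PROGRAM clash
C     Toy sketch: two "components" run in parallel but share one
C     scratch COMMON block, so they can corrupt each other's
C     results. Build with gfortran -fopenmp.
      IMPLICIT NONE
      INTEGER ne
      PARAMETER (ne = 50)
      REAL a(ne), b(ne)
C$OMP PARALLEL SECTIONS
C$OMP SECTION
      CALL comp(1, ne, a)
C$OMP SECTION
      CALL comp(2, ne, b)
C$OMP END PARALLEL SECTIONS
C     with the race, a may hold 2.0s or b may hold 1.0s
      WRITE(*,*) a(1), b(1)
      END

      SUBROUTINE comp(id, ne, out)
C     the COMMON block stands in for the saved state many model
C     routines keep; both threads write the same copy of work
      IMPLICIT NONE
      INTEGER id, ne, i
      REAL out(ne)
      REAL work(100)
      COMMON /scratch/ work
      DO i = 1, ne
         work(i) = REAL(id)
      ENDDO
C     the other thread may rewrite work before we read it back
      DO i = 1, ne
         out(i) = work(i)
      ENDDO
      END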

For your case I think we need to follow the MPI approach, where separate processes are spawned on each core with their own dedicated memory space. However, that requires more extensive code changes. Once we have the Python interface worked out it may be possible to do this.
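A minimal sketch of that approach, with illustrative names and using the standard MPI Fortran bindings: each rank evaluates one component in its own address space, and since an additive model is just the sum of its component spectra, rank 0 collects the total with MPI_REDUCE. Build with mpif77 and run under mpirun.

      PROGRAM compsplit
C     Sketch: one hypothetical model component per MPI rank, each
C     in its own address space; rank 0 sums the component spectra.
      IMPLICIT NONE
      INCLUDE 'mpif.h'
      INTEGER ierr, rank, i
      INTEGER ne
      PARAMETER (ne = 4)
      REAL mine(ne), total(ne)
      CALL MPI_INIT(ierr)
      CALL MPI_COMM_RANK(MPI_COMM_WORLD, rank, ierr)
C     stand-in for evaluating component number rank+1
      DO i = 1, ne
         mine(i) = REAL(rank + 1) * REAL(i)
      ENDDO
C     an additive model is just the sum of its component spectra
      CALL MPI_REDUCE(mine, total, ne, MPI_REAL, MPI_SUM, 0,
     &                MPI_COMM_WORLD, ierr)
      IF (rank .EQ. 0) WRITE(*,*) total
      CALL MPI_FINALIZE(ierr)
      END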

Keith Arnaud said...

Instead of OpenMP, the pthreads library may be faster, but it requires substantial rewriting of the code.