----赖格英-----

My memory is not what it used to be, so I record the bits and pieces of my work here....


OpenMP for Fortran


  • OpenMP Directive

     

  • Syntax of OpenMP compiler directive for Fortran:
     !$OMP  DirectiveName Optional_CLAUSES...
       ...
       ... Program statements between the !$OMP lines
       ... are executed in parallel by all threads      
       ...
     !$OMP  END DirectiveName 
    

     

  • Program statements between the !$OMP directive line and its matching !$OMP END line are executed by multiple threads

  • Setting the level of parallelism in OpenMP programs

     

  • The number of threads that will be created to execute parallel sections in an OpenMP program is controlled by the environment variable OMP_NUM_THREADS

     

  • To set this environment variable use:
      export OMP_NUM_THREADS=...            
    
    Example:
    
      export OMP_NUM_THREADS=8
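
To check that the setting was picked up, a small program can report the team size. This is a minimal sketch, not part of the original notes: the program name CheckThreads is made up for illustration, and it uses the standard omp_lib module rather than the EXTERNAL declarations used in the listings on this page.

```fortran
PROGRAM CheckThreads
   USE omp_lib          ! Standard OpenMP module: declares the omp_* routines
   IMPLICIT NONE

   ! Upper bound on the team size of the next parallel region;
   ! this reflects OMP_NUM_THREADS when that variable is set.
   print *, "Max threads available:  ", omp_get_max_threads()

   !$OMP PARALLEL
   !$OMP SINGLE
   ! Exactly one thread of the team reports the actual team size
   print *, "Threads in this region: ", omp_get_num_threads()
   !$OMP END SINGLE
   !$OMP END PARALLEL

END PROGRAM CheckThreads
```

Running it once with OMP_NUM_THREADS=8 and once with OMP_NUM_THREADS=2 should show the two numbers following the environment variable, provided the runtime grants the requested team size.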
    

  • Compiling OpenMP programs
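    • Fortran (the commands below are for the Sun/Oracle Studio f90 compiler; with gfortran, replace -xopenmp -stackvar with -fopenmp)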
      • Compile:
          f90 -O -c -xopenmp -stackvar Prog.f90    
        

         

         

      • Link:
          f90 -O -o Executable \
             -xopenmp -stackvar \
             Prog1.o Prog2.o ....
        


     

     

  • Introductory Example
    • Parallel "Hello World" OpenMP program:
         PROGRAM  Main
      
         !$OMP PARALLEL
      
         print *, "Hello World !"                 
      
         !$OMP END PARALLEL
      
         END
      

       

       


    • Compile with:
          f90 -O -xopenmp -stackvar openMP01.f90

       

       

    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out

      Make sure you do it on compute.

      You will see "Hello World !" printed EIGHT times, once per thread. (Remove the two !$OMP directive lines and you get ONE line.)

       



     

     

  • Defining shared and private (non-shared) variables in parallel section

     

  • Recall:
    • Fortran 90 has no block scopes: a variable cannot be declared inside a parallel region

    Instead, Fortran uses clauses on the directive (PRIVATE, SHARED) to mark variables as private (non-shared) or shared.


     

     

  • Defining shared and private variables in a PARALLEL section
    • A variable is by default shared among all threads

       

    • A private variable in a PARALLEL section must be specified using the PRIVATE clause
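
Both kinds of variable can also be listed explicitly. The following is a minimal sketch under assumptions: the program and variable names (Scoping, base, mine) are made up for illustration, and it uses the standard omp_lib module. The DEFAULT(NONE) clause switches off the shared-by-default rule, so every variable used in the region has to appear in a SHARED or PRIVATE clause.

```fortran
PROGRAM Scoping
   USE omp_lib
   IMPLICIT NONE

   INTEGER :: base      ! listed as SHARED below; only read inside the region
   INTEGER :: mine      ! listed as PRIVATE below; one copy per thread

   base = 100

   ! With DEFAULT(NONE) the compiler rejects any variable used in the
   ! region that is not listed in a SHARED or PRIVATE clause.
   !$OMP PARALLEL DEFAULT(NONE) SHARED(base) PRIVATE(mine)
   mine = base + omp_get_thread_num()
   print *, "mine = ", mine
   !$OMP END PARALLEL

END PROGRAM Scoping
```

Each thread reads the same shared base and writes only its own copy of mine, so no synchronization is needed here.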

     

     

  • Fortran example of SHARED variable:
       PROGRAM  Main
       IMPLICIT NONE
    
       integer :: N         ! Shared
    
       N = 1001
       print *, "Before parallel section: N = ", N            
    
       !$OMP PARALLEL
       N = N + 1
       print *, "Inside parallel section: N = ", N
       !$OMP END PARALLEL
    
       print *, "After parallel section: N = ", N
       END
    

     


  • Compile with:
        f90 -O -xopenmp -stackvar openMP02a.f90

     

     

  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

    You should see that the value of N at the end is not always 1009; it can be less. The threads update the shared variable N without any synchronization (a race condition), so some of the increments are lost.
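
One way to make the update safe is the ATOMIC directive, which applies to the single assignment that follows it. This is a minimal sketch and not part of the original demo files; the program name SharedAtomic is made up for illustration.

```fortran
PROGRAM SharedAtomic
   IMPLICIT NONE

   integer :: N         ! Shared

   N = 1001

   !$OMP PARALLEL
   ! ATOMIC makes the read-modify-write of N indivisible,
   ! so no increment can be lost.
   !$OMP ATOMIC
   N = N + 1
   !$OMP END PARALLEL

   ! N is now 1001 plus the number of threads in the team
   print *, "After parallel section: N = ", N
END PROGRAM SharedAtomic
```

With a team of 8 threads this version should print 1009 on every run.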


  • Fortran example of NON-SHARED (private) variable:
       PROGRAM  Main
       IMPLICIT NONE
    
       integer :: N         ! Program variable; PRIVATE(N) below gives each thread its own copy
    
       N = 1001
       print *, "Before parallel section: N = ", N
    
       !$OMP PARALLEL PRIVATE(N)
       N = N + 1
       print *, "Inside parallel section: N = ", N
       !$OMP END PARALLEL
    
       print *, "After parallel section: N = ", N
       END
    

     


  • Compile with:
        f90 -O -xopenmp -stackvar openMP02b.f90

     

     

  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

     

     

  • Output:
        Before parallel section: N =  1001            
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        Inside parallel section: N =  1
        After parallel section: N =  1001
    

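    Each thread has its own variable N

    This variable N is different from the "program" variable defined in the main program !!!

    A PRIVATE copy is not initialized from the program variable: its starting value is undefined (in the run above each copy happened to start at 0, so every thread printed 1). Use FIRSTPRIVATE(N) to start each copy from the program variable's value.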
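
For comparison, here is a minimal sketch of the same program with the FIRSTPRIVATE clause, which gives each thread its own copy of N that starts from the value of the program variable. The program name FirstPrivate is made up for illustration.

```fortran
PROGRAM FirstPrivate
   IMPLICIT NONE

   integer :: N

   N = 1001

   ! Each thread gets a private copy of N, initialized to 1001
   !$OMP PARALLEL FIRSTPRIVATE(N)
   N = N + 1
   print *, "Inside parallel section: N = ", N    ! every thread prints 1002
   !$OMP END PARALLEL

   ! The private copies are discarded; the program variable is unchanged
   print *, "After parallel section: N = ", N     ! still 1001
END PROGRAM FirstPrivate
```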


  • OpenMP support functions

     

  • Most useful support functions in OpenMP:
    Function name                               Effect
    SUBROUTINE omp_set_num_threads(nthreads)    Set the size of the thread team for later parallel regions (invoked with CALL; nthreads is an INTEGER)
    INTEGER omp_get_num_threads()               Return the size of the current thread team
    INTEGER omp_get_max_threads()               Return the maximum size of the thread team (by default typically equal to the number of processors)
    INTEGER omp_get_thread_num()                Return the thread ID of the thread that calls this function
    INTEGER omp_get_num_procs()                 Return the number of processors
    LOGICAL omp_in_parallel()                   Return .TRUE. if currently inside an active PARALLEL region
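
The routines in this table can be tried out together. The sketch below is an illustration only: the program name SupportDemo and the requested team size of 4 are arbitrary choices, and the interfaces come from the standard omp_lib module instead of EXTERNAL declarations.

```fortran
PROGRAM SupportDemo
   USE omp_lib
   IMPLICIT NONE

   print *, "Processors:            ", omp_get_num_procs()
   print *, "In parallel (outside): ", omp_in_parallel()

   ! omp_set_num_threads is a subroutine in Fortran, so it needs CALL.
   ! It requests a team of 4 threads for the parallel regions that follow.
   CALL omp_set_num_threads(4)

   !$OMP PARALLEL
   IF (omp_get_thread_num() == 0) THEN
      ! Only thread 0 reports, so each line is printed once
      print *, "Team size:             ", omp_get_num_threads()
      print *, "In parallel (inside):  ", omp_in_parallel()
   END IF
   !$OMP END PARALLEL

END PROGRAM SupportDemo
```

A call to omp_set_num_threads takes precedence over OMP_NUM_THREADS for the regions that follow it, which makes it handy for quick experiments with different team sizes.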

     

     

  • Here is a simple OMP program in Fortran:
       PROGRAM  Main
       IMPLICIT NONE
    
       INTEGER :: nthreads, myid
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
    
    
       !$OMP PARALLEL private(nthreads, myid)
    
    
       myid = OMP_GET_THREAD_NUM()
    
       print *, "Hello I am thread ", myid
    
       if (myid == 0) then
          nthreads = OMP_GET_NUM_THREADS()
          print *, "Number of threads = ", nthreads
       end if
    
       !$OMP END PARALLEL
    
       END
    

     

     


  • Compile using the following command:
        f90 -O -xopenmp -stackvar hello.f90

     

     

  • Run with:
    • export OMP_NUM_THREADS=8
    • a.out

     

     

  • Output:
      Hello I am thread  7
      Hello I am thread  5
      Hello I am thread  1
      Hello I am thread  0
      Hello I am thread  2
      Number of threads =  8
      Hello I am thread  4
      Hello I am thread  3
      Hello I am thread  6
    

  • Caveat with Fortran
    • Recall:
      • Array indices in Fortran by default start with 1 (ONE)

         

       

       

    • Observed from "Hello" program:
      • Thread IDs start with 0 (ZERO)

         

       

       

    • Caveat:
      • Use ThreadID+1 as index to an array in Fortran !!!
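
The index shift can be seen in a small sketch in which every thread stores its ID in its own array element. The names ThreadSlots and slot are made up for illustration; the array is sized with omp_get_max_threads() on the assumption that the team is never larger than that value, which holds for a parallel region without a NUM_THREADS clause.

```fortran
PROGRAM ThreadSlots
   USE omp_lib
   IMPLICIT NONE

   INTEGER, ALLOCATABLE :: slot(:)
   INTEGER :: id

   ! One element per possible thread; valid indices are 1 .. max threads
   ALLOCATE(slot(omp_get_max_threads()))
   slot = -1                     ! -1 marks an element no thread has written

   !$OMP PARALLEL PRIVATE(id)
   id = omp_get_thread_num()     ! IDs run from 0 to team size - 1
   slot(id + 1) = id             ! shift by one to get a valid Fortran index
   !$OMP END PARALLEL

   print *, "slots: ", slot
END PROGRAM ThreadSlots
```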

         



     

     

  • Example OpenMP Program: Find minimum in an array
    • A sequential version of this program exists in C++

       

    • We will write this program using OpenMP in Fortran

       

       

    • Parallel Find Min program in Fortran:
        PROGRAM Min
         IMPLICIT NONE
      
         INTEGER, PARAMETER :: MAX = 10000000
      
         DOUBLE PRECISION, DIMENSION(MAX) :: x
         DOUBLE PRECISION, DIMENSION(10)  :: my_min   ! One slot per thread: at most 10 threads
         DOUBLE PRECISION :: rmin
      
         INTEGER :: num_threads
         INTEGER :: team_size                         ! Shared copy of the team size
         INTEGER :: i, n
         INTEGER :: id, start, stop
      
         ! ===========================================================
         ! Declare the OpenMP functions
         ! ===========================================================
         INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
      
         ! Fill x with sample data (the original listing does not show
         ! how x is set; random values are an assumption)
         CALL RANDOM_NUMBER(x)
      
        ! ===================================
        ! Parallel section: Find local minima
        ! ===================================
      !$OMP  PARALLEL  PRIVATE(i, id, start, stop, num_threads, n)
      
         num_threads = omp_get_num_threads()
         n = MAX/num_threads
      
         id = omp_get_thread_num()
      
         ! num_threads is PRIVATE and undefined after the region,
         ! so thread 0 saves the team size in a shared variable
         if ( id == 0 ) team_size = num_threads
      
         ! ----------------------------------
         ! Find my own starting index
         ! ----------------------------------
         start = id * n + 1          !! Arrays start at 1
      
         ! ----------------------------------
         ! Find my own stopping index
         ! ----------------------------------
         if ( id /= (num_threads-1) ) then
            stop = start + n - 1     !! Last index of my block of n elements
         else
            stop = MAX               !! Last thread also takes the remainder
         end if
      
         ! ----------------------------------
         ! Find my own min
         ! ----------------------------------
         my_min(id+1) = x(start)
      
         DO i = start+1, stop
            IF ( x(i) < my_min(id+1) ) THEN
               my_min(id+1) = x(i)
            END IF
         END DO
      
      !$OMP END PARALLEL
      
      
        ! ===================================
        ! Find min over the local minima
        ! ===================================
         rmin = my_min(1)
      
         DO i = 2, team_size
            IF ( my_min(i) < rmin ) THEN
               rmin = my_min(i)
            END IF
         END DO
      
         print *, "min = ", rmin
         END PROGRAM
      

       

       

    • Compile with:
          f90 -O -xopenmp -stackvar min-mt1.f90

       

       

    • Run with:
      • export OMP_NUM_THREADS=8
      • a.out


     

     

  • Mutual exclusion synchronization Primitives

     

  • Mutual exclusion in Fortran is achieved in OpenMP using the following directive:
    
       !$OMP CRITICAL
    
           ... statements are guaranteed to be executed
           ... by ONE thread at any one time
    
    
       !$OMP END CRITICAL
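
As a minimal sketch of the directive in use, the program below lets every thread increment a shared counter and print it inside one critical section. The names CriticalCounter and visits are made up for illustration.

```fortran
PROGRAM CriticalCounter
   IMPLICIT NONE

   INTEGER :: visits        ! Shared by all threads

   visits = 0

   !$OMP PARALLEL
   ! Only one thread at a time executes the two statements below,
   ! so the update and the print always belong together.
   !$OMP CRITICAL
   visits = visits + 1
   print *, "visits is now ", visits
   !$OMP END CRITICAL
   !$OMP END PARALLEL

   ! Equals the number of threads in the team
   print *, "Final visits = ", visits
END PROGRAM CriticalCounter
```

Because the increment and the print sit in the same critical section, the printed values count up 1, 2, 3, ... without gaps or repeats, although the order in which threads enter is not fixed.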
    

  • Example OpenMP program with synchronization: compute Pi

     

  • Example:
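      PROGRAM Compute_PI
       IMPLICIT NONE
    
    
       INTEGER, EXTERNAL :: OMP_GET_THREAD_NUM, OMP_GET_NUM_THREADS
    
       INTEGER           N, i
       INTEGER           id, num_threads
       DOUBLE PRECISION  w, x
       DOUBLE PRECISION  pi, mypi
    
    
       N = 50000000         !! Number of intervals
       w = 1.0d0/N          !! width of each interval
    
       pi = 0.0d0           !! Shared total; must start at zero
    
    !$OMP    PARALLEL PRIVATE(i, id, num_threads, x, mypi)
    
       num_threads = omp_get_num_threads()
       id = omp_get_thread_num()
    
       mypi = 0.0d0         !! This thread's partial sum
    
       DO i = id,   N-1,   num_threads    !! Thread id takes every num_threads-th interval
         x = w * (i + 0.5d0)
         mypi = mypi + w*f(x)
       END DO
    
    
    !$OMP CRITICAL
       pi = pi + mypi       !! One thread at a time adds its partial sum
    !$OMP END CRITICAL
    
    
    !$OMP    END PARALLEL
    
       PRINT *, "Pi = ", pi
    
       CONTAINS
    
       !! f is not shown in the original listing; the usual integrand
       !! 4/(1+x^2), whose integral over [0,1] is Pi, is assumed here.
       DOUBLE PRECISION FUNCTION f(t)
         DOUBLE PRECISION, INTENT(IN) :: t
         f = 4.0d0 / (1.0d0 + t*t)
       END FUNCTION f
    
       END PROGRAM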
    
    

     


  • Compile with:
        f90 -O -xopenmp -stackvar openMP_compute_pi2.f90

     

     

  • Run a few times with:
    • export OMP_NUM_THREADS=8
    • a.out

  • Parallel For Loop in OpenMP

    The division of labor (splitting the iterations of a DO loop among the threads) can be done in OpenMP through a special Parallel LOOP construct.

     

  • A Parallel Loop construct MUST appear within a Parallel region of the program !

     

  • The syntax of a Parallel LOOP construct in Fortran is:
    
       !$OMP    DO
    
          DO  index = ....
              ....            ! Division of labor is taken care of
                              ! by the Fortran compiler
          END DO
    
       !$OMP    END DO
    

     

     

  • The Parallel LOOP construct distributes the iterations of the DO loop among the threads of the team.

    Each iteration of the loop is executed exactly once, by exactly one of the threads.

    The loop variable used in the Parallel LOOP construct is by default PRIVATE (other variables are still by default SHARED)
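
A minimal sketch of the construct, assuming nothing beyond the syntax above (the program name ParallelDo and the array a are made up for illustration): the iterations of a loop that fills an array are divided among the threads, and each element is written by exactly one of them.

```fortran
PROGRAM ParallelDo
   IMPLICIT NONE

   INTEGER, PARAMETER :: N = 1000
   INTEGER :: i             ! Loop variable: PRIVATE by default inside !$OMP DO
   INTEGER :: a(N)          ! Shared by default

   !$OMP PARALLEL
   !$OMP DO
   DO i = 1, N
      a(i) = 2 * i          ! each iteration is run by exactly one thread
   END DO
   !$OMP END DO
   !$OMP END PARALLEL

   print *, "a(1) = ", a(1), "  a(N) = ", a(N), "  sum = ", SUM(a)
END PROGRAM ParallelDo
```

The result does not depend on the number of threads: a(1) is 2, a(N) is 2000 and the sum is 1001000, the same as for the sequential loop.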


     

     

  • Example: compute Pi with parallel DO loop
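      PROGRAM Compute_PI
       IMPLICIT NONE
    
       INTEGER           N, i
       DOUBLE PRECISION  w, x
       DOUBLE PRECISION  pi, mypi
    
    
       N = 50000000         !! Number of intervals
       w = 1.0d0/N          !! width of each interval
    
       pi = 0.0d0           !! Shared total; must start at zero
    
    !$OMP    PARALLEL PRIVATE(x, mypi)
    
       mypi = 0.0d0         !! This thread's partial sum
    
    !$OMP    DO
       DO i = 0, N-1                !! Parallel Loop
         x = w * (i + 0.5d0)
         mypi = mypi + w*f(x)
       END DO
    !$OMP    END DO
    
    
    !$OMP CRITICAL
       pi = pi + mypi       !! One thread at a time adds its partial sum
    !$OMP END CRITICAL
    
    
    !$OMP    END PARALLEL
    
       PRINT *, "Pi = ", pi
    
       CONTAINS
    
       !! f is not shown in the original listing; the usual integrand
       !! 4/(1+x^2), whose integral over [0,1] is Pi, is assumed here.
       DOUBLE PRECISION FUNCTION f(t)
         DOUBLE PRECISION, INTENT(IN) :: t
         f = 4.0d0 / (1.0d0 + t*t)
       END FUNCTION f
    
       END PROGRAM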
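
The partial sums and the critical section can also be left to OpenMP with a REDUCTION clause. The sketch below is a variation and not one of the original demo files: the program name is made up, the combined PARALLEL DO directive replaces the separate PARALLEL and DO directives, and the integrand 4/(1+x^2) is an assumption because f is not shown in the listing above.

```fortran
PROGRAM Compute_PI_Reduction
   IMPLICIT NONE

   INTEGER          :: N, i
   DOUBLE PRECISION :: w, x, pi

   N  = 50000000            ! Number of intervals
   w  = 1.0d0 / N           ! Width of each interval
   pi = 0.0d0

   ! REDUCTION(+:pi) gives every thread a private pi that starts at 0
   ! and adds all private copies into the shared pi when the loop ends.
   !$OMP PARALLEL DO PRIVATE(x) REDUCTION(+:pi)
   DO i = 0, N-1
      x  = w * (i + 0.5d0)
      pi = pi + w * 4.0d0 / (1.0d0 + x*x)    ! assumed integrand 4/(1+x^2)
   END DO
   !$OMP END PARALLEL DO

   print *, "Pi = ", pi
END PROGRAM Compute_PI_Reduction
```

The reduction does the same job as the mypi variable and the CRITICAL section in the listing above, with less code to get wrong.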
    
    

     


  • Compile with:
        f90 -O -xopenmp -stackvar openMP_compute_pi3.f90

     

     

  • Run with:
    • export OMP_NUM_THREADS=8
    • a.out

  • Final Notes

     

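  • The stack size of each thread can be controlled by setting another environment variable (csh syntax shown; STACKSIZE is read by the Sun compiler's runtime, while OpenMP 3.0 and later define the portable variable OMP_STACKSIZE):
      setenv   STACKSIZE    nBytes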
    

     

     

  • For more information on OpenMP, see: http://www.openmp.org





Posted on 2014-01-01 12:47 by 向北方