IBM Books

Language Reference


Introduction

The XLF Version 5.1 compiler provides a compiler option, -qsmp, which instructs the compiler to automatically parallelize Fortran DO loops. This includes both DO loops coded explicitly by the user and DO loops generated by the compiler for array language constructs (WHERE, FORALL, array assignment, ...). However, the compiler will only parallelize loops that are independent, that is, loops whose iterations can be computed independently of any other iteration.

While automatic parallelization will be sufficient for some users, the SMP directives give you the option of providing additional information about the source code to the compiler. The information you pass to the compiler will either be used during automatic parallelization or to specify that certain parts of the program can be parallelized. For example, the PARALLEL DO directive specifies that the DO loop immediately following it should be executed in parallel.
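
As a rough sketch of this distinction (the arrays A, B, and C and the bound N are illustrative names, not taken from any particular program), the first loop below has iterations that do not depend on one another, so it is a candidate for automatic parallelization under -qsmp. The second loop is explicitly marked with the PARALLEL DO directive:

      ! Candidate for automatic parallelization: no iteration reads
      ! a value written by another iteration.
      DO I = 1, N
        A(I) = B(I) + C(I)
      END DO

      ! Explicitly requested: the directive applies to the DO loop
      ! that immediately follows it.
!SMP$ PARALLEL DO
      DO I = 1, N
        C(I) = 2.0 * B(I)
      END DO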

These are the directives which have been added for XLF Version 5.1:

Noncomment Form Directives

Format



>>-directive---------------------------------------------------><
 

directive
is one of the following directives:
EJECT - See EJECT
INCLUDE - See INCLUDE
@PROCESS - See @PROCESS

Rules

Noncomment form directives are always recognized by the compiler. They cannot be continued.

Additional statements cannot be included on the same line as a directive.

Source format rules concerning white space apply to directive lines.

Comment Form Directives

Format



>>-trigger_head---trigger_constant---directive-----------------><
 

trigger_head
is one of !, *, C or c for fixed source form and ! for free source form.

trigger_constant
is IBM* by default. If the -qsmp compiler option has been specified, IBM*, IBMT, SMP$, $OMP, and IBMP are recognized by default. The trigger_constant can be defined by the user with the -qdirective compiler option. See the User's Guide for more details.

directive
is one of the following directives:
ASSERT - See ASSERT
CNCALL - See CNCALL
CRITICAL - See CRITICAL / END CRITICAL
END CRITICAL - See CRITICAL / END CRITICAL
END PARALLEL SECTIONS - See PARALLEL SECTIONS / END PARALLEL SECTIONS
INDEPENDENT - See INDEPENDENT
PARALLEL DO - See PARALLEL DO
PARALLEL SECTIONS - See PARALLEL SECTIONS / END PARALLEL SECTIONS
PERMUTATION - See PERMUTATION
SCHEDULE - See SCHEDULE
SOURCEFORM - See SOURCEFORM
THREADLOCAL - See THREADLOCAL

Rules

The default value for the trigger_constant is IBM*. Comment form directives using IBM* as the trigger_constant are always recognized by the compiler.

With the exception of directives that use the default trigger_constant, all comment form directives are treated as comments by the compiler unless the appropriate trigger_constant has been defined using the -qdirective compiler option. As a result, code containing these directives can be ported to non-SMP environments.

When compiling using either the xlf_r or xlf90_r invocation commands, the option -qdirective=IBM*:IBMT is turned on by default. If the -qsmp compiler option is used in conjunction with one of these invocation commands, the option -qdirective=IBM*:SMP$:$OMP:IBMP:IBMT is turned on by default. You can specify an alternate trigger_constant with the -qdirective compiler option. See the -qdirective compiler option in the User's Guide for more details.

XLF supports some features of the OpenMP specification. In particular, XLF has partial support for the CRITICAL, END CRITICAL, PARALLEL DO, PARALLEL SECTIONS, SECTION, and END PARALLEL SECTIONS directives. To ensure the greatest portability of code, we recommend that you use these directives whenever possible. These directives should be used with the OpenMP trigger_constant, $OMP; this trigger_constant should not be used with any other directive.

XLF also includes the trigger_constants IBMP and IBMT. IBMP is recognized if you compile using the -qsmp compiler option and is recommended for use with the SCHEDULE directive. IBMT is recognized if you compile using the -qthreaded compiler option (which is the default for the xlf_r or xlf90_r invocation commands) and is recommended for use with the THREADLOCAL directive.

XLF includes some directives that are also provided by other vendors. If you use these directives in your code, you can enable the trigger_constant that the vendor has selected by specifying it with the -qdirective compiler option. Refer to the -qdirective compiler option in the User's Guide for details on specifying alternative trigger_constants.

A directive can be specified as a free source form or fixed source form comment, depending on the current source form.

The trigger_head follows the rules of comment lines either in Fortran 90 free source form or fixed source form. If the trigger_head is !, it does not have to be in column 1. There must be no blanks between the trigger_head and the trigger_constant.

The directive_trigger (defined as the trigger_head combined with the trigger_constant; !IBM*, for example) and any directive keywords can be specified in uppercase, lowercase, or mixed case.
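
To illustrate the case and column rules, any one of the following lines would be read as the same directive, assuming SMP$ is an active trigger_constant. The three lines are alternatives, not a sequence to be coded together, and the indented form relies on the rule that a ! trigger_head need not be in column 1:

!SMP$ INDEPENDENT
!smp$ independent
    !Smp$ Independent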

You can specify inline comments on directive lines.

!SMP$ INDEPENDENT, NEW(i)    !This is a comment

A directive cannot follow another statement or another directive on the same line.

All comment form directives can be continued. A directive cannot be embedded within a continued statement, nor may a statement be embedded within a continued directive.

The directive_trigger must be specified on all continuation lines. However, the directive_trigger on a continuation line need not be identical to the directive_trigger used in the continued line. For example:

!SMP$ INDEPENDENT                          &
!IBM*&  , REDUCTION (X)                    &
!SMP$&  , NEW (I)
is equivalent to:
!SMP$ INDEPENDENT, REDUCTION (X), NEW (I)
provided both IBM* and SMP$ are active trigger_constants.

For more information, see "Lines and Source Formats".

Fixed Source Form Rules

If the trigger_head is one of C, c, or *, it must be in column 1.

The maximum length of the trigger_constant in fixed source form is 4 for directives which are continued on one or more lines. This rule applies to the continued lines only and not to the initial line. Otherwise, the maximum length of the trigger_constant is 15. We recommend that initial line triggers should have a maximum length of 4. The maximum allowable length of 15 is permitted for the purposes of backwards compatibility.

The first line of a comment directive must have either white space or a zero in column 6 if the trigger_constant has a length of 4 or less. Otherwise, the character in column 6 is part of the trigger_constant.

The directive_trigger of a continuation line of a comment directive must appear in columns 1-5. Column 6 of a continuation line must have a character that is neither white space nor a zero.
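
A minimal fixed source form sketch of these column rules, assuming SMP$ is an active trigger_constant (X and I are placeholder variable names). The initial line leaves column 6 blank. Each continuation line has the directive_trigger in columns 1-5 and a character that is neither white space nor a zero in column 6:

!SMP$ INDEPENDENT
!SMP$1, REDUCTION (X)
!SMP$2, NEW (I)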

For more information, see "Fixed Source Form".

Free Source Form Rules

The maximum length of the trigger_constant is 15.

An ampersand (&) at the end of a line indicates the directive is continued. When you continue a directive line, a directive_trigger must appear at the beginning of all continuation lines. If you are beginning a continuation line with an ampersand, the directive_trigger must precede the ampersand. For example:

!IBM* INDEPENDENT                 &
!SMP$&  , REDUCTION (X)           &
!IBM*&  , NEW (I)

For more information, see "Fortran 90 Free Source Form".

This chapter describes the following directives:

ASSERT

Purpose

The ASSERT directive provides information to the compiler about the characteristics of DO loops. This assists the compiler in optimizing the source code.

The ASSERT directive only takes effect if either the -qsmp or -qhot compiler option is specified.

Format



>>-ASSERT--(--assertion_list--)--------------------------------><
 

assertion
is ITERCNT(n) or NODEPS. These arguments are not mutually exclusive and both can be specified for the same DO loop. At most one of each argument is permitted for the same DO loop.

ITERCNT(n)
where n specifies the number of iterations for a given DO loop. n must be a positive, scalar, integer initialization expression.

NODEPS
specifies that no loop-carried dependencies exist within a given DO loop.

Rules

The first noncomment line (not including other directives) following the ASSERT directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The ASSERT directive applies only to the DO loop immediately following the directive and not to any nested DO loops.

ITERCNT provides an estimate to the compiler about roughly how many iterations the DO loop will typically execute. There is no requirement that the value be accurate; ITERCNT will only affect performance, never correctness.
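
The following sketch shows ITERCNT used purely as a hint. The subroutine name SCALEA, the array A, and the estimate of 1000 are illustrative assumptions. The loop produces the same results whatever the actual value of N is:

      SUBROUTINE SCALEA (A, N)
        INTEGER N
        REAL A(N)
        ! N is expected to be near 1000 in typical calls, but any
        ! value of N is still computed correctly.
!SMP$   ASSERT (ITERCNT(1000))
        DO I = 1, N
          A(I) = 2.0 * A(I)
        END DO
      END SUBROUTINE SCALEA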

When NODEPS is specified, the user is explicitly declaring to the compiler that no loop-carried dependencies exist within the DO loop or any procedures invoked from within the DO loop. A loop-carried dependency involves two iterations within a DO loop interfering with one another. Interference occurs in the following situations:

While it is possible for two complementary ASSERT directives to apply to any given DO loop, an ASSERT directive cannot be followed by a contradicting ASSERT directive for a given DO loop:

   !SMP$ ASSERT (ITERCNT(10))
   !SMP$ INDEPENDENT, REDUCTION (A)
   !SMP$ ASSERT (ITERCNT(20))     ! invalid
         DO I = 1, N
             A(I) = A(I) * I
         END DO
In the example above, the ASSERT(ITERCNT(20)) directive contradicts the ASSERT(ITERCNT(10)) directive and is invalid.

The ASSERT directive overrides the -qassert compiler option for the DO loop on which the ASSERT directive is specified.

Examples

Example 1:

! An example of the ASSERT directive with NODEPS.
         PROGRAM EX1
           INTEGER A(100)
  !SMP$    ASSERT (NODEPS)
           DO I = 1, 100
             A(I) = A(I) * FNC1(I)
           END DO
         END PROGRAM EX1
 
         FUNCTION FNC1(I)
           FNC1 = I * I
         END FUNCTION FNC1

Example 2:

! An example of the ASSERT directive with NODEPS and ITERCNT.
         SUBROUTINE SUB2 (N)
           INTEGER A(N)
    !SMP$  ASSERT (NODEPS,ITERCNT(100))
           DO I = 1, N
             A(I) = A(I) * FNC2(I)
           END DO
         END SUBROUTINE SUB2
 
         FUNCTION FNC2 (I)
           FNC2 = I * I
         END FUNCTION FNC2

Related Information

CNCALL

Purpose

When the CNCALL directive is placed before a DO loop, the user is explicitly declaring to the compiler that no loop-carried dependencies exist within any procedure called from the DO loop.

The CNCALL directive only takes effect if either the -qsmp or -qhot compiler option is specified.

Format



>>-CNCALL------------------------------------------------------><
 

Rules

The first noncomment line (not including other directives) following the CNCALL directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The CNCALL directive applies only to the DO loop immediately following the directive and not to any nested DO loops.

When the CNCALL directive is specified, the user is explicitly declaring to the compiler that no procedures invoked within the DO loop have any loop-carried dependencies. If the DO loop invokes a procedure, separate iterations of the loop must be able to concurrently call upon that procedure. The CNCALL directive does not assert that other operations in the loop are free of dependencies; it is only an assertion about procedure references.

A loop-carried dependency occurs when two iterations within a DO loop interfere with one another. See ASSERT for the definition of interference.
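
As a hedged illustration of what CNCALL does and does not cover, compare the two hypothetical functions below (SQUARE, NEXTID, and NCALLS are names invented for this sketch). A loop that calls only SQUARE could reasonably carry a CNCALL directive. A loop that calls NEXTID could not, because every call updates the same saved variable, so calls made from different iterations would interfere with one another:

      ! Safe to call concurrently: the result depends only on the
      ! argument, and no shared data is modified.
      FUNCTION SQUARE (I)
        SQUARE = I * I
      END FUNCTION SQUARE

      ! Not safe under CNCALL: each call modifies the saved variable
      ! NCALLS, which is shared by all iterations of the calling loop.
      FUNCTION NEXTID (I)
        INTEGER, SAVE :: NCALLS = 0
        NCALLS = NCALLS + 1
        NEXTID = NCALLS + I
      END FUNCTION NEXTID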

Examples

! An example of CNCALL where the procedure invoked has
! no loop-carried dependency but the code within the  
! DO loop itself has a loop-carried dependency.
         PROGRAM EX3
           INTEGER A(100)
    !SMP$  CNCALL
           DO I = 2, 100
             A(I) = A(I) * FNC3(I)
             A(I) = A(I) + A(I-1)    ! This has loop-carried dependency
           END DO
         END PROGRAM EX3
 
         FUNCTION FNC3 (I)
           FNC3 = I * I
         END FUNCTION FNC3

Related Information

CRITICAL / END CRITICAL

Purpose

The CRITICAL construct allows you to define independent blocks of code that are to be executed by at most one thread at a time. The CRITICAL construct includes a CRITICAL directive followed by a block of code and ends with an END CRITICAL directive.

The CRITICAL and END CRITICAL directives only take effect if the -qsmp compiler option is specified.

Format



>>-CRITICAL--+------------------+------------------------------><
             +-(--lock_name--)--+
 

>>-block-------------------------------------------------------><
 

>>-END CRITICAL--+------------------+--------------------------><
                 +-(--lock_name--)--+
 

lock_name
provides a way of distinguishing different CRITICAL constructs of code.

block
represents the block of code to be executed by at most one thread at a time.

Rules

The optional lock_name is a name with global scope. The lock_name must not be used to identify any other global entity in the same executable program.

If the lock_name is specified on the CRITICAL directive, the same lock_name must also be specified on the corresponding END CRITICAL directive.

If the same lock_name is specified for more than one CRITICAL construct, the compiler will allow only one thread to execute any one of these CRITICAL constructs at any one time. If multiple CRITICAL constructs have differing lock_names, the compiler will allow those constructs to run in parallel.

All CRITICAL constructs which do not have an explicit lock_name specified are protected by the same lock. In other words, these CRITICAL constructs will be assigned the same lock_name by the compiler, thereby ensuring that only one thread enters any unnamed CRITICAL construct at a time.
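
A sketch of how distinct lock names behave, assuming illustrative variables X, Y, T, and A and the invented lock names LOCK_X and LOCK_Y. One thread may be inside the LOCK_X construct while another is inside the LOCK_Y construct, but no two threads may be inside the same named construct at once. If both constructs were left unnamed, they would share a single lock and exclude each other as well:

!SMP$ PARALLEL DO PRIVATE (T)
      DO I = 1, 100
        T = A(I) * 2.0
!SMP$   CRITICAL (LOCK_X)
          X = X + T
!SMP$   END CRITICAL (LOCK_X)
!SMP$   CRITICAL (LOCK_Y)
          Y = Y + T * T
!SMP$   END CRITICAL (LOCK_Y)
      END DO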

The lock_name must not be the same as a class 1 local entity as defined under the heading "The Scope of a Name".

It is illegal to branch into or out of a CRITICAL construct. The CRITICAL construct must not refer to procedures compiled with the -qsmp=auto compiler option.

The CRITICAL construct may appear anywhere in a program.

The CRITICAL construct must not contain a PARALLEL DO directive or a PARALLEL SECTIONS construct. The CRITICAL construct must not refer to procedures containing either a PARALLEL DO directive or a PARALLEL SECTIONS construct.

Although it is possible to nest a CRITICAL construct within a CRITICAL construct, doing so is not advisable because a deadlock may result.

Examples

Example 1: Note that in this example the CRITICAL construct appears within a DO loop which has been marked with the PARALLEL DO directive.

      EXPR=0
!SMP$ PARALLEL DO PRIVATE (I)
      DO I = 1, 100
!SMP$   CRITICAL
          EXPR = EXPR + A(I) * I
!SMP$   END CRITICAL
      END DO

Example 2: An example specifying a lock_name on the CRITICAL construct.

!SMP$ PARALLEL DO PRIVATE(T)
      DO I = 1, 100
        T = B(I) * B(I-1)
!SMP$   CRITICAL (LOCK)
          SUM = SUM + T
!SMP$   END CRITICAL (LOCK)
      END DO

Related Information

EJECT

Purpose

EJECT directs the compiler to start a new full page of the source listing. If no source listing has been requested, this directive is ignored.

Format



>>-EJECT-------------------------------------------------------><
 

Rules

The EJECT compiler directive can have an inline comment and a label. However, if a statement label is specified, the compiler discards it. Therefore, you must not reference any label on an EJECT directive. An example of usage would be to put an EJECT directive before the start of an important DO loop that you do not want to split across pages in the listing. If you send the source listing to a printer, the EJECT directive provides a page break.
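
For instance, a minimal sketch of the usage described above (the loop body and the names A, B, and N are illustrative only):

      EJECT                    ! start a new page of the source listing
      DO I = 1, N
        A(I) = A(I) + B(I)
      END DO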

INCLUDE

Purpose

The INCLUDE compiler directive inserts a specified statement or a group of statements into a program unit.

Format



>>-INCLUDE--+-char_literal_constant-+-+----+-------------------><
            +-(--name--)------------+ +-n--+
 

name, char_literal_constant (delimiters are optional)
specifies filename, the name of an include file

Under the AIX operating system, filename need not specify the full path of the desired file, but it must specify the file extension if one exists.

name must contain only characters allowable in the XL Fortran character set. See "Characters" for the character set supported by XL Fortran.

char_literal_constant is a character literal constant.

n
is the value the compiler uses to decide whether to include the file during compilation. It can be any number from 1 through 255, and cannot specify a kind type parameter. If you specify n, the compiler includes the file only if the number appears as a suboption in the -qci (conditional include) compiler option. If you do not specify n, the compiler always includes the file.

A feature called conditional INCLUDE provides a means for selectively activating INCLUDE compiler directives within the Fortran source during compilation. You specify the included files by means of the -qci compiler option.

In fixed source form, the INCLUDE compiler directive must start after column 6, and can have a label.

An inline comment can be added to the line.
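
A short sketch of conditional INCLUDE, assuming the hypothetical file names common.inc and debug.inc and an arbitrary number 2 chosen for illustration:

      INCLUDE 'common.inc'     ! no n specified: always included
      INCLUDE 'debug.inc' 2    ! included only when 2 appears as a
                               ! suboption of the -qci compiler option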

Rules

An included file can contain any complete Fortran source statements and compiler directives, including other INCLUDE compiler directives. Recursive INCLUDE compiler directives are not allowed. An END statement can be part of the included group. The first included line must not be a continuation line, nor can the last included line be continued. The statements in the include file are processed with the source form of the including file.

If the SOURCEFORM directive appears in an include file, the source form reverts to that of the including file once processing of the include file is complete. After the inclusion of all groups, the resulting Fortran program must follow all of the Fortran rules for statement order.

For an INCLUDE compiler directive with the left and right parentheses syntax, XL Fortran translates the file name to lowercase unless the -qmixed compiler option is on.

The AIX file system locates the file specified by filename as follows:

Examples

INCLUDE '/u/userid/dc101'     ! full absolute file name specified
INCLUDE '/u/userid/dc102.inc' ! INCLUDE file name has an extension
INCLUDE 'userid/dc103'        ! relative path name specified
INCLUDE (ABCdef)              ! includes file abcdef
INCLUDE '../Abc'              ! includes file Abc from parent directory
                              ! of directory being searched

Related Information

"-qci Option" in the User's Guide

INDEPENDENT

Purpose

The INDEPENDENT directive, if used, must precede a DO loop, FORALL statement, or FORALL construct. This directive specifies that each operation in the FORALL statement or FORALL construct, or each iteration of the DO loop, can be executed in any order without affecting the semantics of the program.

The INDEPENDENT directive only takes effect if either the -qsmp or -qhot compiler option is specified.

Format



                +---------------------------------------------+
                V                                             |
>>-INDEPENDENT----+------------------------------------------++-><
                  +-,--NEW--(--named_variable_list--)--------+
                  +-,--REDUCTION--(--named_variable_list--)--+
 

Rules

The first noncomment line (not including other directives) following the INDEPENDENT directive must be a DO loop, FORALL statement, or the first statement of a FORALL construct. This line cannot be an infinite DO or DO WHILE loop. The INDEPENDENT directive applies only to the DO loop immediately following the directive and not to any nested DO loops.

An INDEPENDENT directive can have at most one NEW clause and at most one REDUCTION clause.

If the directive applies to a DO loop, no iteration of the loop can interfere with any other iteration. Interference occurs in the following situations:

If the NEW clause is specified, the directive must apply to a DO loop. The NEW clause modifies the directive and any surrounding INDEPENDENT directives by accepting any assertions made by such directive(s) as true even if the variables specified in the NEW clause are modified by each iteration of the loop. Variables specified in the NEW clause behave as if they are private to the body of the DO loop. That is, the program is unaffected if these variables (and any variables associated with them) were to become undefined both before and after each iteration of the loop.

Any variable specified in the NEW clause or REDUCTION clause must not:

For FORALL, no combination of index values affected by the INDEPENDENT directive assigns to an atomic storage unit that is required by another combination. If a DO loop, FORALL statement, and FORALL construct have the same body and each is preceded by an INDEPENDENT directive, they behave the same way.

The REDUCTION clause asserts that the named variables are updated within REDUCTION statements in the INDEPENDENT loop. Furthermore, the intermediate values of the REDUCTION variables are not used within the loop, other than in the updates themselves. Thus, the value of the REDUCTION variable after the loop is the result of a reduction tree.

If the REDUCTION clause is specified, the directive must apply to a DO loop. The only reference to a REDUCTION variable in an INDEPENDENT DO loop must be within a reduction statement.

A REDUCTION variable must be of intrinsic type but must not be of type character. A REDUCTION variable must not be an allocatable array.

A REDUCTION variable must not occur in:

A reduction statement is:




>>---reduction_var_ref = expr---reduction_op---reduction_var_ref---><


>>---reduction_var_ref = reduction_var_ref---reduction_op---expr---><


>>-reduction_var_ref = reduction_function--(expr,--reduction_var_ref)-><


>>-reduction_var_ref = reduction_function--(reduction_var_ref,--expr)-><

where:

reduction_var_ref
is a variable or subobject of a variable that appeared in a REDUCTION clause

reduction_op
is one of: +, -, *, .AND., .OR., .EQV., .NEQV., .XOR.

reduction_function
is one of: MAX, MIN, IAND, IOR, IEOR

The following rules apply to reduction statements:

  1. A reduction statement is an assignment statement that occurs in the range of an INDEPENDENT DO loop. A variable in the REDUCTION clause must only occur in a reduction statement within the INDEPENDENT DO loop.

  2. The two reduction_var_refs that appear in a reduction statement must be lexically identical.

  3. The syntax of the INDEPENDENT directive does not allow an array element or array section to be designated as a REDUCTION variable in the REDUCTION clause. Although such a subobject may occur in a reduction statement, it is the entire array that is treated as a REDUCTION variable.

  4. The following form of the reduction statement is not allowed:



    >>-reduction_var_ref-- = --expr-- - --reduction_var_ref--------><
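
The permitted forms can be sketched as follows, assuming illustrative REDUCTION variables S, P, and BIG and an input array A. Each statement in the loop matches one of the reduction statement forms shown earlier:

!IBM*  INDEPENDENT, REDUCTION (S, P, BIG)
       DO I = 1, N
         S = S + A(I)            ! reduction_var_ref op expr
         P = A(I) * P            ! expr op reduction_var_ref
         BIG = MAX (BIG, A(I))   ! reduction_function form
       END DO

By contrast, a statement such as S = A(I) - S falls under rule 4 and is not a valid reduction statement, whereas S = S - A(I) is.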
     
    

Examples

Example 1:

       INTEGER A(10),B(10,12),F
!IBM*  INDEPENDENT                    ! The NEW clause cannot be
       FORALL (I=1:9:2) A(I)=A(I+1)   ! specified before a FORALL
!IBM*  INDEPENDENT, NEW(J)
       DO M=1,10
         J=F(M)                       ! 'J' is used as a scratch
         A(M)=J*J                     ! variable in the loop
!IBM*    INDEPENDENT, NEW(N)
         DO N=1,12                    ! The first executable statement
           B(M,N)=M+N*N               ! following the INDEPENDENT must
         END DO                       ! be either a DO or FORALL
       END DO
       END

Example 2:

       X=0
!IBM*  INDEPENDENT, REDUCTION(X)
       DO J = 1, M
         X = X + J**2
       END DO

Example 3:

       INTEGER A(100), B(100, 100)
!SMP$  INDEPENDENT, REDUCTION(A), NEW(J)   ! Example showing an array used
       DO I=1,100                          ! for a reduction variable
         DO J=1, 100
           A(I)=A(I)+B(J, I)
         END DO
       END DO

Related Information

PARALLEL DO

Purpose

The PARALLEL DO directive provides a means of specifying which loops should be parallelized by the compiler.

The PARALLEL DO directive only takes effect if the -qsmp compiler option is specified.

Format



                +-+--+----------------------------+
                | +--+                            |
                V                                 |
>>-PARALLEL DO----+-----------------------------+-+------------><
                  +-+---+---parallel_do_clause--+
                    +-,-+
 

where parallel_do_clause is:



>>-+-IF--(--scalar_logical_expr--)-----------------------+-----><
   +-LASTPRIVATE--(--named_variable_list--)--------------+
   +-PRIVATE--(--named_variable_list--)------------------+
   +-REDUCTION--(-+----------+---named_variable_list--)--+
   |              +-op_fnc :-+                           |
   +-SCHEDULE--(--sched_type-+----+---)------------------+
   |                         +-,n-+                      |
   +-SHARED--(--named_variable_list--)-------------------+
 

IF(scalar_logical_expr)
Performs a run-time test to choose between executing the loop in serial or parallel. If scalar_logical_expr is true, then the iterations of the DO loop are executed in parallel. Otherwise, the iterations of the DO loop are executed serially.

LASTPRIVATE(named_variable_list)
Each variable in named_variable_list is a PRIVATE variable whose last iteration value is used outside the loop.

PRIVATE(named_variable_list)
Each iteration of the loop has its own uninitialized local copy of the variables in named_variable_list.

REDUCTION([op_fnc :] named_variable_list)

op_fnc
is a reduction_op or a reduction_function.
If op_fnc is specified for the REDUCTION clause, each variable in the named_variable_list must be scalar. Variables in named_variable_list can only occur in reduction statements. See the syntax of reduction statements for more information. op_fnc must be specified if the directive uses the trigger_constant $OMP.

SCHEDULE(sched_type[,n])

sched_type
is one of AFFINITY, DYNAMIC, GUIDED, RUNTIME, or STATIC

n
must be a positive scalar integer expression; it must not be specified for the RUNTIME sched_type. See SCHEDULE for definitions of these scheduling types. If you are using the trigger_constant $OMP, the scheduling type AFFINITY should not be specified.

SHARED(named_variable_list)
All iterations of the loop use the same copy of the variables specified in named_variable_list.
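
A sketch that combines several of these clauses on one directive. The threshold of 1000, the value 16 given for n, and the names A, B, T, and N are assumptions made for illustration. The loop runs in parallel only when the IF test is true at run time:

!SMP$ PARALLEL DO IF (N .GT. 1000), PRIVATE (T), SHARED (A, B)    &
!SMP$&  , SCHEDULE (STATIC, 16)
      DO I = 1, N
        T = A(I) * A(I)        ! T is scratch storage for one iteration
        B(I) = T + 1.0         ! each iteration writes a distinct element
      END DO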

Definitions:

lexical extent
The lexical extent of a PARALLEL DO directive includes the corresponding DO loop and the code that is enclosed in this DO loop.

dynamic extent
The dynamic extent of a PARALLEL DO directive includes the lexical extent and subprograms called from within the lexical extent.

Rules

The first noncomment line (not including other directives) following the PARALLEL DO directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The PARALLEL DO directive applies only to the DO loop immediately following the directive and not to any nested DO loops.

No iteration of the DO loop can interfere with any other iteration, unless the interference occurs within a CRITICAL construct. For more information, see the definition of interference outside a CRITICAL construct.

The PARALLEL DO directive must not be followed by another PARALLEL DO directive. Only one PARALLEL DO directive may be specified for a given DO loop.

The PARALLEL DO directive must not appear with the INDEPENDENT directive for a given DO loop.
Note: The INDEPENDENT directive allows you to keep your code common with HPF implementations. The PARALLEL DO directive should be used for maximum portability across multiple vendors. The PARALLEL DO directive is a prescriptive directive while the INDEPENDENT directive is an assertion about the characteristics of the loop. See the INDEPENDENT directive for more information.
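
As an illustration of the difference, the same loop is written both ways below (A, B, T, and N are placeholder names). The two forms are alternatives and must not be combined on a single loop:

      ! Assertion: the iterations do not interfere, and T is
      ! scratch storage within each iteration.
!SMP$ INDEPENDENT, NEW (T)
      DO I = 1, N
        T = A(I) * A(I)
        B(I) = T + 1.0
      END DO

      ! Prescription: execute this loop in parallel, with a
      ! private copy of T for each iteration.
!SMP$ PARALLEL DO PRIVATE (T)
      DO I = 1, N
        T = A(I) * A(I)
        B(I) = T + 1.0
      END DO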

A variable should be specified with the PRIVATE attribute if its value is used during the calculation of a single iteration of a loop, and that value is not dependent on any other iteration of the loop. Copies of the PRIVATE variable exist, locally, on each thread. Each iteration of the loop receives its own uninitialized copy of the PRIVATE variable. A PRIVATE variable has an undefined value or association status on entry to, and exit from, the loop. All DO loop iteration variables within the dynamic extent of the PARALLEL DO directive are given the PRIVATE attribute by default.

Local variables without the SAVE or STATIC attributes in referenced subprograms in the dynamic extent of a PARALLEL DO directive have an implicit PRIVATE attribute. Common blocks and modules in referenced subprograms in the dynamic extent of a PARALLEL DO directive have an implicit SHARED attribute, unless they are THREADLOCAL common blocks.
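
A sketch of these implicit attributes, using the invented names DYNEXT, WORK, TOTALS, HITS, and TMP. The subroutine WORK lies in the dynamic extent of the PARALLEL DO directive because it is called from within the loop:

      PROGRAM DYNEXT
        COMMON /TOTALS/ HITS(100)
!SMP$   PARALLEL DO
        DO I = 1, 100
          CALL WORK (I)
        END DO
      END PROGRAM DYNEXT

      SUBROUTINE WORK (I)
        COMMON /TOTALS/ HITS(100)   ! common block: implicitly SHARED
        REAL TMP                    ! no SAVE or STATIC: implicitly PRIVATE
        TMP = REAL (I) * 0.5
        HITS(I) = TMP               ! each call writes a distinct element
      END SUBROUTINE WORK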

If one of the entities involved in an asynchronous I/O operation is a PRIVATE variable or a subobject of a PRIVATE variable, the matching WAIT statement must be executed before the end of the iteration.

If there is a call to an MPI routine which does non-blocking communication in a parallel loop, no arguments to the MPI routine should be PRIVATE or LASTPRIVATE.

The LASTPRIVATE clause functions in a manner similar to the PRIVATE clause and should be specified for variables that match the same criteria. The exception is the status of the variable upon exit from the loop. The compiler determines the value of the variable at the final iteration, and takes a copy of that value. The copy of the value is then saved in the named variable for use after the loop. A LASTPRIVATE variable is undefined on entry into the loop. If the last iteration does not define a value then the LASTPRIVATE variable is undefined after the loop.
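
A minimal sketch of LASTPRIVATE, assuming illustrative names X, A, B, and N, with N at least 1. Every iteration defines X, so the final iteration does too, and X is defined after the loop:

!SMP$ PARALLEL DO LASTPRIVATE (X)
      DO I = 1, N
        X = A(I) * 2.0
        B(I) = X
      END DO
      ! X now holds the value assigned in the final iteration (I = N)
      PRINT *, X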

A variable which appears in the PRIVATE or LASTPRIVATE clause of an inner DO loop must also appear in the PRIVATE or LASTPRIVATE clause of all enclosing DO loops which have the PARALLEL DO directive specified, and of all enclosing PARALLEL SECTIONS constructs. This includes both the lexical extent and the dynamic extent of the PARALLEL DO directive and the PARALLEL SECTIONS construct.
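
This nesting rule can be sketched as follows (A, B, T, N, and M are placeholder names). Because T is PRIVATE on the inner loop, it is also listed as PRIVATE on the enclosing PARALLEL DO loop:

!SMP$ PARALLEL DO PRIVATE (T)
      DO I = 1, N
!SMP$   PARALLEL DO PRIVATE (T)
        DO J = 1, M
          T = A(J, I)
          B(J, I) = T * T
        END DO
      END DO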

The REDUCTION clause specifies named variables that appear in reduction operations. The compiler will maintain local copies of such variables, but will combine them at loop exit. The intermediate values of the REDUCTION variables are combined in random order, dependent on which threads finish their calculations first. There is, therefore, no guarantee that bit-identical results will be obtained from one parallel run to another, even if the parallel runs use the same number of threads and the same scheduling type and chunk size.
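
A short sketch of a sum reduction written with the $OMP trigger_constant, for which op_fnc must be specified (TOTAL, A, and N are illustrative names):

      TOTAL = 0.0
!$OMP PARALLEL DO REDUCTION (+: TOTAL)
      DO I = 1, N
        TOTAL = TOTAL + A(I)
      END DO

Because the partial sums are combined in an order that can vary between runs, a floating-point TOTAL may differ in its low-order bits from one run to the next.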

The SHARED clause specifies variables that must be available to all threads. If a variable is specified as SHARED, the user is stating that all iterations of the loop can safely share a single copy of the variable. You should specify a variable as SHARED when:

If neither condition is satisfied, a variable may be marked SHARED only if it is used within a CRITICAL construct (see CRITICAL / END CRITICAL) and the updating of, or reference to, the variable is not dependent on the order in which the iterations of the loop are executed. All variables, with the exception of loop-iteration variables, are SHARED by default.

If a SHARED variable, subobject of a SHARED variable, or an object associated with a SHARED variable or subobject of a SHARED variable appears as an actual argument in a reference to a non-intrinsic procedure:

unless the procedure reference appears in a CRITICAL construct.

While a DO loop is executed, a variable or subobject of a variable must not be referenced, become defined, become undefined, have its association status or allocation status changed, or appear as an actual argument:

The IF clause may appear at most once in a PARALLEL DO directive.

By default, a nested parallel loop is serialized, regardless of the setting of the IF clause. You can change this default by using the -qsmp=nested_par compiler option.

The SCHEDULE clause may appear at most once in a PARALLEL DO directive.

A variable name must not appear:

A variable in the PRIVATE clause must not:

Note that a variable or a subobject of a variable in the named_variable_list of the PRIVATE or LASTPRIVATE clause may have the POINTER attribute. Such a pointer has undefined association status on entry to the DO loop and undefined association status on exit from every iteration of the DO loop, except that it will retain its association status at the end of the last iteration if the variable appeared in the LASTPRIVATE clause. Also note that a variable name in the named_variable_list of the PRIVATE clause may be an allocatable array. It must not be allocated on initial entry to the DO loop and the user must allocate and deallocate the array in every iteration of the DO loop.
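
The allocatable-array rule above can be sketched as follows. This is a minimal illustration, not taken from the original examples: the names PRIVAL, W and A are hypothetical. The array W is unallocated on entry to the loop, and each iteration allocates and deallocates its own copy. Compiled without -qsmp, the directive is treated as a comment and the loop runs serially.

```fortran
      PROGRAM PRIVAL
      REAL, ALLOCATABLE :: W(:)
      REAL A(10)
      INTEGER I
!     W is not allocated on entry to the loop
!SMP$ PARALLEL DO PRIVATE(W)
      DO I = 1, 10
        ALLOCATE(W(100))
        W = REAL(I)
        A(I) = SUM(W)
!       Deallocate before the iteration ends
        DEALLOCATE(W)
      END DO
      PRINT *, A
      END PROGRAM PRIVAL
```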

A variable in the LASTPRIVATE clause must not:

If the last iteration of the DO loop does not define a LASTPRIVATE variable, the variable is undefined after the loop.

A variable in the REDUCTION clause must be of intrinsic type. A variable in the REDUCTION clause, or any element thereof, must not:

A variable which appears in the REDUCTION clause of an inner DO loop must also appear in the PRIVATE, LASTPRIVATE, or REDUCTION clause of all enclosing DO loops which have the PARALLEL DO directive specified, and of all enclosing PARALLEL SECTIONS constructs. This includes both the lexical extent and the dynamic extent of the PARALLEL DO directive and the PARALLEL SECTIONS construct. If the REDUCTION variable of an inner DO loop appears in the PRIVATE or LASTPRIVATE clause of an enclosing DO loop or PARALLEL SECTIONS construct, the variable must be initialized before the inner DO loop.

A REDUCTION variable must not appear in either a PRIVATE or LASTPRIVATE clause in the body of the following DO loop.

A variable that appeared in the REDUCTION clause of an INDEPENDENT directive of an enclosing DO loop must not also appear in the named_variable_list of the PRIVATE or LASTPRIVATE clause.

A variable in the SHARED clause must not:

Examples

Example 1: A valid example with the LASTPRIVATE clause.

!SMP$ PARALLEL DO PRIVATE(I), LASTPRIVATE (X)
      DO I = 1,10
        X = I * I
        A(I) = X * B(I)
      END DO
      PRINT *, X                     ! X has the value 100

Example 2: A valid example with the REDUCTION clause.

!SMP$ PARALLEL DO PRIVATE(I), REDUCTION(MYSUM)
      DO I = 1, 10
        MYSUM = MYSUM + IARR(I)
      END DO

Example 3: A valid example where a variable marked SHARED is accessed by more than one thread but is used only in a CRITICAL construct.

!SMP$ PARALLEL DO SHARED (X)
      DO I = 1, 10
        A(I) = A(I) * I
!SMP$   CRITICAL
          X = X + A(I)
!SMP$   END CRITICAL
      END DO

Example 4: An invalid example because the variable A appears in both the PRIVATE and the SHARED clauses.

!SMP$ PARALLEL DO PRIVATE(A), SHARED(A)
      DO I = 1,1000
        A(I) = I ** I
      END DO

Example 5: An invalid example because the SCHEDULE clause appears more than once.

!SMP$ PARALLEL DO SCHEDULE(GUIDED), SCHEDULE(STATIC, 100)
      DO I = 1, 1000
        A(I) = B(I) ** I
      END DO

Example 6: An invalid example because the REDUCTION clause specifies the division arithmetic operator as the reduction_op.

!SMP$ PARALLEL DO REDUCTION(/ : X)
      DO I = 1, 1000
        X = X / I
      END DO

Related Information

PARALLEL SECTIONS / END PARALLEL SECTIONS

Purpose

The PARALLEL SECTIONS construct allows you to define independent blocks of code which the compiler can execute concurrently. The PARALLEL SECTIONS construct includes a PARALLEL SECTIONS directive followed by one or more blocks of code delimited by the SECTION directive, and ends with an END PARALLEL SECTIONS directive.

The PARALLEL SECTIONS, SECTION and END PARALLEL SECTIONS directives only take effect if the -qsmp compiler option is specified.

Format



                      +-+--+----------------------------------+
                      | +--+                                  |
                      V                                       |
>>-PARALLEL SECTIONS----+-----------------------------------+-+-><
                        +-+----+--parallel_sections_clause--+
                          +-,--+
 

                        +-+--+----------------+
                        | +--+                |
                        V                     |
>>-+---------+---block----+-----------------+-+----------------><
   +-SECTION-+            +-SECTION--block--+
 

>>-END PARALLEL SECTIONS---------------------------------------><
 

where parallel_sections_clause is:



>>-+-IF--(--scalar_logical_expr--)-----------------------+-----><
   +-PRIVATE--(--named_variable_list--)------------------+
   +-REDUCTION--(-+----------+---named_variable_list--)--+
   |              +-op_fnc :-+                           |
   +-SHARED--(--named_variable_list--)-------------------+
 

IF(scalar_logical_expr)
Performs a run-time test to choose between executing the sections in serial or parallel. If scalar_logical_expr is true, then the sections are executed in parallel. Otherwise, the sections are executed serially.

PRIVATE(named_variable_list)
Each section has its own uninitialized local copy of the variables in named_variable_list.

REDUCTION([op_fnc :] named_variable_list)

op_fnc
is a reduction_op or a reduction_function
Variables in named_variable_list can appear only in reduction statements. See the syntax of reduction statements for more information. If op_fnc is specified for the REDUCTION clause, each variable in the named_variable_list must be a scalar. op_fnc must be specified if the directive uses the trigger_constant $OMP.

SHARED(named_variable_list)
All sections use the same copy of the variables specified in the named_variable_list.

Definitions:

lexical extent
The lexical extent of a PARALLEL SECTIONS construct includes the corresponding PARALLEL SECTIONS construct and the code that is enclosed in this construct.

dynamic extent
The dynamic extent of a PARALLEL SECTIONS construct includes the lexical extent and subprograms called from within the lexical extent.

Rules

The PARALLEL SECTIONS construct includes, as stated in the syntax above, the delimiting directives and the blocks of code they enclose. The rules below also refer to sections. A section is defined as the block of code within the delimiting directives.

The SECTION directive marks the beginning of a block of code. At least one SECTION and its block of code must appear within the PARALLEL SECTIONS construct. Note, however, that the SECTION directive does not have to be specified for the first section. The end of a block is delimited by either another SECTION directive or by the END PARALLEL SECTIONS directive.

The PARALLEL SECTIONS construct is used to specify parallel execution of the identified sections of code. No assumption is made about the order in which sections are executed. A section must not interfere with any other section in the construct unless the interference occurs within a CRITICAL construct. See the definition of interference outside a CRITICAL construct for more information.

It is illegal to branch into or out of any block of code defined by the PARALLEL SECTIONS construct.

Within a PARALLEL SECTIONS construct, variables not appearing in the PRIVATE clause are assumed to be SHARED by default.

A variable name must not appear:

While a PARALLEL SECTIONS construct is executing, a variable or subobject of a variable must not be referenced, become defined, become undefined, have its association status or allocation status changed, or appear as an actual argument:

The IF clause may appear at most once in a PARALLEL SECTIONS directive.

By default, a nested parallel loop is serialized, regardless of the setting of the IF clause. You can change this default by using the -qsmp=nested_par compiler option.

A variable should be specified with the PRIVATE attribute if it is referenced within multiple sections, defined before it is used within a section, and its value is not used after the section ends. Copies of the PRIVATE variable exist, locally, on each thread. Each section receives its own uninitialized copy of the PRIVATE variable. A PRIVATE variable has an undefined value or association status on entry to, and exit from, the PARALLEL SECTIONS construct. All iteration variables within the dynamic extent of the PARALLEL SECTIONS construct are given the PRIVATE attribute by default.

Local variables without the SAVE or STATIC attributes in referenced subprograms in the dynamic extent of a PARALLEL SECTIONS construct have an implicit PRIVATE attribute. Common blocks and modules in referenced subprograms in the dynamic extent of a PARALLEL SECTIONS construct have an implicit SHARED attribute, unless they are THREADLOCAL common blocks.

If there is a call to an MPI routine which does non-blocking communication in a PARALLEL SECTIONS construct, no arguments to the MPI routine should be PRIVATE.

If one of the entities involved in an asynchronous I/O operation is a PRIVATE variable or a subobject of a PRIVATE variable, the matching WAIT statement must be executed before the end of the section.

A variable in the PRIVATE clause must not:

Note that a variable in the named_variable_list of the PRIVATE clause may have the POINTER attribute. Such a pointer has undefined association status on entry to the PARALLEL SECTIONS construct and undefined association status on exit from every section of the PARALLEL SECTIONS construct. Also note that a variable name in the named_variable_list of the PRIVATE clause may be an allocatable array. It must not be allocated on initial entry to the PARALLEL SECTIONS construct and the user must allocate and deallocate the array in every section.

A variable which appears in the PRIVATE clause of an inner PARALLEL SECTIONS construct must also appear in the PRIVATE or LASTPRIVATE clause of all enclosing DO loops which have the PARALLEL DO directive and all enclosing PARALLEL SECTIONS constructs. This includes both the lexical extent and the dynamic extent of the PARALLEL DO directive and the PARALLEL SECTIONS construct.

In a PARALLEL SECTIONS construct, a variable which appeared in the REDUCTION clause of an INDEPENDENT directive or the PARALLEL DO directive of an enclosing DO loop must not also appear in the named_variable_list of the PRIVATE clause.

The REDUCTION clause specifies named variables that appear in reduction operations. The compiler will maintain local copies of such variables, but will combine them upon exit from the construct. The intermediate values of the REDUCTION variables are combined in random order, dependent on which threads finish their calculations first. There is, therefore, no guarantee that bit-identical results will be obtained from one parallel run to another, even if the parallel runs use the same number of threads and the same scheduling type and chunk size.

A variable in the REDUCTION clause must be of intrinsic type. A variable in the REDUCTION clause, or any element thereof, must not:

A variable which appears in the REDUCTION clause of an inner PARALLEL SECTIONS construct must also appear in the PRIVATE, LASTPRIVATE, or REDUCTION clause of all enclosing DO loops which have the PARALLEL DO directive and all enclosing PARALLEL SECTIONS constructs. This includes both the lexical extent and the dynamic extent of the PARALLEL DO directive and the PARALLEL SECTIONS construct. If the REDUCTION variable of the inner PARALLEL SECTIONS construct appears in the PRIVATE clause of an enclosing DO loop or PARALLEL SECTIONS construct, the variable must be initialized before the inner PARALLEL SECTIONS construct.

A REDUCTION variable must not appear in a PRIVATE or LASTPRIVATE clause in the body of the PARALLEL SECTIONS construct.

The SHARED clause specifies variables that must be available to all threads. If a variable is specified as SHARED, the user is stating that all sections can safely share a single copy of the variable. You should specify a variable as SHARED when:

If neither condition is satisfied, a variable may be marked SHARED only if it is used within a CRITICAL construct (see CRITICAL / END CRITICAL) and the updating of, or reference to, the variable does not depend on the order in which the sections are executed. All variables, with the exception of loop-iteration variables, are SHARED by default.

If a SHARED variable, subobject of a SHARED variable, or an object associated with a SHARED variable or subobject of a SHARED variable appears as an actual argument in a reference to a non-intrinsic procedure:

unless the procedure reference appears in a CRITICAL construct.

A variable in the SHARED clause must not:

The PARALLEL SECTIONS construct must not appear within a CRITICAL construct.

Examples

Example 1: In this example, note that a section of code need not contain a DO loop.

!SMP$ PARALLEL SECTIONS
!SMP$   SECTION
          DO I = 1, 10
            C(I) = MAX(A(I), A(I+1))
          END DO
!SMP$   SECTION
          W = U + V
          Z = X + Y
!SMP$ END PARALLEL SECTIONS

Example 2: In this example the index variable I is declared as PRIVATE. Note also that the first optional SECTION directive has been omitted.

!SMP$ PARALLEL SECTIONS PRIVATE(I)
          DO I = 1, 100
            A(I) = A(I) * I
          END DO
!SMP$   SECTION
          CALL NORMALIZE (B)
          DO I = 1, 100
            B(I) = B(I) + 1.0
          END DO
!SMP$   SECTION
          DO I = 1, 100
            C(I) = C(I) * C(I)
          END DO
!SMP$ END PARALLEL SECTIONS

Example 3: This example is invalid because there is a data dependency for the variable C across sections.

!SMP$ PARALLEL SECTIONS
!SMP$   SECTION
          DO I = 1, 10
            C(I) = C(I) * I
          END DO
!SMP$   SECTION
          DO K = 1, 10
            D(K) = C(K) + K
          END DO
!SMP$ END PARALLEL SECTIONS
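
As a further sketch, the following program uses the REDUCTION clause so that two sections can accumulate into the same variable. It is an illustration under assumptions, not one of the original examples: the names SECSUM, IA, IB and ITOTAL are hypothetical, and it assumes that + is an accepted reduction_op and that each assignment has the form of a reduction statement. Integer data is used so that the result (300) does not depend on the order in which the partial sums are combined.

```fortran
      PROGRAM SECSUM
      INTEGER IA(100), IB(100), ITOTAL
      IA = 1
      IB = 2
      ITOTAL = 0
!SMP$ PARALLEL SECTIONS REDUCTION(+ : ITOTAL)
!SMP$   SECTION
          ITOTAL = ITOTAL + SUM(IA)
!SMP$   SECTION
          ITOTAL = ITOTAL + SUM(IB)
!SMP$ END PARALLEL SECTIONS
      PRINT *, ITOTAL
      END PROGRAM SECSUM
```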

Related Information

PERMUTATION

Purpose

The PERMUTATION directive specifies that the elements of each array listed in the integer_array_name_list have no repeated values. This directive is useful when array elements are used as subscripts for other array references.

The PERMUTATION directive only takes effect if either the -qsmp or -qhot compiler option is specified.

Format



>>-PERMUTATION--(--integer_array_name_list--)------------------><
 

integer_array_name
is an integer array with no repeated values.

Rules

The first noncomment line (not including other directives) following the PERMUTATION directive must be a DO loop. This line cannot be an infinite DO or DO WHILE loop. The PERMUTATION directive applies only to the DO loop immediately following the directive and not to any nested DO loops.

Examples

       PROGRAM EX3
         INTEGER A(100), B(100)
         !SMP$  PERMUTATION (A)
         DO I = 1, 100
           A(I) = I
           B(A(I)) = B(A(I)) + A(I)
         END DO
       END PROGRAM EX3
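
Because the directive is an assertion that the compiler does not check, it can be useful to verify the no-repeated-values property yourself during testing. The following sketch uses a hypothetical helper function, NOREP, which is not part of XL Fortran. It reports .TRUE. for the first array and .FALSE. for the second, which contains the value 3 twice.

```fortran
      PROGRAM CHECKP
      INTEGER A(5), B(5)
      LOGICAL NOREP
      DATA A /3, 1, 5, 2, 4/
      DATA B /3, 1, 3, 2, 4/
      PRINT *, NOREP(A, 5)
      PRINT *, NOREP(B, 5)
      END PROGRAM CHECKP

!     Hypothetical helper: .TRUE. if V has no repeated values
      LOGICAL FUNCTION NOREP(V, N)
      INTEGER N, V(N), I, J
      NOREP = .TRUE.
      DO I = 1, N - 1
        DO J = I + 1, N
          IF (V(I) .EQ. V(J)) NOREP = .FALSE.
        END DO
      END DO
      END FUNCTION NOREP
```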

Related Information

@PROCESS

Purpose

You can specify compiler options to affect an individual compilation unit by putting the @PROCESS compiler directive in the source file. It can override options specified in the configuration file, in the default settings, or on the command line.

Format



             +-+---+-----------------------------+
             | +-,-+                             |
             V                                   |
>>-@PROCESS----option--+-----------------------+-+-------------><
                       +-(--suboption_list--)--+
 

option
is the name of a compiler option, without the -q

suboption
is a suboption of a compiler option

Rules

In fixed source form, @PROCESS can start in column 1 or after column 6. In free source form, the @PROCESS compiler directive can start in any column.

You cannot place a statement label or inline comment on the same line as an @PROCESS compiler directive.

By default, option settings you designate with the @PROCESS compiler directive are effective only for the compilation unit in which the statement appears. If the file has more than one compilation unit, the option setting is reset to its original state before the next unit is compiled. Trigger constants specified by the DIRECTIVE option are in effect until the end of the file (or until NODIRECTIVE is processed).

The @PROCESS compiler directive must usually appear before the first statement of a compilation unit. The only exceptions are when specifying SOURCE and NOSOURCE; you can put them in @PROCESS directives anywhere in the compilation unit.

Related Information

See the User's Guide for details on compiler options.

SCHEDULE

Purpose

The SCHEDULE directive allows the user to specify the chunking method for parallelization. Work is assigned to threads in a different manner depending on the scheduling type or chunk size used.

The SCHEDULE directive only takes effect if the -qsmp compiler option is specified.

Format



>>-SCHEDULE--(--sched_type--+-------+--)-----------------------><
                            +-,--n--+
 

n
n must be a positive specification expression. n must not be specified for the sched_type RUNTIME.

sched_type
is AFFINITY, DYNAMIC, GUIDED, RUNTIME or STATIC

Definitions:

CEILING
is an intrinsic procedure which returns the least integer greater than or equal to its argument. For more information, see CEILING (A).

number_of_iterations
is the number of iterations in the loop to be parallelized.

number_of_threads
is the number of threads used by the program.

AFFINITY
The iterations of a loop are initially divided into number_of_threads partitions, containing
CEILING(number_of_iterations / number_of_threads)
iterations. Each partition is initially assigned to a thread, and is then further subdivided into chunks containing n iterations, if n has been specified. If n has not been specified, then the chunks consist of
CEILING(number_of_iterations_remaining_in_partition / 2)
loop iterations.

When a thread becomes free, it takes the next chunk from its initially assigned partition. If there are no more chunks in that partition, then the thread takes the next available chunk from a partition initially assigned to another thread.

The work in a partition initially assigned to a sleeping thread will be completed by threads which are active.

DYNAMIC
If n has been specified, the iterations of a loop are divided into chunks containing n iterations each. If n has not been specified, then the chunks consist of
CEILING(number_of_iterations / number_of_threads)
iterations.

Threads are assigned these chunks on a "first-come, first-do" basis. Chunks of the remaining work are assigned to available threads, until all work has been assigned.

If a thread is asleep, its assigned work will be taken over by an active thread, once that thread becomes available.

GUIDED
If n has been specified, the iterations of a loop are divided into progressively smaller chunks until a minimum chunk size of n loop iterations is reached. If n has not been specified, the default value for n is 1 iteration.

The first chunk contains

CEILING(number_of_iterations / number_of_threads)
iterations. Subsequent chunks consist of
CEILING(number_of_iterations_remaining / number_of_threads)
iterations. Available threads are assigned chunks on a "first-come, first-do" basis. Chunks of the remaining work are assigned to available threads, until all work has been assigned.

If a thread is asleep, its assigned work will be taken over by an active thread, once that thread becomes available.

RUNTIME
Determine the scheduling type at run time.

At run time, the scheduling type can be specified using the environment variable XLSMPOPTS. If no scheduling type is specified using that variable, then the default scheduling type used is STATIC.

STATIC
If n has been specified, the iterations of a loop are divided into chunks containing n iterations. Each thread is assigned chunks in a "round robin" fashion. This is known as block cyclic scheduling. If the value of n is 1, then the scheduling type is specifically referred to as cyclic scheduling.

If n has not been specified, the chunks will contain

CEILING(number_of_iterations / number_of_threads)
iterations. Each thread is assigned one of these chunks. This is known as block scheduling.

If a thread is asleep and it has been assigned work, it will be awakened so that it may complete its work.

STATIC is the default scheduling type if the user has not specified any scheduling type at compile time or run time.

Rules

The SCHEDULE directive must appear in the specification part of a scoping unit.

Only one SCHEDULE directive may appear in the specification part of a scoping unit.

The SCHEDULE directive applies to

Any dummy arguments appearing or referenced in the specification expression for the chunk size n must also appear in the SUBROUTINE or FUNCTION statement and in all ENTRY statements appearing in the given subprogram.

If the specified chunk size n is greater than the number of iterations, the loop will not be parallelized and will execute on a single thread.

If you specify more than one method of determining the chunking algorithm, the compiler will follow, in order of precedence:

  1. SCHEDULE directive

  2. schedule suboption to the -qsmp compiler option. See "-qsmp Option" in the User's Guide

  3. XLSMPOPTS run-time option. See "XLSMPOPTS" in the User's Guide

  4. run-time default (that is, STATIC)
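
The placement rules can be sketched as follows. This is an illustration only: the names SCHDEMO, DBL and NCHUNK are hypothetical, and it assumes that the directive governs a PARALLEL DO loop in the same scoping unit that carries no SCHEDULE clause of its own. The SCHEDULE directive sits in the specification part, and the dummy argument NCHUNK used for the chunk size appears in the SUBROUTINE statement.

```fortran
      PROGRAM SCHDEMO
      REAL X(1000)
      X = 1.0
      CALL DBL(X, 1000, 100)
      PRINT *, X(1), X(1000)
      END PROGRAM SCHDEMO

      SUBROUTINE DBL(A, N, NCHUNK)
      INTEGER N, NCHUNK, I
      REAL A(N)
!     Specification part: one SCHEDULE directive per scoping unit
!SMP$ SCHEDULE(DYNAMIC, NCHUNK)
!SMP$ PARALLEL DO
      DO I = 1, N
        A(I) = 2.0 * A(I)
      END DO
      END SUBROUTINE DBL
```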

Examples

Example 1. Given the following information:

number of iterations = 1000
number of threads = 4
and using the GUIDED scheduling type, the chunk sizes would be as follows:
250 188 141 106 79 59 45 33 25 19 14 11 8 6 4 3 3 2 1 1 1 1
The iterations would then be divided into the following chunks:
chunk  1 = iterations    1 to  250
chunk  2 = iterations  251 to  438
chunk  3 = iterations  439 to  579
chunk  4 = iterations  580 to  685
chunk  5 = iterations  686 to  764
chunk  6 = iterations  765 to  823
chunk  7 = iterations  824 to  868
chunk  8 = iterations  869 to  901
chunk  9 = iterations  902 to  926
chunk 10 = iterations  927 to  945
chunk 11 = iterations  946 to  959
chunk 12 = iterations  960 to  970
chunk 13 = iterations  971 to  978
chunk 14 = iterations  979 to  984
chunk 15 = iterations  985 to  988
chunk 16 = iterations  989 to  991
chunk 17 = iterations  992 to  994
chunk 18 = iterations  995 to  996
chunk 19 = iterations  997 to  997
chunk 20 = iterations  998 to  998
chunk 21 = iterations  999 to  999
chunk 22 = iterations 1000 to 1000
A possible scenario for the division of work could be:
thread 1 executes chunks 1 5 10 13 18 20
thread 2 executes chunks 2 7  9 14 16 22
thread 3 executes chunks 3 6 12 15 19
thread 4 executes chunks 4 8 11 17 21
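
The chunk boundaries in this example can be reproduced with a short serial program. This is a sketch of the arithmetic only, not of the run-time scheduler. The variable names are hypothetical, and clamping each chunk to the minimum size NMIN is an assumption about how n is applied. With the values shown it lists the 22 chunks given above.

```fortran
      PROGRAM GUIDED
      INTEGER NITER, NTHR, NMIN, NLEFT, NCH, IFIRST, K
      NITER = 1000
      NTHR = 4
      NMIN = 1
      NLEFT = NITER
      IFIRST = 1
      K = 0
      DO WHILE (NLEFT .GT. 0)
!       CEILING(NLEFT / NTHR), computed in integer arithmetic
        NCH = (NLEFT + NTHR - 1) / NTHR
!       Assumed clamp: not below NMIN, not above the work left
        NCH = MIN(MAX(NCH, NMIN), NLEFT)
        K = K + 1
        PRINT *, 'chunk', K, '=', IFIRST, 'to', IFIRST + NCH - 1
        IFIRST = IFIRST + NCH
        NLEFT = NLEFT - NCH
      END DO
      END PROGRAM GUIDED
```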

Example 2. Given the following information:

number of iterations = 100
number of threads = 4
and using the AFFINITY scheduling type, the iterations would be divided into the following partitions:
partition 1 = iterations  1 to  25
partition 2 = iterations 26 to  50
partition 3 = iterations 51 to  75
partition 4 = iterations 76 to 100
The partitions would be divided into the following chunks:
chunk 1a = iterations   1 to  13
chunk 1b = iterations  14 to  19
chunk 1c = iterations  20 to  22
chunk 1d = iterations  23 to  24
chunk 1e = iterations  25 to  25
 
chunk 2a = iterations  26 to  38
chunk 2b = iterations  39 to  44
chunk 2c = iterations  45 to  47
chunk 2d = iterations  48 to  49
chunk 2e = iterations  50 to  50
 
chunk 3a = iterations  51 to  63
chunk 3b = iterations  64 to  69
chunk 3c = iterations  70 to  72
chunk 3d = iterations  73 to  74
chunk 3e = iterations  75 to  75
 
chunk 4a = iterations  76 to  88
chunk 4b = iterations  89 to  94
chunk 4c = iterations  95 to  97
chunk 4d = iterations  98 to  99
chunk 4e = iterations 100 to 100
A possible scenario for the division of work could be:
thread 1 executes chunks 1a 1b 1c 1d 1e 4d
thread 2 executes chunks 2a 2b 2c 2d
thread 3 executes chunks 3a 3b 3c 3d 3e 2e
thread 4 executes chunks 4a 4b 4c 4e
Note that in this scenario, thread 1 finished executing all the chunks in its partition and then grabbed an available chunk from the partition of thread 4. Similarly, thread 3 finished executing all the chunks in its partition and then grabbed an available chunk from the partition of thread 2.
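
The partition and chunk boundaries in this example follow directly from the two CEILING formulas given for AFFINITY. The following serial sketch (variable names are hypothetical, and no chunk size n is specified) prints the five chunks of each of the four partitions. It does not model which thread executes which chunk.

```fortran
      PROGRAM AFFIN
      INTEGER NITER, NTHR, NPART, IP, NLEFT, NCH, IFIRST, ILAST
      NITER = 100
      NTHR = 4
!     Partition size: CEILING(NITER / NTHR)
      NPART = (NITER + NTHR - 1) / NTHR
      DO IP = 1, NTHR
        IFIRST = (IP - 1) * NPART + 1
        ILAST = MIN(IP * NPART, NITER)
        NLEFT = ILAST - IFIRST + 1
        DO WHILE (NLEFT .GT. 0)
!         Chunk size: CEILING(NLEFT / 2)
          NCH = (NLEFT + 1) / 2
          PRINT *, 'partition', IP, ':', IFIRST, 'to', IFIRST+NCH-1
          IFIRST = IFIRST + NCH
          NLEFT = NLEFT - NCH
        END DO
      END DO
      END PROGRAM AFFIN
```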

Example 3. Given the following information:

number of iterations = 1000
number of threads = 4
and using the DYNAMIC scheduling type and chunk size of 100, the chunk sizes would be as follows:
100 100 100 100 100 100 100 100 100 100
The iterations would be divided into the following chunks:
chunk  1 = iterations   1 to  100
chunk  2 = iterations 101 to  200
chunk  3 = iterations 201 to  300
chunk  4 = iterations 301 to  400
chunk  5 = iterations 401 to  500
chunk  6 = iterations 501 to  600
chunk  7 = iterations 601 to  700
chunk  8 = iterations 701 to  800
chunk  9 = iterations 801 to  900
chunk 10 = iterations 901 to 1000
A possible scenario for the division of work could be:
thread 1 executes chunks 1  5  9
thread 2 executes chunks 2  8
thread 3 executes chunks 3  6  10
thread 4 executes chunks 4  7

Example 4. Given the following information:

number of iterations = 100
number of threads = 4
and using the STATIC scheduling type, the iterations would be divided into the following chunks:
chunk 1 = iterations  1 to  25
chunk 2 = iterations 26 to  50
chunk 3 = iterations 51 to  75
chunk 4 = iterations 76 to 100
A possible scenario for the division of work could be:
thread 1 executes chunk 1
thread 2 executes chunk 2
thread 3 executes chunk 3
thread 4 executes chunk 4
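
When a chunk size n is specified for STATIC, the chunks are dealt out in round-robin (block cyclic) fashion instead. The following serial sketch lists that assignment for 100 iterations, 4 threads and n = 10. The variable names are hypothetical, and starting the rotation with thread 1 is an assumption made for illustration. Under that assumption thread 1 receives chunks 1, 5 and 9, thread 2 receives chunks 2, 6 and 10, thread 3 receives chunks 3 and 7, and thread 4 receives chunks 4 and 8.

```fortran
      PROGRAM STATRR
      INTEGER NITER, NTHR, N, K, IFIRST, ILAST, ITHR
      NITER = 100
      NTHR = 4
      N = 10
      K = 0
      DO IFIRST = 1, NITER, N
        ILAST = MIN(IFIRST + N - 1, NITER)
!       Assumed round-robin order: chunk 1 to thread 1, and so on
        ITHR = MOD(K, NTHR) + 1
        K = K + 1
        PRINT *, 'chunk', K, ':', IFIRST, 'to', ILAST, 'thread', ITHR
      END DO
      END PROGRAM STATRR
```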

Related Information

SOURCEFORM

Purpose

The SOURCEFORM compiler directive indicates that all subsequent lines are to be processed in the specified source form until the end of the file is reached or until an @PROCESS directive or another SOURCEFORM directive specifies a different source form.

Format



>>-SOURCEFORM--(--source--)------------------------------------><
 

source
is one of the following: FIXED, FIXED(right_margin), FREE(F90), FREE(IBM), or FREE. FREE defaults to FREE(F90).

right_margin
is an unsigned integer specifying the column position of the right margin. The default is 72. The maximum is 132.

Rules

The SOURCEFORM directive can appear anywhere within a file. An include file is compiled with the source form of the including file. If the SOURCEFORM directive appears in an include file, the source form reverts to that of the including file once processing of the include file is complete.

The SOURCEFORM directive cannot specify a label.

Tip

To modify your existing files to Fortran 90 free source form where include files exist:

  1. Convert your include files to Fortran 90 free source form: add a SOURCEFORM directive to the top of each include file. For example:
    !CONVERT* SOURCEFORM (FREE(F90))
    
    Define your own trigger_constant for this conversion process.

  2. Once all the include files are converted, convert the .f files. Add the same SOURCEFORM directive to the top of each file, or ensure the .f file is compiled with -qfree=f90.

  3. Once all files have been converted, you can disable the processing of the directives with the -qnodirective compiler option. Ensure that -qfree=f90 is used at compile time. You may also delete any unnecessary SOURCEFORM directives.

Examples

@PROCESS DIRECTIVE(CONVERT*)
      PROGRAM MAIN          ! Main program not yet converted
      A=1; B=2
      INCLUDE 'freeform.f'
      PRINT *, RESULT       ! Reverts to fixed form
      END

where file freeform.f contains:

!CONVERT* SOURCEFORM(FREE(F90))
RESULT = A + B

THREADLOCAL

Purpose

The THREADLOCAL directive is used to declare thread-specific common data. It is a possible method of ensuring that access to data contained within COMMON blocks is serialized.

In order to make use of this directive it is not necessary to specify the -qsmp compiler option, but the invocation command must be xlf_r or xlf90_r to link the necessary libraries.

Format



                         +-,------------------------+
                         V                          |
>>-THREADLOCAL--+-----+----/--common_block_name--/--+----------><
                +-::--+
 

Rules

Only named common blocks may be declared as THREADLOCAL. All rules and constraints that normally apply to named common blocks apply to common blocks declared as THREADLOCAL. See COMMON for more information on the rules and constraints that apply to named common blocks.

The THREADLOCAL directive must appear in the specification_part of the scoping unit. If a common block appears in a THREADLOCAL directive, it must also be declared within a COMMON statement in the same scoping unit. The THREADLOCAL directive may occur before or after the COMMON statement. See "Main Program" for more information on the specification_part of the scoping unit.

A common block cannot be given the THREADLOCAL attribute if it is declared within a PURE subprogram.

Members of a THREADLOCAL common block must not appear in NAMELIST statements.

A common block which is use-associated must not be declared as THREADLOCAL in the scoping unit that contains the USE statement.

Any pointers declared in a THREADLOCAL common block are not affected by the -qinit=f90ptr compiler option.

Objects within THREADLOCAL common blocks may be used in parallel loops and parallel sections. However, these objects are implicitly shared across the iterations of the loop, and across code blocks within parallel sections. In other words, within a scoping unit, all accessible common blocks, whether declared as THREADLOCAL or not, have the SHARED attribute within parallel loops and sections in that scoping unit.

If a common block is declared as THREADLOCAL within a scoping unit, any subprogram that declares or references the common block, and that is directly or indirectly referenced by the scoping unit, must be executed by the same thread executing the scoping unit. If two procedures that declare common blocks are executed by different threads, then they would obtain different copies of the common block, provided that the common block had been declared THREADLOCAL. Threads can be created in one of the following ways:

If a common block is declared to be THREADLOCAL in one scoping unit, it must be declared to be THREADLOCAL in every scoping unit that declares the common block.

If a THREADLOCAL common block, that does not have the SAVE attribute, is declared within a subprogram, the members of the block become undefined at subprogram RETURN or END unless there is at least one other scoping unit in which the common block is accessible that is making a direct or indirect reference to the subprogram.

Examples

Example 1: The following procedure "FORT_SUB" is invoked by two threads:

SUBROUTINE FORT_SUB(IARG)
  INTEGER IARG
 
  CALL LIBRARY_ROUTINE1()
  CALL LIBRARY_ROUTINE2()
  ...
END SUBROUTINE FORT_SUB
SUBROUTINE LIBRARY_ROUTINE1()
  COMMON /BLOCK/ R               ! The SAVE attribute is required for the common
  SAVE /BLOCK/                   ! block because the program requires that the block
  !IBM* THREADLOCAL /BLOCK/      ! remain defined after library_routine1 is invoked.
 
  R = 1.0
    ...
END SUBROUTINE LIBRARY_ROUTINE1
SUBROUTINE LIBRARY_ROUTINE2()
  COMMON /BLOCK/ R
  SAVE /BLOCK/
  !IBM* THREADLOCAL /BLOCK/
 
  ... = R
  ...
END SUBROUTINE LIBRARY_ROUTINE2

Example 2: "FORT_SUB" is invoked by multiple threads. This is an invalid example because "FORT_SUB" and "ANOTHER_SUB" both declare /BLOCK/ to be THREADLOCAL. They intend to share the common block, but they are executed by different threads.

SUBROUTINE FORT_SUB()
  COMMON /BLOCK/ J
  INTEGER :: J
  !IBM* THREADLOCAL /BLOCK/        ! Each thread executing FORT_SUB
                                   ! obtains its own copy of /BLOCK/
  INTEGER A(10)
 
  ...
  !IBM* INDEPENDENT
  DO INDEX = 1,10
    CALL ANOTHER_SUB(A(INDEX))
  END DO
  ...
 
END SUBROUTINE FORT_SUB
SUBROUTINE ANOTHER_SUB(AA)         ! Multiple threads are used to execute ANOTHER_SUB
  INTEGER AA
  COMMON /BLOCK/ J                 ! Each thread obtains a new copy of the
  INTEGER :: J                     !  common block /BLOCK/
  !IBM* THREADLOCAL /BLOCK/
  ...
  AA = J                           ! The value of 'J' is undefined.
END SUBROUTINE ANOTHER_SUB

Related Information

