Write up on tech geek history: Fortran, the DIMENSION statement

Introduction:

Overview

Among the additions to FORTRAN 77 in this International Standard, seven stand out as the major ones:

(1) Array operations

(2) Improved facilities for numerical computation

(3) Parameterized intrinsic data types

(4) User-defined data types

(5) Facilities for modular data and procedure definitions

(6) Pointers

(7) The concept of language evolution

A number of other additions are also included in this International Standard, such as improved source form facilities, more control constructs, recursion, additional input/output facilities, and dynamically allocatable arrays.

Array operations

Computation involving large arrays is an important part of engineering and scientific computing. Arrays may be used as entities in Fortran. Operations for processing whole arrays and subarrays (array sections) are included in the language for two principal reasons:

(1) these features provide a more concise and higher level language that will allow programmers more quickly and reliably to develop and maintain scientific/engineering applications, and

(2) these features can significantly facilitate optimization of array operations on many computer architectures.

The FORTRAN 77 arithmetic, logical, and character operations and intrinsic (predefined) functions are extended to operate on array-valued operands. The array extensions include whole, partial, and masked array assignment, array-valued constants and expressions, and facilities to define user-supplied array-valued functions. New intrinsic procedures are provided to manipulate and construct arrays, to perform gather/scatter operations, and to support extended computational capabilities involving arrays. For example, an intrinsic function is provided to sum the elements of an array.

Parameterized character data type

Optional facilities for multibyte character data for languages with large character sets, such as those in China and Japan, are added by using a kind parameter for the character data type. This facility allows additional character sets for special purposes as well, such as characters for mathematics, chemistry, or music.

Derived types

"Derived type" is the term given to that set of features in this International Standard that allows the programmer to define arbitrary data structures and operations on them. Data structures are user-defined aggregations of intrinsic and derived data types. Intrinsic uses of structured objects include assignment, input/output, and use as procedure arguments. With no additional derived-type operations defined by the user, the derived data type facility is a simple data structuring mechanism. With additional operation definitions, derived types provide an effective implementation mechanism for data abstractions.

Procedure definitions may be used to define operations on intrinsic or derived types and nonintrinsic assignments for intrinsic and derived types.

Modular definitions

In FORTRAN 77, there was no way to define a global data area in only one place and have all the program units in an application use that definition.

In addition, the ENTRY statement is awkward and restrictive for implementing a related set of procedures, possibly involving common data objects. Finally, there was no means in FORTRAN 77 by which procedure definitions, especially interface information, could be made known locally to a program unit. These and other deficiencies are remedied by a new type of program unit that may contain any combination of data object declarations, derived-type definitions, procedure definitions, and procedure interface information.

This program unit, called a module, may be considered to be a generalization and replacement for the block data program unit. A module may be accessed by any program unit, thereby making the module contents available to that program unit. Thus, modules provide improved facilities for defining global data areas, procedure packages, and encapsulated data abstractions.

Pointers

Pointers allow arrays to be sized dynamically and rearranged, and structures to be linked to create lists, trees, and graphs. An object of any intrinsic or derived type may be declared to have the pointer attribute. Once such an object becomes associated with a target, it may appear almost anywhere a nonpointer object with the same type, type parameters, and shape may appear.

Language evolution

With the addition of new facilities, certain old features become redundant and may eventually be phased out of the language as their usage declines. For example, the numeric facilities alluded to above provide the functionality of double precision; with the new array facilities, nonconformable argument association (such as associating an array element with a dummy array) is unnecessary (and in fact is not useful as an array operation); and block data program units are redundant and inferior to modules.

Significance of Study

Co-Array (Dimension Statements)

A Co-Array Fortran program executes as if it were replicated a number of times, the number of replications remaining fixed during execution of the program. Each copy is called an image and each image executes asynchronously. A particular implementation of Co-Array Fortran may permit the number of images to be chosen at compile time, at link time, or at execute time. The number of images may be the same as the number of physical processors, or it may be more, or it may be less. The programmer may retrieve the number of images at run time by invoking the intrinsic function num_images().

Images are indexed starting from one and the programmer may retrieve the index of the invoking image through the intrinsic function this_image(). The programmer controls the execution sequence in each image through explicit use of Fortran 95 control constructs and through explicit use of an intrinsic synchronization procedure.

1.2 Specifying data objects

Each image has its own set of data objects, all of which may be accessed in the normal Fortran way. Some objects are declared with co-dimensions in square brackets immediately following dimensions in parentheses or in place of them, for example:

    real, dimension(20)[20,*] :: a
    real :: c[*], d[*]
    character :: b(20)[20,0:*]
    integer :: ib(10)[*]
    type(interval) :: s
    dimension :: s[20,*]

Unless the array is allocatable (Section 3.6), the form for the dimensions in square brackets is the same as that for the dimensions in parentheses for an assumed-size array. The set of objects on all the images is itself an array, called a co-array, which can be addressed with array syntax using subscripts in square brackets following any subscripts in parentheses (round brackets),

for example:

    a(5)[3,7] = ib(5)[3]
    d[3] = c
    a(:)[2,3] = c[1]

We call any object whose designator includes square brackets a co-array subobject; it may be a co-array element, a co-array section, or a co-array structure component. The subscripts in square brackets are mapped to images in the same way as Fortran array subscripts in parentheses are mapped to memory locations in a Fortran 95 program.
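The mapping from square-bracket subscripts to image indices can be illustrated outside Fortran. The Python sketch below assumes the same column-major rule that Fortran applies to ordinary array elements, with image indices starting at one. The function name image_index and its parameters are hypothetical, chosen only for this illustration, and are not part of Co-Array Fortran.

```python
def image_index(lower_bounds, upper_bounds, cosubscripts):
    """Map co-subscripts to a 1-based image index in column-major order.

    lower_bounds: lower co-bound of every co-dimension.
    upper_bounds: upper co-bound of every co-dimension except the last,
                  which is declared as '*' and therefore has no upper bound.
    cosubscripts: the subscripts written in square brackets.
    """
    if len(upper_bounds) != len(lower_bounds) - 1:
        raise ValueError("the last co-dimension must have no upper bound")
    if len(cosubscripts) != len(lower_bounds):
        raise ValueError("one co-subscript is needed per co-dimension")

    index = 1    # image indices commence at one
    stride = 1   # number of images spanned by one step in this co-dimension
    for dim, (sub, low) in enumerate(zip(cosubscripts, lower_bounds)):
        if sub < low:
            raise ValueError("co-subscript below its lower bound")
        if dim < len(upper_bounds) and sub > upper_bounds[dim]:
            raise ValueError("co-subscript above its upper bound")
        index += (sub - low) * stride
        if dim < len(upper_bounds):
            stride *= upper_bounds[dim] - low + 1
    return index


# a is declared with co-dimensions [20,*]; which image holds a(5)[3,7]?
image_for_a = image_index((1, 1), (20,), (3, 7))

# b is declared with co-dimensions [20,0:*]; which image holds b(:)[1,0]?
image_for_b = image_index((1, 0), (20,), (1, 0))
```

Under these assumptions the reference a(5)[3,7] resolves to image 123, and b(:)[1,0] resolves to image 1, because the lower bound of zero in the second co-dimension does not shift the sequence of image indices.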

The subscripts within an array that correspond to data for the current image are available from the intrinsic this_image with the co-array name as its argument.

Note: On a shared-memory machine, we expect a co-array to be implemented as if it were an array of higher rank. The implementation would need to support the declaration of arrays of rank up to 14. On a distributed-memory machine with one physical processor for each image, a co-array may be stored from the same memory address in each physical processor. On any machine, a co-array may be implemented in such a way that each image can calculate the memory address of an element on any other image.

The rank, extents, size, and shape of a co-array or co-array subobject are given as for Fortran 95 except that we include both the data in parentheses and the data in square brackets. The local rank, local extents, local size, and local shape are given by ignoring the data in square brackets. The co-rank, co-extents, co-size, and co-shape are given from the data in square brackets. For example, given the co-array declared thus

    real, dimension(10,20)[20,5,*] :: a

a(:,:)[:,:,1:15] has rank 5, local rank 2, co-rank 3, shape (/10,20,20,5,15/), local shape (/10,20/), and co-shape (/20,5,15/).
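The bookkeeping in this example can be reproduced with a few lines of Python. This is only a sketch of the definitions quoted above: the helper describe_coarray is a hypothetical name, and it takes the local shape and co-shape of the subobject as plain tuples rather than parsing Fortran syntax.

```python
def describe_coarray(local_shape, co_shape):
    """Combine the parenthesized and square-bracket extents of a subobject."""
    local_shape = tuple(local_shape)
    co_shape = tuple(co_shape)
    return {
        "rank": len(local_shape) + len(co_shape),  # both sets of extents count
        "local_rank": len(local_shape),            # parentheses only
        "co_rank": len(co_shape),                  # square brackets only
        "shape": local_shape + co_shape,
        "local_shape": local_shape,
        "co_shape": co_shape,
    }


# The subobject a(:,:)[:,:,1:15] of a co-array declared (10,20)[20,5,*]
info = describe_coarray((10, 20), (20, 5, 15))
```

For the subobject in the text, info reports rank 5, local rank 2, co-rank 3, and the shape (10, 20, 20, 5, 15).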

The co-size of a co-array is always equal to the number of images. If the co-rank is one, the co-array has a co-extent equal to the number of images and it has co-shape (/num_images()/). If the co-rank is greater than one, the co-array has no final extent, no final upper bound, and no co-shape (and hence no shape).

Note: We considered defining the final extent when the co-rank is greater than one as the number of images divided by the product of the other extents, truncating towards zero. We reject this, since it means, for example, that a(:,:)[:,:,:] would not always refer to the whole declared co-array.

The local rank and the co-rank are each limited to seven. The syntax automatically ensures that these are the same on all images.

The rank of a co-array subobject (sum of local rank and co-rank) must not exceed seven. Note: The reason for the limit of seven on the rank of a co-array subobject is that we expect early implementations to make a temporary local copy of the array and then rely on ordinary Fortran 95 mechanisms. For a co-array subobject, square brackets may never precede parentheses. Note: For clarity, we recommend that subscripts in parentheses are employed whenever the parent has nonzero local rank.

For example, a[:] is not as clear as a(:)[:].

A co-array must have the same bounds (and hence the same extents) on all images. For example, the subroutine

    subroutine solve(n,a,b)
    integer :: n
    real :: a(n)[*], b(n)

must not be called on one image with n having the value 1000 and on another with n having the value 1001.

A co-array may be allocatable:

    subroutine solve(n,a,b)
    integer :: n
    real :: a(n)[*], b(n)
    real, allocatable :: work(:)[:]

Allocatable arrays are discussed in Section 3.6. There is no mechanism for assumed-co-shape arrays (but see Appendix 1, which describes a possible extension).

A co-array is not permitted to be a pointer (but see Appendix 2, which describes another possible extension). Automatic co-arrays are not permitted; for example, the co-array work in the above code fragment is not permitted to be declared thus

    subroutine solve(n,a,b)
    integer :: n
    real :: a(n)[*], b(n)
    real :: work(n)[*]   ! Not permitted

Note: Were automatic co-arrays permitted, for example, in a future revision of the language, they would pose problems to implementations over consistent memory addressing among images. It would probably be necessary to require image synchronization, both before and after memory is allocated on entry and both before and after memory is deallocated on return.

A co-array is not permitted to be a constant.

Note: This restriction is not necessary, but the feature would be useless since each image would hold exactly the same value. We see no point in insisting that vendors implement such a feature.

A DATA statement initializes only local data. Therefore, co-array subobjects are not permitted in DATA statements. For example:

    real :: a(10)[*]
    data a(1) /0.0/      ! Permitted
    data a(1)[2] /0.0/   ! Not permitted

Unless it is allocatable or a dummy argument, a co-array always has the SAVE attribute.

The image indices of a co-array always form a sequence, without any gaps, commencing at one. This is true for any lower bounds. For example, for the array declared as

    real :: a(10,20)[20,0:5,*]

a(:,:)[1,0,1] refers to the rank-two array a(:,:) in image one.

Note: If a large array is needed on a subset of images, it is wasteful of memory to specify it directly as a co-array. Instead, it should be specified as a pointer component of a co-array and allocated only on the images on which it is needed (we expect to use an allocatable component in Fortran 2000).

Co-arrays may be of derived type but components of derived types are not permitted to be co-arrays.

Note: Were we to allow co-array components, we would be confronted with references such as z[p]%x[q]. A logical way to read such an expression would be: go to image p and find component x on image q.

https://web.cs.ucla.edu/~palsberg/course/cs239/papers/numrich.pdf

Co-arrays were designed to answer the question ‘What is the smallest change required to convert Fortran into a robust and efficient parallel language?’. Our answer is a simple syntactic extension. It looks and feels like Fortran and requires Fortran programmers to learn only a few new rules. These rules are related to two fundamental issues that any parallel programming model must resolve, work distribution and data distribution. First, consider work distribution. The co-array extension adopts the Single-Program-Multiple-Data (SPMD) programming model. A single program is replicated a fixed number of times, each replication having its own set of data objects.

Each replication of the program is called an image. The number of images may be the same as the number of physical processors, or it may be more, or it may be less. A particular implementation may permit the number of images to be chosen at compile time, at link time, or at execute time. Each image executes asynchronously and the normal rules of Fortran apply. The execution sequence may differ from image to image as specified by the programmer who, with the help of a unique image index, determines the actual path using normal Fortran control constructs and explicit synchronizations. For code between synchronizations, the compiler is free to use all its normal optimization techniques as if only one image were present. At least in early implementations, it is expected that each image will execute the same executable code (.o or .exe file) on identical hardware.

Second, consider data distribution. The co-array extension allows the programmer to express data distribution by specifying the relationship among memory images in a syntax very much like normal Fortran array syntax. Objects with the new syntax have an important property: as well as having access to the local object, each image may access the corresponding object on any other image. For example, the statement

    real, dimension(1000)[*] :: x,y

declares two objects x and y, each as a co-array.

A co-array always has the same shape on each image. In this example, each image has two real co-arrays of size 1000. If an image executes the statement:

    x(:) = y(:)[q]

the co-array y on image q is copied into co-array x on the executing image. Array indices in parentheses follow the normal Fortran rules within one image.
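The effect of a remote read such as x(:) = y(:)[q] can be mimicked with a small Python model. This sketch makes several simplifying assumptions that are not part of the language: each image is represented by a dictionary holding its own x and y, the images run one after another rather than asynchronously, and the name run_images and its parameters are hypothetical.

```python
def run_images(num_images, q, n=1000):
    """Model num_images images that each execute x(:) = y(:)[q]."""
    # Every image owns its own copy of x and y, all with the same shape.
    # y is filled with the owning image's index so that copies are traceable.
    images = [
        {"x": [0.0] * n, "y": [float(me)] * n}
        for me in range(1, num_images + 1)
    ]

    for me in range(1, num_images + 1):   # image indices start at one
        local = images[me - 1]            # no square brackets: local object
        remote = images[q - 1]            # [q]: the object held by image q
        local["x"][:] = remote["y"][:]    # x(:) = y(:)[q]
    return images


images = run_images(num_images=4, q=2, n=5)
```

After the call, x on every modeled image holds the values that image 2 stored in y, while each image's own y is unchanged. A real implementation would also need synchronization to guarantee that y on image q has been defined before other images read it.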

Co-array indices in square brackets provide an equally convenient notation for accessing an object on another image. Bounds in square brackets in co-array declarations follow the rules of assumed-size arrays since a co-array always exists on all the images. The upper bound for the last co-dimension is never specified, which allows the programmer to write code without knowing the number of images the code will eventually use. The programmer uses co-array syntax only where it is needed. A reference to a co-array with no square brackets attached to it is a reference to the object in the memory of the executing image.

On a shared-memory machine, a co-array on an image and the corresponding co-arrays on other images may be implemented as a sequence of arrays with evenly spaced starting addresses. On a distributed-memory machine with one physical processor for each image, a co-array may be stored from the same virtual address in each physical processor. On any machine, a co-array may be implemented in such a way that each image can calculate the virtual address of an element on another image relative to the array start address on that other image. An implementation might arrange for each co-array to be stored from the same virtual address in each image, but this is not required. Because co-arrays are integrated into the language, remote references automatically gain the services of Fortran’s basic data capabilities, including the typing system and automatic type conversions in assignments, information about structure layout, and even object-oriented features. The co-array feature adopted by WG5 was formerly known as Co-Array Fortran, an informal extension to Fortran 95 by Numrich and Reid (1998).

Co-Array Fortran itself was formerly known as F--, which evolved from a simple programming model for the CRAY-T3D described only in internal Technical Reports at Cray Research in the early 1990s. The first informal definition (Numrich 1997) was restricted to the Fortran 77 language and used a different syntax to represent co-arrays. It was extended to Fortran 90 by Numrich and Steidel (1997) and defined more precisely for Fortran 95 by Numrich and Reid (1998). Portions of Co-Array Fortran have been incorporated into the Cray Fortran compiler and various applications have been converted to the syntax (see, for example, Numrich, Reid, and Kim 1998, Numrich 2005a, and Numrich 2005b).

A portable compiling system for a subset of the extension has been implemented by Dotsenko, Coarfa, and Mellor-Crummey (2004). It is called cafc and performs source-to-source transformation of co-array code to Fortran 90 augmented with communication operations. One instantiation uses the Aggregate Remote Memory Copy Interface (ARMCI) library for one-sided communication (Nieplocha and Carpenter 1999) and another uses GASNet (Bonachea 2002). Experience with the use of cafc is related by Coarfa, Dotsenko, Mellor-Crummey, Cantonnet, El-Ghazawi, Mohanti, Yao, and Chavarría-Miranda (2005).

They found that on several platforms cafc gave performance comparable with MPI on the NAS MG, CG, SP, and BT parallel benchmarks (Bailey, Harris, Saphir, van der Wijngaart, Woo, and Yarrow 1995). Reid (2005) proposed that co-arrays be included in the revision of Fortran that is planned for 2008 (which we will call Fortran 2008 in this report). The ISO Fortran Committee agreed to include co-arrays in May 2005, but made some changes and further changes have been made since then by J3, the Primary Development Body for Fortran. The rest of this article contains a complete description of the proposal as it now stands. For mor

https://wg5-fortran.org/N1701-N1750/N1708.pdf

Dimension Statements

Whenever a subscripted variable appears in a FORTRAN program, it is necessary to include a statement which indicates the size of the array referred to by this variable. The DIMENSION statement permits the 704 to assign the proper number of storage locations to each subscripted variable.

The DIMENSION statement consists of the name of each subscripted variable followed by an integer in parentheses which represents the greatest number of elements which will ever be included in the array. The variables are separated by commas and the whole group is preceded by the word DIMENSION.

If the subscripted variables ALPHA(I), GAMMA(J), and VECTOR(N) had appeared in a FORTRAN program, a DIMENSION statement mentioning these variables would have to be included. Assuming that the number of elements in ALPHA(I) will never exceed 100, the number in GAMMA(J) will never exceed 25, and in VECTOR(N) will never exceed 12, then the DIMENSION statement would be written as DIMENSION ALPHA(100), GAMMA(25), VECTOR(12). DIMENSION statements are not actually executed. No instructions corresponding to this statement will appear in the translated program.

However, a DIMENSION statement giving the size of each array should precede the first executable statement mentioning that array. A single DIMENSION statement, including all subscripted variables mentioned in the program, may be used or separate statements may be inserted prior to mentioning each new array.

https://www.softwarepreservation.org/projects/FORTRAN/manual/Intro-Section_II.pdf

The DIMENSION statement provides the information necessary to allocate storage in the object program for arrays. Each variable which appears in subscripted form in a program or subprogram must appear in a DIMENSION statement of that program or subprogram; the DIMENSION statement must precede the first appearance of that variable.

The DIMENSION statement lists the maximum dimensions of arrays; in the object program references to these arrays must never exceed the specified dimensions. For example, a statement such as DIMENSION B(5,15) indicates that B is a 2-dimensional array for which the subscripts never exceed 5 and 15. The DIMENSION statement therefore causes 75 (i.e., 5 x 15) storage locations to be set aside for the array B.
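The storage arithmetic described here is easy to check with a short Python sketch. It assumes that each array needs one storage location per element, so the count is simply the product of the listed dimensions. The function dimension_storage is a hypothetical helper written for this illustration. It reads only the simple form of the statement quoted in these manuals and says nothing about how the compiler actually laid arrays out in memory.

```python
import math
import re


def dimension_storage(statement):
    """Return {array name: storage locations} for a DIMENSION statement."""
    text = statement.strip()
    keyword = "DIMENSION"
    if not text.upper().startswith(keyword):
        raise ValueError("not a DIMENSION statement")
    body = text[len(keyword):]

    storage = {}
    # Each entry looks like NAME(d1) or NAME(d1,d2,...), separated by commas.
    for name, dims in re.findall(r"([A-Za-z][A-Za-z0-9]*)\s*\(([^)]*)\)", body):
        extents = [int(d) for d in dims.split(",")]
        storage[name] = math.prod(extents)  # one location per element
    return storage


vectors = dimension_storage("DIMENSION ALPHA(100), GAMMA(25), VECTOR(12)")
matrix = dimension_storage("DIMENSION B(5,15)")
```

With these inputs, the three one-dimensional arrays need 100, 25, and 12 locations respectively, and B needs 75, matching the figure given in the text.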

A single DIMENSION statement may specify the dimensions of any number of arrays. A program must not contain a DIMENSION statement which includes the name of the program itself, or any program which it calls.

https://archive.computerhistory.org/resources/text/Fortran/102663112.05.01.acc.pdf

References

https://web.cs.ucla.edu/~palsberg/course/cs239/papers/numrich.pdf

https://wg5-fortran.org/N1701-N1750/N1708.pdf

https://www.softwarepreservation.org/projects/FORTRAN/manual/Intro-Section_II.pdf

https://archive.computerhistory.org/resources/text/Fortran/102663112.05.01.acc.pdf

