|  | <!--===- docs/FortranForCProgrammers.md | 
|  |  | 
|  | Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | 
|  | See https://llvm.org/LICENSE.txt for license information. | 
|  | SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | 
|  |  | 
|  | --> | 
|  |  | 
|  | # Fortran For C Programmers | 
|  |  | 
|  | ```eval_rst | 
|  | .. contents:: | 
|  | :local: | 
|  | ``` | 
|  |  | 
|  | This note is limited to essential information about Fortran so that | 
|  | a C or C++ programmer can get started more quickly with the language, | 
|  | at least as a reader, and avoid some common pitfalls when starting | 
|  | to write or modify Fortran code. | 
|  | Please see other sources to learn about Fortran's rich history, | 
|  | current applications, and modern best practices in new code. | 
|  |  | 
|  | ## Know This At Least | 
|  |  | 
|  | * There have been many implementations of Fortran, often from competing | 
|  | vendors, and the standard language has been defined by U.S. and | 
|  | international standards organizations.  The various editions of | 
|  | the standard are known as the '66, '77, '90, '95, 2003, 2008, and | 
|  | (now) 2018 standards. | 
|  | * Forward compatibility is important.  Fortran has outlasted many | 
|  | generations of computer systems hardware and software.  Standard | 
|  | compliance notwithstanding, Fortran programmers generally expect that | 
|  | code that has compiled successfully in the past will continue to | 
|  | compile and work indefinitely.  The standards sometimes designate | 
|  | features as being deprecated, obsolescent, or even deleted, but that | 
|  | can be read only as discouraging their use in new code -- they'll | 
|  | probably always work in any serious implementation. | 
|  | * Fortran has two source forms, which are typically distinguished by | 
|  | filename suffixes.  `foo.f` is old-style "fixed-form" source, and | 
|  | `foo.f90` is new-style "free-form" source.  All language features | 
|  | are available in both source forms.  Neither form has reserved words | 
|  | in the sense that C does.  Spaces are not required between tokens | 
|  | in fixed form, and case is not significant in either form. | 
|  | * Variable declarations are optional by default.  Variables whose | 
|  | names begin with the letters `I` through `N` are implicitly | 
|  | `INTEGER`, and others are implicitly `REAL`.  These implicit typing | 
|  | rules can be changed in the source. | 
|  | * Fortran uses parentheses in both array references and function calls. | 
|  | All arrays must be declared as such; other names followed by parenthesized | 
|  | expressions are assumed to be function calls. | 
|  | * Fortran has a _lot_ of built-in "intrinsic" functions.  They are always | 
|  | available without a need to declare or import them.  Their names reflect | 
|  | the implicit typing rules, so you will encounter names that have been | 
|  | modified so that they have the right type (e.g., `AIMAG` has a leading `A` | 
|  | so that it's `REAL` rather than `INTEGER`). | 
|  | * The modern language has means for declaring types, data, and subprogram | 
|  | interfaces in compiled "modules", as well as legacy mechanisms for | 
|  | sharing data and interconnecting subprograms. | 
|  |  | 
|  | ## A Rosetta Stone | 
|  |  | 
|  | Fortran's language standard and other documentation uses some terminology | 
|  | in particular ways that might be unfamiliar. | 
|  |  | 
|  | | Fortran | English | | 
|  | | ------- | ------- | | 
|  | | Association | Making a name refer to something else | | 
|  | | Assumed | Some attribute of an argument or interface that is not known until a call is made | | 
|  | | Companion processor | A C compiler | | 
|  | | Component | Class member | | 
|  | | Deferred | Some attribute of a variable that is not known until an allocation or assignment | | 
|  | | Derived type | C++ class | | 
|  | | Dummy argument | C++ reference argument | | 
|  | | Final procedure | C++ destructor | | 
|  | | Generic | Overloaded function, resolved by actual arguments | | 
|  | | Host procedure | The subprogram that contains a nested one | | 
|  | | Implied DO | There's a loop inside a statement | | 
|  | | Interface | Prototype | | 
|  | | Internal I/O | `sscanf` and `snprintf` | | 
|  | | Intrinsic | Built-in type or function | | 
|  | | Polymorphic | Dynamically typed | | 
|  | | Processor | Fortran compiler | | 
|  | | Rank | Number of dimensions that an array has | | 
|  | | `SAVE` attribute | Statically allocated | | 
|  | | Type-bound procedure | Kind of a C++ member function but not really | | 
|  | | Unformatted | Raw binary | | 
|  |  | 
|  | ## Data Types | 
|  |  | 
|  | There are five built-in ("intrinsic") types: `INTEGER`, `REAL`, `COMPLEX`, | 
|  | `LOGICAL`, and `CHARACTER`. | 
|  | They are parameterized with "kind" values, which should be treated as | 
|  | non-portable integer codes, although in practice today these are the | 
|  | byte sizes of the data. | 
|  | (For `COMPLEX`, the kind type parameter value is the byte size of one of the | 
|  | two `REAL` components, or half of the total size.) | 
|  | The legacy `DOUBLE PRECISION` intrinsic type is an alias for a kind of `REAL` | 
|  | that should be more precise, and bigger, than the default `REAL`. | 
|  |  | 
|  | `COMPLEX` is a simple structure that comprises two `REAL` components. | 
|  |  | 
|  | `CHARACTER` data also have length, which may or may not be known at compilation | 
|  | time. | 
|  | `CHARACTER` variables are fixed-length strings and they get padded out | 
|  | with space characters when not completely assigned. | 
|  |  | 
|  | User-defined ("derived") data types can be synthesized from the intrinsic | 
|  | types and from previously-defined user types, much like a C `struct`. | 
|  | Derived types can be parameterized with integer values that either have | 
|  | to be constant at compilation time ("kind" parameters) or deferred to | 
|  | execution ("len" parameters). | 
|  |  | 
|  | Derived types can inherit ("extend") from at most one other derived type. | 
|  | They can have user-defined destructors (`FINAL` procedures). | 
|  | They can specify default initial values for their components. | 
|  | With some work, one can also specify a general constructor function, | 
|  | since Fortran allows a generic interface to have the same name as that | 
|  | of a derived type. | 
|  |  | 
|  | Last, there are "typeless" binary constants that can be used in a few | 
|  | situations, like static data initialization or immediate conversion, | 
|  | where type is not necessary. | 
|  |  | 
|  | ## Arrays | 
|  |  | 
|  | Arrays are not types in Fortran. | 
|  | Being an array is a property of an object or function, not of a type. | 
|  | Unlike C, one cannot have an array of arrays or an array of pointers, | 
|  | although can can have an array of a derived type that has arrays or | 
|  | pointers as components. | 
|  | Arrays are multidimensional, and the number of dimensions is called | 
|  | the _rank_ of the array. | 
|  | In storage, arrays are stored such that the last subscript has the | 
|  | largest stride in memory, e.g. A(1,1) is followed by A(2,1), not A(1,2). | 
|  | And yes, the default lower bound on each dimension is 1, not 0. | 
|  |  | 
|  | Expressions can manipulate arrays as multidimensional values, and | 
|  | the compiler will create the necessary loops. | 
|  |  | 
|  | ## Allocatables | 
|  |  | 
|  | Modern Fortran programs use `ALLOCATABLE` data extensively. | 
|  | Such variables and derived type components are allocated dynamically. | 
|  | They are automatically deallocated when they go out of scope, much | 
|  | like C++'s `std::vector<>` class template instances are. | 
|  | The array bounds, derived type `LEN` parameters, and even the | 
|  | type of an allocatable can all be deferred to run time. | 
|  | (If you really want to learn all about modern Fortran, I suggest | 
|  | that you study everything that can be done with `ALLOCATABLE` data, | 
|  | and follow up all the references that are made in the documentation | 
|  | from the description of `ALLOCATABLE` to other topics; it's a feature | 
|  | that interacts with much of the rest of the language.) | 
|  |  | 
|  | ## I/O | 
|  |  | 
|  | Fortran's input/output features are built into the syntax of the language, | 
|  | rather than being defined by library interfaces as in C and C++. | 
|  | There are means for raw binary I/O and for "formatted" transfers to | 
|  | character representations. | 
|  | There are means for random-access I/O using fixed-size records as well as for | 
|  | sequential I/O. | 
|  | One can scan data from or format data into `CHARACTER` variables via | 
|  | "internal" formatted I/O. | 
|  | I/O from and to files uses a scheme of integer "unit" numbers that is | 
|  | similar to the open file descriptors of UNIX; i.e., one opens a file | 
|  | and assigns it a unit number, then uses that unit number in subsequent | 
|  | `READ` and `WRITE` statements. | 
|  |  | 
|  | Formatted I/O relies on format specifications to map values to fields of | 
|  | characters, similar to the format strings used with C's `printf` family | 
|  | of standard library functions. | 
|  | These format specifications can appear in `FORMAT` statements and | 
|  | be referenced by their labels, in character literals directly in I/O | 
|  | statements, or in character variables. | 
|  |  | 
|  | One can also use compiler-generated formatting in "list-directed" I/O, | 
|  | in which the compiler derives reasonable default formats based on | 
|  | data types. | 
|  |  | 
|  | ## Subprograms | 
|  |  | 
|  | Fortran has both `FUNCTION` and `SUBROUTINE` subprograms. | 
|  | They share the same name space, but functions cannot be called as | 
|  | subroutines or vice versa. | 
|  | Subroutines are called with the `CALL` statement, while functions are | 
|  | invoked with function references in expressions. | 
|  |  | 
|  | There is one level of subprogram nesting. | 
|  | A function, subroutine, or main program can have functions and subroutines | 
|  | nested within it, but these "internal" procedures cannot themselves have | 
|  | their own internal procedures. | 
|  | As is the case with C++ lambda expressions, internal procedures can | 
|  | reference names from their host subprograms. | 
|  |  | 
|  | ## Modules | 
|  |  | 
|  | Modern Fortran has good support for separate compilation and namespace | 
|  | management. | 
|  | The *module* is the basic unit of compilation, although independent | 
|  | subprograms still exist, of course, as well as the main program. | 
|  | Modules define types, constants, interfaces, and nested | 
|  | subprograms. | 
|  |  | 
|  | Objects from a module are made available for use in other compilation | 
|  | units via the `USE` statement, which has options for limiting the objects | 
|  | that are made available as well as for renaming them. | 
|  | All references to objects in modules are done with direct names or | 
|  | aliases that have been added to the local scope, as Fortran has no means | 
|  | of qualifying references with module names. | 
|  |  | 
|  | ## Arguments | 
|  |  | 
|  | Functions and subroutines have "dummy" arguments that are dynamically | 
|  | associated with actual arguments during calls. | 
|  | Essentially, all argument passing in Fortran is by reference, not value. | 
|  | One may restrict access to argument data by declaring that dummy | 
|  | arguments have `INTENT(IN)`, but that corresponds to the use of | 
|  | a `const` reference in C++ and does not imply that the data are | 
|  | copied; use `VALUE` for that. | 
|  |  | 
|  | When it is not possible to pass a reference to an object, or a sparse | 
|  | regular array section of an object, as an actual argument, Fortran | 
|  | compilers must allocate temporary space to hold the actual argument | 
|  | across the call. | 
|  | This is always guaranteed to happen when an actual argument is enclosed | 
|  | in parentheses. | 
|  |  | 
|  | The compiler is free to assume that any aliasing between dummy arguments | 
|  | and other data is safe. | 
|  | In other words, if some object can be written to under one name, it's | 
|  | never going to be read or written using some other name in that same | 
|  | scope. | 
|  | ``` | 
|  | SUBROUTINE FOO(X,Y,Z) | 
|  | X = 3.14159 | 
|  | Y = 2.1828 | 
|  | Z = 2 * X ! CAN BE FOLDED AT COMPILE TIME | 
|  | END | 
|  | ``` | 
|  | This is the opposite of the assumptions under which a C or C++ compiler must | 
|  | labor when trying to optimize code with pointers. | 
|  |  | 
|  | ## Overloading | 
|  |  | 
|  | Fortran supports a form of overloading via its interface feature. | 
|  | By default, an interface is a means for specifying prototypes for a | 
|  | set of subroutines and functions. | 
|  | But when an interface is named, that name becomes a *generic* name | 
|  | for its specific subprograms, and calls via the generic name are | 
|  | mapped at compile time to one of the specific subprograms based | 
|  | on the types, kinds, and ranks of the actual arguments. | 
|  | A similar feature can be used for generic type-bound procedures. | 
|  |  | 
|  | This feature can be used to overload the built-in operators and some | 
|  | I/O statements, too. | 
|  |  | 
|  | ## Polymorphism | 
|  |  | 
|  | Fortran code can be written to accept data of some derived type or | 
|  | any extension thereof using `CLASS`, deferring the actual type to | 
|  | execution, rather than the usual `TYPE` syntax. | 
|  | This is somewhat similar to the use of `virtual` functions in c++. | 
|  |  | 
|  | Fortran's `SELECT TYPE` construct is used to distinguish between | 
|  | possible specific types dynamically, when necessary.  It's a | 
|  | little like C++17's `std::visit()` on a discriminated union. | 
|  |  | 
|  | ## Pointers | 
|  |  | 
|  | Pointers are objects in Fortran, not data types. | 
|  | Pointers can point to data, arrays, and subprograms. | 
|  | A pointer can only point to data that has the `TARGET` attribute. | 
|  | Outside of the pointer assignment statement (`P=>X`) and some intrinsic | 
|  | functions and cases with pointer dummy arguments, pointers are implicitly | 
|  | dereferenced, and the use of their name is a reference to the data to which | 
|  | they point instead. | 
|  |  | 
|  | Unlike C, a pointer cannot point to a pointer *per se*, nor can they be | 
|  | used to implement a level of indirection to the management structure of | 
|  | an allocatable. | 
|  | If you assign to a Fortran pointer to make it point at another pointer, | 
|  | you are making the pointer point to the data (if any) to which the other | 
|  | pointer points. | 
|  | Similarly, if you assign to a Fortran pointer to make it point to an allocatable, | 
|  | you are making the pointer point to the current content of the allocatable, | 
|  | not to the metadata that manages the allocatable. | 
|  |  | 
|  | Unlike allocatables, pointers do not deallocate their data when they go | 
|  | out of scope. | 
|  |  | 
|  | A legacy feature, "Cray pointers", implements dynamic base addressing of | 
|  | one variable using an address stored in another. | 
|  |  | 
|  | ## Preprocessing | 
|  |  | 
|  | There is no standard preprocessing feature, but every real Fortran implementation | 
|  | has some support for passing Fortran source code through a variant of | 
|  | the standard C source preprocessor. | 
|  | Since Fortran is very different from C at the lexical level (e.g., line | 
|  | continuations, Hollerith literals, no reserved words, fixed form), using | 
|  | a stock modern C preprocessor on Fortran source can be difficult. | 
|  | Preprocessing behavior varies across implementations and one should not depend on | 
|  | much portability. | 
|  | Preprocessing is typically requested by the use of a capitalized filename | 
|  | suffix (e.g., "foo.F90") or a compiler command line option. | 
|  | (Since the F18 compiler always runs its built-in preprocessing stage, | 
|  | no special option or filename suffix is required.) | 
|  |  | 
|  | ## "Object Oriented" Programming | 
|  |  | 
|  | Fortran doesn't have member functions (or subroutines) in the sense | 
|  | that C++ does, in which a function has immediate access to the members | 
|  | of a specific instance of a derived type. | 
|  | But Fortran does have an analog to C++'s `this` via *type-bound | 
|  | procedures*. | 
|  | This is a means of binding a particular subprogram name to a derived | 
|  | type, possibly with aliasing, in such a way that the subprogram can | 
|  | be called as if it were a component of the type (e.g., `X%F(Y)`) | 
|  | and receive the object to the left of the `%` as an additional actual argument, | 
|  | exactly as if the call had been written `F(X,Y)`. | 
|  | The object is passed as the first argument by default, but that can be | 
|  | changed; indeed, the same specific subprogram can be used for multiple | 
|  | type-bound procedures by choosing different dummy arguments to serve as | 
|  | the passed object. | 
|  | The equivalent of a `static` member function is also available by saying | 
|  | that no argument is to be associated with the object via `NOPASS`. | 
|  |  | 
|  | There's a lot more that can be said about type-bound procedures (e.g., how they | 
|  | support overloading) but this should be enough to get you started with | 
|  | the most common usage. | 
|  |  | 
|  | ## Pitfalls | 
|  |  | 
|  | Variable initializers, e.g. `INTEGER :: J=123`, are _static_ initializers! | 
|  | They imply that the variable is stored in static storage, not on the stack, | 
|  | and the initialized value lasts only until the variable is assigned. | 
|  | One must use an assignment statement to implement a dynamic initializer | 
|  | that will apply to every fresh instance of the variable. | 
|  | Be especially careful when using initializers in the newish `BLOCK` construct, | 
|  | which perpetuates the interpretation as static data. | 
|  | (Derived type component initializers, however, do work as expected.) | 
|  |  | 
|  | If you see an assignment to an array that's never been declared as such, | 
|  | it's probably a definition of a *statement function*, which is like | 
|  | a parameterized macro definition, e.g. `A(X)=SQRT(X)**3`. | 
|  | In the original Fortran language, this was the only means for user | 
|  | function definitions. | 
|  | Today, of course, one should use an external or internal function instead. | 
|  |  | 
|  | Fortran expressions don't bind exactly like C's do. | 
|  | Watch out for exponentiation with `**`, which of course C lacks; it | 
|  | binds more tightly than negation does (e.g., `-2**2` is -4), | 
|  | and it binds to the right, unlike what any other Fortran and most | 
|  | C operators do; e.g., `2**2**3` is 256, not 64. | 
|  | Logical values must be compared with special logical equivalence | 
|  | relations (`.EQV.` and `.NEQV.`) rather than the usual equality | 
|  | operators. | 
|  |  | 
|  | A Fortran compiler is allowed to short-circuit expression evaluation, | 
|  | but not required to do so. | 
|  | If one needs to protect a use of an `OPTIONAL` argument or possibly | 
|  | disassociated pointer, use an `IF` statement, not a logical `.AND.` | 
|  | operation. | 
|  | In fact, Fortran can remove function calls from expressions if their | 
|  | values are not required to determine the value of the expression's | 
|  | result; e.g., if there is a `PRINT` statement in function `F`, it | 
|  | may or may not be executed by the assignment statement `X=0*F()`. | 
|  | (Well, it probably will be, in practice, but compilers always reserve | 
|  | the right to optimize better.) | 
|  |  | 
|  | Unless they have an explicit suffix (`1.0_8`, `2.0_8`) or a `D` | 
|  | exponent (`3.0D0`), real literal constants in Fortran have the | 
|  | default `REAL` type -- *not* `double` as in the case in C and C++. | 
|  | If you're not careful, you can lose precision at compilation time | 
|  | from your constant values and never know it. |