A Note on Parameter Passing Mechanisms -------------------------------------- R. Dewar 24 Dec 1991 This note is an attempt to synthesize the issues involved in the discussion about parameter passing mechanisms. In Ada-83, implementations are given freedom to pass composite objects by reference or copy, and certain programs which depend on the difference are erroneous. In this note we will examine the Ada-83 rules, and possible adjustments to them for Ada-9X. The relevant sections of the Ada-83 RM are the following: 6.7(7) which provides the basic freedom to pass by reference or by copy, and the statement about erroneousness: "The execution of a program is erroneous if its effect depends on which mechanism is selected by the implementation" 6.7(12) [a Note] which deals with the exception case: "If the execution of a subprogram is abandoned as a result of an exception, the final value of an actual parameter of such a type can be either its value before the call, or a value assigned to the formal parameter during execution of the subprogram." 6.7(130 [a Note] which tries to explicate the erroneousness rule "If no actual parameter of such a type is accessible by more than one path, then the effect of a subprogram call (unless abandoned) is the same whether or not the implementation uses copying for parameter passing. If, however, there are multiple access paths to such a parameter (for example, if a global variable is undefined after updating the actual other than by updating the formal. A program using such an undefined value is erroneous." There are several issues to be dealt with. First let's try to understand both the intent and the actual meaning of the Ada-83 rules. 1. Current Ada-83 Rules ----------------------- 1.1 What is Call by Reference? ------------------------------ The first problem is to understand clearly what is meant by call by reference, and to understand what parameters can be passed by reference. The description in 6.7 if anything describes call by name, but that clearly is not intended, and informally (later confirmed by an AI), we may assume that call by reference means what it is expected to mean (i.e., in implementation terms, evaluate the address of the parameter and pass this address, and dereference to obtain the value within the subprogram). As to what parameters can be passed by reference, the RM is surprisingly non-restrictive, in that it allows any parameter other than a scalar to be passed by reference. In particular, the results of an expression such as: A & B (A,B two strings that happen to be contiguous in storage) can be passed by reference. This seems peculiar and non-intended, and a later AI has "interpreted" the intent of this section to only allow call by reference for names. Again this is no surprise, but Ada-9X needs to clean up the definition of what can be called by reference and what it means. 1.2 When are Programs Erroneous? -------------------------------- This is the hard issue. The intent behind the erroneousness rule was to allow compilers to proceed assuming that formal parameters are not aliased. This intent cannot be deduced from the reference manual, but, talking to the designers this was definitely the intent. One inspiration was FORTRAN which allows implementations similar freedom with the same intent. There are two problems in the existing formulation. First of all, what is an effect? In particular, does an effect have to be externally observable or can it simply be reflected in intermediate states of the computation. One example that has been used to discuss this is the following: procedure TEST is type ARR is array (1..10) of INTEGER; Q : ARR := ... function MAX (A : ARR) return INTEGER is begin -- code to sort the global array Q -- code to determine the maximum value of the array A return MAX_VALUE; end; begin PUT (MAX (Q)); end TEST; If Q is passed by copy, then the maximum value determination is done using the original value of the array. If Q is passed by reference, then the determination uses the sorted value of the array. However, determination of the maximum value gives the same result regardless of whether or not the array is sorted. Thus the observable effect of this program (the printed value) is independent of the parameter passing mechanism. Is the effect of the program dependent on the mechanism? Well we simply cannot give a firm answer to this question from the RM. Furthermore, it is not easy to formalize this problem so that a clear answer can be given either way. Probably the answer should be that the effect *does* include internal states of the computation, but we have no formal model for such internal states. This problem was first raised by the EEC formal definition project, but they did not suggest any satisfactory resolution. Evidence that the internal states should be considered may be found in the note of paragraph 13, which talks about a program "using" a value in question, as the condition for erroneousness. It is instructive to look at the FORTRAN rules which were one of the original inspirations for the Ada rules. In FORTRAN, the rule is that if there are two paths to a single variable (either two formals represent the same actual, or at least overlapping parts of the same actual, or a formal and a global overlap), then it is erroneous (adopting the Ada term) to assign via either path. It is interesting that Ada appeals to this notion of multiple paths in the note, but not in the main text. The clear advantage of the FORTRAN approach is that it is well defined. There is no argument as to whether or not a given execution is erroneous. In particular, the FORTRAN version of the above example would be clearly erroneous, because the attempt to sort Q would be erroneous at the point where it first assigned to Q. It is still hard of course for an implementation to test for violations of the FORTRAN rule, and implementations are not expected (nor attempt) to diagnose such errors, either statically or dynamically. It would be possible to construct a FORTRAN interpretor which did detect violations of this rule at runtime. A similar interpretor for Ada could not be constructed without a clearer definition of the semantics. 1.3 What About Aliasing? ------------------------ As mentioned previously, one of the intentions behind the Ada definition is to allow compilers to assume that parameters are not aliased, and to compile efficient code on this assumption. The FORTRAN rule is strong enough to permit this, but in its current form, the Ada rules are not. Consider: procedure TEST (A : in out ARR; B : in ARR) is begin A(4) := B(3); A(3) := B(2); end; TEST (Q,Q); In this case, it is clear that the effect of this program, executed in canonical order, does *not* depend on the parameter passing mechanism. It is true that the second assignment may affect the value of B(3). Indeed, the conclusions of the note of para 13 apply here, the value of B(3) is undefined after the assignment to A(3) since we may have updated the actual (Q) of the formal (B) other than by assigning to the formal. HOWEVER, we do *not* use this undefined value in the program, so the conclusion of the note does not apply. However, it is not true that a compiler can assume that A and B are not aliased within the body of the procedure. Under such an assumption, the two assignments could be interchanged: A(3) := B(2); A(4) := B(3); Such an interchange is clearly permitted if A and B are not aliased, since the interchange has no effect. However, in the context of the procedure call, the interchange certainly would have an effect, because now the reference to B(3) occurs after the assignment to A(3), and indeed would be affected by the parameter passing mechanism. This is a potentially serious problem because many compilers do in fact assume that formal parameters cannot be aliased. Furthermore, such assumptions are quite important to the generation of efficient code, and particularly to the analysis of loops to determine dependency relationships that inhibit possible vectorization. To illustrate the latter point, consider: procedure TEST (A : in out ARR; B : in ARR) is begin for I in 1 .. 100 loop A (I) := B(I + 1) * 3.14159; end loop; end; TEST (Q,Q); This in fact is nothing but a more extensive version of the first example with a loop. Like the first example, it is definitely not erroneous under the rules currently given in the RM. The issue is whether or not a compiler can parallelize the loop in a straightforward manner (e.g. by doing the computations on separate non-syncrhonized processors). If A and B cannot be aliased, the answer is definitely yes, but under Ada-83 rules, this loop can *not* be parallelized in this manner. In the case of non-parallelizing compilers, code motion is tending to become more extensive as the result of requirements for instruction scheduling in RISC processors (especially superscalar ones), so we may expect increasing problems from the clash between the Ada rules, and the expectations of compiler writers. Now of course there are cases where the erroneousness condition of the Ada-83 definition does give a compiler greater freedom. However, the analysis of whether or not two formal parameters can possibly be aliased becomes very complex, and even this analysis surely results in a pessimistic appraisal. At least some Ada vendors claim that severe damage would occur if they removed the (improper and invalid) assumption that parameters cannot be aliased. On the other hand, Norm Cohen has claimed that the aliasing allowed by the Ada-83 rules is potentially valuable (although he has not provided examples to substantiate this claim). At any rate, the bottom line is that there is a significant divergence between the intent and the effect of the definition, and also a significant divergence between the practice of compiler writers in assuming non-aliasing and the language definition as it stands. 1.4 A Note on the Note ---------------------- The note is of course only a note, and thus not part of the definition, but it is worth pointing out another flaw in its current formulation. Consider the example we looked at: procedure TEST (A : in out ARR; B : in ARR) is begin A(4) := B(7); A(3) := B(2); end; TEST (Q,Q); This does not seem to cause any problems. It's effect is not dependent on the parameter passing mechanism and a compiler assuming no aliasing will provide the correct semantics. However, according to the note, it is still erroneous (and indeed our original example with B(3) instead of B(7) is declared erroneous. This is because the first assignment in either case modifies the value of the formal B by assigning to its actual Q (assuming call by reference) without assigning to B. This is said in the note to render the value of the formal (B) undefined, and the second assignment statement does reference B. Now of course the point is that only the assigned subpart of the affected formal needs to become erroneous, not the whole variable. In the case of an array, it seems reasonable that only the assigned components be affected. If the note is retained in Ada-9X in any form, then this should be taken into consideration in its formulation. 1.5 The Exception Case ---------------------- The case of abandoning execution of a procedure is an interesting one. The clear intent, from the implementation model, and from the note, is that the exception is propagated and may skip the copy back for the case of call by copy. However, although we may deduce this effect from the implementation model described in paragraph 7 (as explicated by the relevant AI), the actual semantics are unclear. The wording of the note seems to suggest that the semantics is non-deterministic, but this seems at odds with the rule in paragraph 7 that bases erroneousness on effect. Consider the following example: procedure MAIN is type ARR is array (1..10) of INTEGER; Q : ARR := ARR'(others => 0); procedure TEST (A : in out ARR) is begin A(3) := 12; A(4) := 12 * INTEGER'LAST; end; begin TEST (Q); exception ... end MAIN; Now in the exception part, the issue is whether A(3) is set to zero or to 12, or to some other value. The note indicates that it is either set to zero or 12 (and these are the only possibilities). However, any program that read this value would seem to fall afoul of the erroneousness condition of paragraph 7. So we have the following: If the program does not depend in any way on the value of such a (possibly assigned) variable, then the note is irrelevant because it does not matter what the value is. If the program *does* depend on the the value, then it is erroneous, which means the compiler can do anything it likes, including choose any old value for the variable. Thus again the note is irrelevant [or just plain wrong]. It thus seems that the note of paragraph 12 is not consistent with the semantics, as best they can be understood. 2. Possible Solutions in Ada-9X ------------------------------- 2.1 Non-Deterministic Semantics ------------------------------- One proposed solution to the problem of determining what effect means is to replace the erroneousness rule of Ada-83 with a simple statement that the semantics are non-deterministic with respect to the parameter passing mechanism. This has often been suggested as an appropriate interpretation of the intent of the Ada-83 rules, but never specifically addressed by the ARG. It is interesting to note that paragraph 12 seems to suggest this semantics for Ada-83 in the exception case, although since this is only a note, it has no semantic force. A difficulty int his approach is that it increases the number of cases in which programs are non-erroneous and the compiler cannot assume that there is no aliasing of parameters. Consider the following example: procedure MAIN is type ARR is array (1..10) of INTEGER; Q : ARR := ARR'(for I in 1 .. 10 => I); -- being 9X for a moment! procedure TEST (A : in out ARR; B : in ARR) is begin A(3) := B(2) * B(4); A(4) := B(3); end; TEST (Q,Q); Now this example is clearly erroneous in Ada-83, which means that a compiler is free to do anything it wants, and in particular, it is allowed to invert the assignments to obtain: A(4) := B(3); A(3) := B(2) * B(4); which it might well do under the assumption that A and B cannot be aliased, which is a valid assumption for Ada-83 (since you can assume anything you like about an erroneous execution). The four possible results in practice are: Normal order, copy A(3) = 8, A(4) = 3 Normal order, reference A(3) = 8, A(4) = 8 Switched order, copy A(3) = 8, A(4) = 3 Switched order, reference A(3) = 6, A(4) = 3 It is the last line that causes trouble. Under the proposed non-deterministic semantics, we know that A(3) can only be set to 8. This means that the adoption of the non-deterministic approach, far from solving the problem of a compiler writer who wants to assume no aliasing, makes it worse. Compilers which assume no aliasing are "wronger" under these semantics, and if there is such a thing as a compiler for Ada-83 which is exactly correct, and takes advantage of non-aliasing assumptions only in provably erroneous cases, may become incorrect in Ada-83. The non-deterministic proposal thus seems unpleasant. Note that making it a bounded error still does not solve the problem, because a compiler is obligated to respect the specified bounds in the error situation. 2.2 Adopting the FORTRAN Approach --------------------------------- Another possibility for Ada-9X is to adopt the FORTRAN approach. Let's recall that this means introducing something like the following: An access path to a variable is defined as either visibility of a formal which is directly or indirectly (through a chain of calls) associated with the variable as an actual, or direct visibility to the variable via a global reference. If there are multiple access paths to a variable, then assignment to the variable using any of these access paths makes the execution of the program erroneous. Note that we don't want to try to replace this erroneousness by a bounded error, since then we fall into the trap discussed in 2.1, where the compiler still is too restricted. This of course needs further careful study. In particular we need to worry about the case of access variables. I think the above formulation says what we want in this case, but I am not sure! The other aspect that needs careful study is to determine whether or not this formulation does in fact permit compilers to make the no-aliasing rule. Certainly FORTRAN compilers routinely make this assumption, but the FORTRAN world is not so strict semantically as we are. 2.3 Appealing Directly to the Aliasing Property ------------------------------------------------ >From an optimization point of view, the property we want is that a compiler can assume that formal parameters are neither aliased with one another, nor aliased with any global variable. One possible approach is to simply incorporate this notion directly into the semantics. The difficulty is to formulate a semantic description that embodies the notion of aliasing. I don't see how to do this, but it is certainly worth considering as a possibility. Note that if this is done, there is a potential for confusion between the notion of aliasing thus introduced, and the keyword aliased as proposed in the Ada-9X mapping. If the keyword is retained, it had better mean the same thing [several reviewers have indicated a dislike for this choice of keyword in any case, although no consensus has been obtained for an alternative]. 2.4 Safety Critical Considerations and The 11.6 Route ----------------------------------------------------- The introduction of erroneousness is distasteful in any case, and doubly distasteful for safety critical programs. Of course we are not exactly introducing erroneousness, since it is there in Ada-83, and indeed the approach of 2.2 is a great improvement on Ada-83, since at least the situations in which erroneous execution occurs are well defined, and thus the problem of proving that a program is not erroneous in this respect is also well defined. However, the extra burden of such proofs is one of the things that safety critical programming style would prefer to eliminate. One possible approach is to designate the formal semantics of Ada as strictly call by copy for all composite objects, unless call by reference is specified (using what is at the current time called "aliased" in the Ada-9X proposal). The normal optimization freedoms, as currently embodied in 11.6, would then allow compilers to assume non-aliasing (either by persuing approach 2.2, or directly 2.3). Since we can turn off 11.6 (at least I assume this will be part of the safety critical support), the needs of safety critical programming will be addressed, while still retaining the appropriate compiler freedom in the normal optimized case. 3. Conclusions -------------- The Ada-9X rules for parameter passing fall seriously short of the intent, are ill-defined, and are not being followed by compilers. Things will get worse in the presence of energetic RISC instruction scheduling and more automatic parallelization, so it is important for Ada-9X to address this problem in a manner that satisfies the following needs: o The semantics is well defined o Compilers can generate efficient code in the normal optimized case o The needs of safety critical programming are met