!topic LSN on Limited Function Results in Ada 9X
!key LSN-1043 on Limited Function Results in Ada 9X
!reference MS-7.4.5;4.6
!reference LSN-1033
!from Bob Duff $Date: 92/10/12 10:07:20 $ $Revision: 1.2 $
!discussion

This Language Study Note discusses the issues of parameter passing and function return for limited types.

Limited types are very important in Ada 9X -- we expect them to be more commonly used than in Ada 83. Here are some examples of uses of limited types, several of which are new to Ada 9X:

- task types and protected types
- finalization (only allowed for limited types)
- access discriminants, which allow one object to contain a (constant) reference to another object in the same scope. Self-reference is also possible. (Note that normal components do not work, because they are not constant, and therefore the accessibility rules need to be stricter.)
- multiple inheritance (see LSN-1033) -- This is a particular sub-case of access discriminants.
- any other abstraction that won't work properly if clients are allowed to make copies of objects.

A limited type generally prevents copying. However, copying is not always prevented for parameter passing and function return. This leads to the question: What does it mean to pass a limited value as a parameter, or to return a limited value from a function? And the related question: What is the meaning of a limited value anyway?

Ada 83 has limited types: task types, types FILE_TYPE in the I/O packages, and user-defined limited private types, as well as types composed from limited types.

In Ada 83, returning a task outside its master has been ruled erroneous by the ARG, although RM83 doesn't say that. Returning a task type, but not outside its master, is required to work properly, which pretty much requires a by-reference implementation. In most implementations, a task value is represented as the address of its TCB, or something similar, so it is in fact by reference.
FILE_TYPE must also be implemented as a pointer of some sort, so that returning a file from a function works properly. User-defined limited types are returned from functions by copy, thus breaking the limitedness property of the abstraction. The programmer will in practice achieve by-reference by defining the limited type as an access type.

In Ada 9X, there are several more cases, as outlined above. We need to make sure that parameter passing and function return are by-reference. (For example, making a copy of a protected object is a disaster, in general!) However, in Ada 83, a limited private type whose completion is an elementary type is required to be passed by copy. Also in Ada 83, function return is always by copy. For upward compatibility, we are planning to keep it that way for the existing kinds of types.

In Ada 9X, some limited types are "inherently limited." (This concept was called "inherently aliased" in MS;4.6.) Some inherently limited types "cannot be moved". The inherently limited types are always limited -- there is no full_type_declaration that might cause them to become non-limited later. By-reference parameter passing and function return are always used for these types.

These are the limited types that are inherently limited:

- a record type with the reserved word limited in its definition;
- a limited tagged type
- any type that has an access discriminant
- task and protected types
- any type derived from an inherently limited type, or with inherently limited components.

Note that the above does not introduce any upward incompatibility, because the only thing in the above list that existed at all in Ada 83 is the task type, and task types have always used reference semantics. Other limited types retain the parameter passing and function return rules of Ada 83.

Passing inherently limited values by reference is no big deal. However, what does it mean to return a function value by reference?
First of all, for inherently limited types, the value of an object is inextricably linked to the object itself -- the whole point of limitedness is that you can't copy the value of one object into another object. Thus, in Ada 9X, one can think of the value of an inherently limited object as being pretty much the same thing as the object itself. If X and Y are two distinct inherently limited objects, then they can't both have the same value.

If the result of a function is a value/object declared outside the function, then return by reference is no problem. A pointer to the outer object is returned.

If, on the other hand, the function is returning a local object, then the implementation will generally want to MOVE the object (e.g. to a more global place on the stack), and then return a pointer to that object.

Some inherently limited objects cannot be moved. An object cannot be moved if it might contain references to itself, or to other objects in the same declarative region. It is a bounded error to move such objects.

Why? If an object is self-referential, then moving it to a different spot in memory will make the pointer-to-self wrong. (We do not want to require relocatable pointers of some sort.) If an object contains references to other objects in the same declarative region, then returning it outside that declarative region would make those pointers point to garbage.

For the bounded error, either PROGRAM_ERROR is raised, or a value is returned that is associated with a temporary object of the same type, but that is accessible to the caller. This corresponds to two implementation strategies we wish to allow: detect the error and raise an exception, or else don't chop the stack back on return from this sort of function, leaving the unmovable objects in place, so all self- and within-same-scope- pointers still work.

Making it a bounded error does, of course, introduce some non-uniformity across implementations.
However, this is a case that was already ruled erroneous in Ada 83, so we don't think we're doing too much harm.

If a function returns a local object that has finalization (and therefore is inherently limited), the finalization is deferred until later. We have not decided exactly how much later; and we don't think the answer is terribly important. Our current thinking is that it should happen no later than the end of the statement that did the function call, but allow implementations to do it earlier if the object is no longer in use. (Obviously, it is still in use if it has been passed (by reference!) to a subprogram.) We find it hard to believe that a program would want to depend on the exact point at which the finalization occurs, so long as it doesn't happen too early, and so long as every (initialized) object is finalized exactly once. In any case, we will document our final decision in the ILS.

The implementation of the above rules is not difficult. Each return statement can detect the scope level of the returned object, and move it or not accordingly. For finalizable objects, one simple implementation strategy is to snip the object out of the per-task finalization list, and reinsert it at a more global point in the list.

It seems somewhat undesirable to have three different categories of limited types. However, we believe they are all needed, given that functions can return limited types:

- We can't get rid of limited types that are not inherently limited (i.e. make all limited types behave as described above for inherently limited types), because it would be upward inconsistent.
- We can't get rid of inherently limited types, because we need some category of types that are always passed and returned by reference.
- We can't get rid of "cannot be moved" types, because references and tasks are so important. In any case, getting rid of tasks would be rather upwardly incompatible!
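
The three categories above can be sketched as type declarations. This is an illustration only, not part of the proposal: it assumes the syntax that was later standardized as Ada 95 (which may differ in detail from what this note envisaged), and every name in it (Limited_Categories, File_Id, Counter, Observer, Self_Ref, Node) is hypothetical.

```ada
--  Sketch only: assumes Ada 95 syntax; all names are hypothetical.
package Limited_Categories is

   --  Category 1: limited, but not inherently limited. The full type
   --  (in the private part) is an ordinary elementary type, so the
   --  Ada 83 by-copy rules would continue to apply.
   type File_Id is limited private;

   --  Category 2: inherently limited. The reserved word "limited"
   --  appears in the record definition itself, so no later full type
   --  declaration can make the type non-limited.
   type Counter is limited record
      Count : Natural := 0;
   end record;

   --  Category 3: inherently limited and possibly unmovable. The access
   --  discriminant lets an Observer refer to a Counter declared in the
   --  same declarative region.
   type Observer (Subject : access Counter) is limited record
      Seen : Natural := 0;
   end record;

   --  Also category 3: a self-referential object. Inside the definition
   --  of Node, Node'Access designates the object being declared, so
   --  moving a Node would leave Me.Self pointing at the old location.
   type Node;
   type Self_Ref (Self : access Node) is limited null record;
   type Node is limited record
      Me : Self_Ref (Node'Access);
   end record;

private

   --  Completion of the category 1 type as an elementary type.
   type File_Id is range 0 .. 255;

end Limited_Categories;
```

Under the rules discussed in this note, a function returning a local Node or Observer is the bounded-error case, while returning a local Counter would only require moving the object.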

An alternative strategy would be to disallow functions that return inherently limited types. This would have some disadvantages:

- It would not be upward compatible in the case of task types.

- Since limited types are expected to be so common in Ada 9X, it would be a severe restriction to forbid functional notation for them. Consider, for example, a Set abstraction, which is inherently limited because it needs finalization. One very much wants to write this:

      return Union(Intersect(A, B), Intersect(C, D));

  (or the equivalent using operator symbols), rather than this:

      Temp1 := A;
      Intersect(Temp1, B);
      Temp2 := C;
      Intersect(Temp2, D);
      Union(Temp1, Temp2);
      Result := Temp1;

  After all, it has been the job of compilers to allocate temporary variables for some decades now; it would be a giant step backward to hand that job back to the user. This alone would be enough to make many turn to a different language.

  Note that in the above example with function calls, the programmer is unlikely to care when finalization happens for the results of the inner Intersect calls, so long as the compiler-generated temps get cleaned up before too long, and certainly not while they're still in use.

- We should make it as painless as possible to modify one's program, for example, to add finalization, or to add some other property that necessitates inherent limitedness. Requiring the programmer to change all function calls into procedure calls would be an onerous burden.

Thus, we believe that inherently limited objects should be first-class citizens -- in particular, they should be allowed as function result types. We believe we have achieved a reasonably simple and implementable semantics for them, as described above.