Chapter summary for MS-7;4.6

-------------

!topic Visibility of record type extensions
!from Bill Taylor 1992-06-28 11:12:54 <>
!reference MS-7.4.1();4.6
!discussion

In the following

   package P is
      type T is tagged private;
   private
      type T is tagged record
         A : Integer;
      end record;
   end P;

--------------------------------------

   with P;
   package Q is
      type T2 is new P.T with record
         A : Float;
      end record;
   end Q;

--------------------------------------

   with Q;
   package body P is
      R : Q.T2;
   begin
      R.A := 15;
      R.A := 1.5;
   end P;

--------------------------------------

On the assumption that the two package specs are legal, which (if any) of the two assignments in the package body are legal, and why?

-------------

!topic LSN on Deferred Constants of Any Type
!from Bob Duff 1992-10-10 09:30:27 <>
!reference MS-7.4.4();4.6
!discussion

This Language Study Note discusses the Ada 9X ability to declare deferred constants of any type.

In Ada 83, the type of a deferred constant must be a private type, and that private type must be declared in the same package.  Ada 9X removes this restriction -- a deferred constant can be of any type.  The following discussion lists several reasons for removing the restriction.  Although MS;4.6 does not say so, we also intend to remove the requirement to place a deferred constant in a package, in support of pragma Import (see below) -- the ILS will make this clear.

The most obvious reason for removing the restriction is that Ada programmers view it as an "arbitrary restriction", and, indeed, there is no semantic need for it.  Arbitrary restrictions frustrate programmers, and give Ada a bad name.

The most important reason for removing the restriction is full support of pragma Import.  One major reason for projects to choose C or C++ over Ada is that those languages have many bindings to useful off-the-shelf software.
One of the reasons for the lack of such bindings in Ada is that the interface-to-other-languages features of Ada 83 are weak and very non-portable.  Therefore, one of the main goals of Ada 9X is to provide the easiest possible interfaces to other languages.  Although such interfaces can never be completely portable, we are trying to make them as portable as possible.

In Ada 9X, one can apply pragma Import to a deferred constant.  The pragma replaces the full_constant_declaration.  This allows the import of read-only data into an Ada program -- an important capability.  Obviously, any data type might be needed, and there is no particular reason that requires the data to be declared in a package.

Deferred constants are important in child packages.  A parent package might declare a private type, and a visible child package might need to export a constant of that type.  That constant will need to be deferred, since the full_type_declaration is not visible in the visible part of the visible child.  But the Ada 83 rule would make such a deferred constant illegal.

One use for a deferred constant in Ada 9X is to export constant access to an (aliased) variable declared in the private part of a package:

   package P is
      type Table is array(...) of ...;
      type Table_Access is access constant Table;
      This_Table_Access: constant Table_Access;
      That_Table_Access: constant Table_Access;
   private
      This_Table: aliased Table := (...);
      This_Table_Access: constant Table_Access := This_Table'Access;
      That_Table: aliased Table := (...);
      That_Table_Access: constant Table_Access := That_Table'Access;
   end P;

Now, the body of package P can change the tables, but clients of the package can only read them.

----------------

Except for the pragma Import example, all of the above uses for deferred constants can be done with functions.  But if one suggests the use of functions, one has to question the need for any constants in Ada -- a function will always suffice.
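A minimal sketch of the pragma-Import use described above; the package name, object name, and link name here are illustrative assumptions, not taken from the MS:

```ada
package C_Limits is
   --  Deferred constant of a non-private type: legal in Ada 9X.
   Max_Open_Files : constant Integer;
private
   --  The pragma replaces the full_constant_declaration, so the
   --  read-only datum is supplied by the foreign (C) object code.
   pragma Import (C, Max_Open_Files, "max_open_files");
end C_Limits;
```

Clients simply read C_Limits.Max_Open_Files; no Ada initialization expression is ever elaborated for it.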
Constants exist in Ada because functions are so much more verbose when all you want is a simple constant object.  Deferred constants are no different -- they are needed for the same reasons as normal constants, but they are used when, for one reason or another (as shown by the above examples), the initialization expression needs to see things that are not visible at the point of the (original) declaration of the constant.  Requiring the user to write a function in such cases is the sort of annoyance that turns people away from a language.

Furthermore, if the function is not inlined, then it is inefficient.  If it IS inlined, then extra compilation dependences are added, possibly causing severe recompilation costs.  And most implementations of inlining are still more costly than a reference to a constant.

SUMMARY: Deferred constants are important for interfacing to other languages.  They are also useful in various situations described above -- any situation in which the visibility of the constant declaration needs to be different from the visibility needed by the initialization expression.  Functions cannot really replace deferred constants very well.  And finally, deferred constants would be crippled in Ada 9X by a restriction inherited from Ada 83.  Thus, the MRT recommends removal of the restriction.

-------------

!topic LSN on Deferred Constants of Any Type
!from Norman Cohen 1992-10-13 14:32:23 <>
!reference MS-7.4.4();4.6
!reference LSN-1038
!discussion

Bob Duff writes:

> But if one suggests the use of
> functions, one has to question the need for any constants in Ada -- a
> function will always suffice.  Constants exist in Ada because functions
> are so much more verbose when all you want is a simple constant object.
> Deferred constants are no different -- they are needed for the same
> reasons as normal constants, but they are used when, for one reason or
> another (as shown by the above examples), the initialization expression
> needs to see things that are not visible at the point of the (original)
> declaration of the constant.  Requiring the user to write a function in
> such cases is the sort of annoyance that turns people away from a
> language.
>
> Furthermore, if the function is not inlined, then it is inefficient.  If
> it IS inlined, then extra compilation dependences are added, possibly
> causing severe recompilation costs.  And most implementations of
> inlining are still more costly than a reference to a constant.

There is yet another reason that functions are not adequate replacements for deferred constants: a deferred constant can be referenced anywhere in the private part after the full declaration, and in a client of the package even before the package body is elaborated.  A function used in this manner raises Program_Error (access before elaboration).

-------------

!topic Access discriminants during returns of limited values
!from Bevin Brett 1992-06-03 16:05:38 <>
!reference MS-7.4.5();4.6
!keywords Limited, Access
!discussion

I assume the restriction of TYPE_NAME'ACCESS to tagged extensions is to avoid fun during pass-by-copy.  Nevertheless, the trouble has not been completely suppressed...

[P.S. It is unclear why the restriction is to tagged EXTENSIONS as opposed to any tagged types.  The following example shows a silly way around this restriction.]

   package PKG is
      type T_DONT_USE_ME is tagged limited record
         null;
      end record;
      type T;
      type SECOND(D : access T) is new System.Controlled with record
         ...
      end record;
      type T is new T_DONT_USE_ME with record
         C : SECOND(T'access);
      end record;
      function F return T;
   end;

   package body PKG is
      function F return T is
      begin
         loop
            declare
               Ts : array(1..1000000) of T;
            begin
               if Random then
                  return Ts(Random);
               end if;
            end;
         end loop;
      end;
   end;

It is going to be VERY hard to find all the access discriminants inside Ts(Random) and make them point at the appropriate new copy of T that has been moved outside F's lifetime into the place where its lifetime is long enough...

-------------

!topic Finalization and Limited Function Results
!from Stef Van Vlierberghe 1992-10-15 23:02:20 <>
!reference MS-7.4.5();4.6
!reference LSN-1033
!reference LSN-1043
!discussion

> Proposal A is far too complex, and introduces too many semantic
> anomalies.  (What, for example, happens if a discriminant is of a
> finalizable type?  Discriminants are not passed by reference,
> surely.)

This is not fair.  We are supposed to accept the restriction that finalization can be used only for derivatives of a specific limited tagged type, while the alternatives are implicitly supposed to support any type, including scalar types (which *even* C++ doesn't allow you to fiddle with).  Proposal A is user-defined assignment and finalization, definitely restricted to non-scalar types.

This is again supporting the (IMHO false) assumption that the MS;2.0 proposal cannot be tailored down to some reasonable complexity size.  Being pragmatic, I wouldn't really mind if it were restricted to non-tagged private types completed with non-limited constrained record types -- "non-limited" meaning that user-defined assignment isn't a tool to remove the property of limitedness.

> Some inherently limited objects cannot be moved.  An object cannot be
> moved if it might contain references to itself, or to other objects
> in the same declarative region.  It is a bounded error to move such
> objects.
User-defined assignment could do the copy correctly, permit popping the stack on function return, and make this non-movable inherently limited object (poor sod) a true (movable, non-limited) first-class citizen.

> We should make it as painless as possible to modify one's program,
> for example, to add finalization, or to add some other property
> that necessitates inherent limitedness.  Requiring the programmer
> to change all function calls into procedure calls would be an
> onerous burden.

I consider this a good argument against coupling finalization with limitedness and taggedness, which is an onerous burden too.  But as you said, non-limited finalization implies, yes...

> Thus, we believe that inherently limited objects should be
> first-class citizens -- in particular, they should be allowed as
> function result types.  We believe we have achieved a reasonably
> simple and implementable semantics for them, as described above.

When facing second-class citizens (such as unconstrained types, class-wide types, and inherently limited types), the application programmer can always turn them into first-class citizens using some kind of reference, typically an access type.  Currently this is a false argument: the reference must get user-defined finalization, and therefore it becomes limited and tagged, and is no longer first-class itself.

Alternatively, the reference could get user-defined assignment and finalization, not become limited or tagged, and be first-class.  In that case most programmers would stop asking for varying strings, class-wide objects that can change class, garbage collection, and the magical function return of unmovable objects.

Sorry to be so persistent about it, but each time I see some unsafe design born out of fear of limitedness, I hope things could be different.
-------------

!topic LSN on Limited Function Results in Ada 9X
!from Bob Duff 1992-10-10 16:23:19 <>
!reference MS-7.4.5();4.6
!reference LSN-1033
!discussion

This Language Study Note discusses the issues of parameter passing and function return for limited types.

Limited types are very important in Ada 9X -- we expect them to be more commonly used than in Ada 83.  Here are some examples of uses of limited types, several of which are new to Ada 9X:

  - task types and protected types

  - finalization (only allowed for limited types)

  - access discriminants, which allow one object to contain a
    (constant) reference to another object in the same scope.
    Self-reference is also possible.  (Note that normal components do
    not work, because they are not constant, and therefore the
    accessibility rules need to be stricter.)

  - multiple inheritance (see LSN-1033) -- this is a particular
    sub-case of access discriminants.

  - any other abstraction that won't work properly if clients are
    allowed to make copies of objects.

A limited type generally prevents copying.  However, copying is not always prevented for parameter passing and function return.  This leads to the question: What does it mean to pass a limited value as a parameter, or to return a limited value from a function?  And the related question: What is the meaning of a limited value anyway?

Ada 83 has limited types: task types, the types FILE_TYPE in the I/O packages, and user-defined limited private types, as well as types composed from limited types.  In Ada 83, returning a task outside its master has been ruled erroneous by the ARG, although RM83 doesn't say that.  Returning a task type, but not outside its master, is required to work properly, which pretty much requires a by-reference implementation.  In most implementations, a task value is represented as the address of its TCB, or something similar, so it is in fact by reference.
FILE_TYPE must also be implemented as a pointer of some sort, so that returning a file from a function works properly.  User-defined limited types are returned from functions by copy, thus breaking the limitedness property of the abstraction.  The programmer will in practice achieve by-reference semantics by defining the limited type as an access type.

In Ada 9X, there are several more cases, as outlined above.  We need to make sure that parameter passing and function return are by-reference.  (For example, making a copy of a protected object is a disaster, in general!)  However, in Ada 83, a limited private type whose completion is an elementary type is required to be passed by copy.  Also in Ada 83, function return is always by copy.  For upward compatibility, we are planning to keep it that way for the existing kinds of types.

In Ada 9X, some limited types are "inherently limited."  (This concept was called "inherently aliased" in MS;4.6.)  Some inherently limited types "cannot be moved".  The inherently limited types are always limited -- there is no full_type_declaration that might cause them to become non-limited later.  By-reference parameter passing and function return are always used for these types.  These are the limited types that are inherently limited:

  - a record type with the reserved word limited in its definition;

  - a limited tagged type;

  - any type that has an access discriminant;

  - task and protected types;

  - any type derived from an inherently limited type, or with
    inherently limited components.

Note that the above does not introduce any upward incompatibility, because the only thing in the above list that existed at all in Ada 83 is the task type, and task types have always used reference semantics.  Other limited types retain the parameter passing and function return rules of Ada 83.

Passing inherently limited values by reference is no big deal.  However, what does it mean to return a function value by reference?
First of all, for inherently limited types, the value of an object is inextricably linked to the object itself -- the whole point of limitedness is that you can't copy the value of one object into another object.  Thus, in Ada 9X, one can think of the value of an inherently limited object as being pretty much the same thing as the object itself.  If X and Y are two distinct inherently limited objects, then they can't both have the same value.

If the result of a function is a value/object declared outside the function, then return by reference is no problem: a pointer to the outer object is returned.  If, on the other hand, the function is returning a local object, then the implementation will generally want to MOVE the object (e.g. to a more global place on the stack), and then return a pointer to that object.

Some inherently limited objects cannot be moved.  An object cannot be moved if it might contain references to itself, or to other objects in the same declarative region.  It is a bounded error to move such objects.  Why?  If an object is self-referential, then moving it to a different spot in memory will make the pointer-to-self wrong.  (We do not want to require relocatable pointers of some sort.)  If an object contains references to other objects in the same declarative region, then returning it outside that declarative region would make those pointers point to garbage.

For the bounded error, either PROGRAM_ERROR is raised, or a value is returned that is associated with a temporary object of the same type, but one that is accessible to the caller.  This corresponds to two implementation strategies we wish to allow: detect the error and raise an exception, or else don't chop the stack back on return from this sort of function, leaving the unmovable objects in place, so that all self- and within-same-scope pointers still work.  Making it a bounded error does, of course, introduce some non-uniformity across implementations.
However, this is a case that was already ruled erroneous in Ada 83, so we don't think we're doing too much harm.

If a function returns a local object that has finalization (and therefore is inherently limited), the finalization is deferred until later.  We have not decided exactly how much later, and we don't think the answer is terribly important.  Our current thinking is that it should happen no later than the end of the statement that did the function call, but allow implementations to do it earlier if the object is no longer in use.  (Obviously, it is still in use if it has been passed (by reference!) to a subprogram.)  We find it hard to believe that a program would want to depend on the exact point at which the finalization occurs, so long as it doesn't happen too early, and so long as every (initialized) object is finalized exactly once.  In any case, we will document our final decision in the ILS.

The implementation of the above rules is not difficult.  Each return statement can detect the scope level of the returned object, and move it or not accordingly.  For finalizable objects, one simple implementation strategy is to snip the object out of the per-task finalization list, and reinsert it at a more global point in the list.

It seems somewhat undesirable to have three different categories of limited types.  However, we believe they are all needed, given that functions can return limited types:

  - We can't get rid of limited types that are not inherently limited
    (i.e. make all limited types behave as described above for
    inherently limited types), because it would be upward inconsistent.

  - We can't get rid of inherently limited types, because we need some
    category of types that are always passed and returned by reference.

  - We can't get rid of "cannot be moved" types, because references
    and tasks are so important.  In any case, getting rid of tasks
    would be rather upwardly incompatible!
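The three categories described above can be sketched in declarations like the following; all names here are illustrative assumptions, not taken from the MS:

```ada
package Category_Sketch is
   --  Not inherently limited: the completion may be an elementary
   --  type, so the Ada 83 by-copy rules are retained for it.
   type Key is limited private;

   --  Inherently limited: the reserved word "limited" appears in the
   --  record definition itself, so parameter passing and function
   --  return are always by reference.
   type Lock is limited record
      Count : Natural := 0;
   end record;

   --  "Cannot be moved": the access discriminant may designate an
   --  object in the same declarative region (or the object itself),
   --  so moving such an object on function return is a bounded error.
   type Node (Enclosing : access Lock) is limited record
      Id : Natural := 0;
   end record;
private
   type Key is new Integer;   --  elementary completion
end Category_Sketch;
```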
An alternative strategy would be to disallow functions that return inherently limited types.  This would have some disadvantages:

  - It would not be upward compatible in the case of task types.

  - Since limited types are expected to be so common in Ada 9X, it
    would be a severe restriction to forbid functional notation for
    them.  Consider, for example, a Set abstraction, which is
    inherently limited because it needs finalization.  One very much
    wants to write this:

        return Union(Intersect(A, B), Intersect(C, D));

    (or the equivalent using operator symbols), rather than this:

        Temp1 := A;
        Intersect(Temp1, B);
        Temp2 := C;
        Intersect(Temp2, D);
        Union(Temp1, Temp2);
        Result := Temp1;

    After all, it has been the job of compilers to allocate temporary
    variables for some decades now; it would be a giant step backward
    to hand that job back to the user.  This alone would be enough to
    make many turn to a different language.

    Note that in the above example with function calls, the programmer
    is unlikely to care when finalization happens for the results of
    the inner Intersect calls, so long as the compiler-generated temps
    get cleaned up before too long, and certainly not while they're
    still in use.

  - We should make it as painless as possible to modify one's program,
    for example, to add finalization, or to add some other property
    that necessitates inherent limitedness.  Requiring the programmer
    to change all function calls into procedure calls would be an
    onerous burden.

Thus, we believe that inherently limited objects should be first-class citizens -- in particular, they should be allowed as function result types.  We believe we have achieved a reasonably simple and implementable semantics for them, as described above.
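A spec for the Set abstraction mentioned above might look roughly like this; the operation names follow the text, everything else is an assumption:

```ada
package Sets is
   --  Inherently limited because values need finalization; in the
   --  draft model that property would come from deriving the full
   --  type from the designated controlled root type (elided here).
   type Set is limited private;

   function Empty return Set;
   function Intersect (Left, Right : Set) return Set;
   function Union     (Left, Right : Set) return Set;
private
   type Set is limited record
      null;   --  representation elided
   end record;
end Sets;
```

With such a spec, a client can compose calls functionally, as in the LSN's Union(Intersect(A, B), Intersect(C, D)) example, leaving the temporaries to the compiler.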
-------------

!topic LSN on Limited Function Results in Ada 9X
!from Ted Baker 1992-10-12 07:24:10 <>
!reference MS-7.4.5();4.6
!reference LSN-1033
!reference LSN-1043
!discussion

Bob Duff says:

| Thus, we believe that inherently limited objects should be
| first-class citizens -- in particular, they should be allowed as
| function result types.  We believe we have achieved a reasonably
| simple and implementable semantics for them, as described above.

I strongly disagree with this conclusion, and would like to rebut the arguments given in the LSN.

| An alternative strategy would be to disallow functions that return
| inherently limited types.  This would have some disadvantages:
|
| - It would not be upward compatible in the case of task types.

No essential need for functions that return task types has yet been demonstrated.  In fact, the troublesome case of functions that return dependent tasks is known to be an ugly wart, all the more ugly because there is no need to return a task from a function in the first place.  We already have access-to-task types, and in 9X we will have a task-class or task-id type for less specifically typed references to tasks.  In 9X, we also have much easier ways of getting access values, with 'ACCESS and 'UNCHECKED_ACCESS.

| - Since limited types are expected to be so common in Ada 9X,
|   it would be a severe restriction to forbid functional notation
|   for them. ....
|
|   After all, it has been the job of compilers to allocate
|   temporary variables for some decades now; it would be a giant
|   step backward to hand that job back to the user.  This alone
|   would be enough to make many turn to a different language.

This is inconsistent with the argument used by the MRT against support of type extension models that require variable-size objects (also parameters and function return values) (reference: 92-1532.a, Tucker Taft, 92-9-24).  A quotation from STT:

| In general, we believe that if you define one class-wide pointer
| type (e.g. type Window_Ptr is access all Window'CLASS), and use
| that for your stacks, etc., then you can get essentially all of
| the simplicity you long for, with the only requirement that you
| have to use 'ACCESS now and then when you have an object, and you
| want a pointer to it, or ".all" to go the other way.  If most
| objects are created by allocators, then 'ACCESS should rarely if
| ever be needed.

The argument given then was that it is better for users to have to explicitly deal with access types, than for the compiler to automatically generate an extra level of indirection and manage the deferred storage recovery problems.  It seems to me that the same argument should apply to functions returning inherently limited types.  If requiring the user to deal with explicit indirection is good (or at least not too bad) for variable-sized objects, then what is wrong with it here?

| - We should make it as painless as possible to modify one's program,
|   for example, to add finalization, or to add some other property
|   that necessitates inherent limitedness.  Requiring the programmer
|   to change all function calls into procedure calls would be an
|   onerous burden.

This is true, but seems like a weak argument, taken in context.  First, the problem is avoided if functions are written initially using access types, or procedures are used.  Second, this is not the only place we are making it hard to modify a program.  What if a user wants to add "protectedness"?  Or extend a type that is already protected?  Or add finalization to a type that is already defined by derivation from some type without finalization?  Alternatively, maybe we should reconsider the subjects of variable-sized objects, extension of protected types, and methods of providing finalization for types defined via extension of other types.
--Ted Baker

-------------

!topic LSN on Limited Function Results in Ada 9X
!from Tucker Taft 1992-10-13 15:13:10 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference LSN-1033
!reference LSN-1043
!discussion

> . . .
> | - Since limited types are expected to be so common in Ada 9X,
> |   it would be a severe restriction to forbid functional notation
> |   for them. ....
> |
> |   After all, it has been the job of compilers to allocate
> |   temporary variables for some decades now; it would be a giant
> |   step backward to hand that job back to the user.  This alone
> |   would be enough to make many turn to a different language.
>
> This is inconsistent with the argument used by the MRT against
> support of type extension models that require variable-size
> objects (also parameters and function return values) (reference:
> 92-1532.a Tucker Taft 92-9-24).

Allocating temporary variables is not the same thing as supporting deallocation and reallocation of variable-sized values as part of an assignment.  Temporary variables can be handled with a stack-based or a mark/release approach.  Deallocation and reallocation require explicit use of the heap.

We consistently believe that the compiler should handle temporaries, even of compile-time-unknown size, presuming that a stack-based or mark/release strategy is sufficient, but that it should not be required to support operations that inevitably require implicit use of the heap.  It is true that some compilers make implicit use of the heap for some or all compile-time-unknown-size temporaries, but this is not necessary, as has been known since the ALGOL 60 days.  However, there are also compilers that currently make no implicit use of the heap.  We believe that this is an important (and appropriate) way to implement Ada, and we believe it is important to preserve this possibility.

> . . .
> The argument given then was that it is better for users to have to
> explicitly deal with access types, than for the compiler to
> automatically generate an extra level of indirection and manage
> the deferred storage recovery problems.  It seems to me that the
> same argument should apply with functions returning inherently
> limited types.

We believe these are fundamentally different situations, as explained above.  Particularly when finalization is added into the situation: no automatic finalization would happen with access types, whereas automatic finalization of function return values is extremely useful, and critical to certain programming paradigms to avoid storage leakage.

Admittedly there is a slippery slope when it comes to having the compiler "do" things for the user, but we believe Ada 83 staked out a consistent point on that slope, namely one that did not require implicit use of the heap, and we support that position and have tried to use it as a dividing line to distinguish possible features for Ada 9X.

> --Ted Baker

-Tuck

-------------

!topic LSN on Limited Function Results in Ada 9X
!from Robert I. Eachus 1992-10-13 16:53:43 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference LSN-1033
!reference LSN-1043
!discussion

Maybe a better way to resolve the incompatibility is to recognize that in all current Ada 83 implementations there is no particular reason for task objects to be limited objects.  (As far as I know, all implementations actually use pointers to task control blocks to implement task objects.)  If task objects in Ada 9X were not limited, there would be no real incompatibility, and there would no longer be any inherently limited objects in Ada 83.

Robert I. Eachus

with Standard_Disclaimer;
use  Standard_Disclaimer;
function Message (Text: in Clever_Ideas) return Better_Ideas is...
-------------

!topic LSN on Limited Function Results in Ada 9X
!from Brian Dobbing 1992-10-14 20:22:32 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference LSN-1033
!reference LSN-1043
!discussion

I want to back up Ted's mail saying that returning local limited objects from functions is really not something which is THAT important that we've got to expend lots of effort to support it.  In my view, such support is out of scope in a ZBB context, and it should be left as erroneous, as ruled by the ARG for Ada 83 local tasks.

> Some inherently limited objects cannot be moved.

> For the bounded error, either PROGRAM_ERROR is raised, or a value is
> returned that is associated with a temporary object of the same type,
> but that is accessible to the caller.  This corresponds to two
> implementation strategies we wish to allow: detect the error and
> raise an exception, or else don't chop the stack back on return from
> this sort of function, leaving the unmovable objects in place

This raises the issue of needing to detect movable and non-movable objects at compile time, an unnecessary burden.

> If a function returns a local object that has finalization (and
> therefore is inherently limited), the finalization is deferred until
> later.

> The implementation of the above rules is not difficult.  Each return
> statement can detect the scope level of the returned object, and
> move it or not accordingly.  For finalizable objects, one simple
> implementation strategy is to snip the object out of the per-task
> finalization list, and reinsert it at a more global point in the list.

Although Bob thinks this is not difficult, it does imply determining the "correct" lifetime of the function result, which may be passed through many layers of function call.
For Ada 83 local tasks, one could safely and pessimistically choose to keep the TCB address valid until the scope corresponding to the task type declaration (knowing that this case would never be coded outside the ACVC anyway), but for 9X local limited objects in general this would of course be unacceptable.

Also, in Ada 83, for functions returning unconstrained types, this fact was known by the caller, and so the correct scope id could be passed in as a parameter.  But for local limited returns, the caller doesn't know whether the result is going to be local or not, and we certainly don't want to take a pessimistic view here and ALWAYS pass an extra parameter which will never be used in practice.

Also, "snipping the object out of the per-task finalization list" sounds to me like it implies doubly-linked lists, which might not be needed otherwise.  I don't want to introduce double-linking just for this pathological case.

> An alternative strategy would be to disallow functions that return
> inherently limited types.  This would have some disadvantages:

Clearly this is too severe.  All we are saying is to disallow functions returning LOCAL inherently limited values (i.e. by making it erroneous).  The "workaround" of using access types and heap storage, and returning access values, seems perfectly reasonable to me.

-- Brian

-------------

!topic LSN on Limited Function Results in Ada 9X
!from Norman Cohen 1992-10-15 19:11:54 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference LSN-1033
!reference LSN-1043
!discussion

Brian Dobbing writes:

> I want to back up Ted's mail saying that returning local limited
> objects from functions is really not something which is THAT important
> that we've got to expend lots of effort to support it.  In my
> view, such support is out of scope in a ZBB context and it should be
> left as erroneous as ruled by the ARG for Ada83 local tasks.

This would, of course, be a flagrant upward incompatibility.
Here is one paradigm in which Ada 83 programs may make heavy use of the ability to return a local limited object:

   package VStrings is
      type VString (Max_Size: Natural) is limited private;
      ...
      function "&" (Left, Right: VString) return VString;
      ...
   private
      type Vstring (Max_Size: Natural) is
         record
            Current_Length: Natural;
            Data: String (1 .. Max_Size);
         end record;
   end VStrings;

   package body VStrings is
      ...
      function "&" (Left, Right: VString) return VString is
         Result: Vstring (Left.Current_Length + Right.Current_Length);
      begin
         Result.Current_Length := Result.Max_Size;
         Result.Data := Left.Data(1 .. Left.Current_Length) &
                        Right.Data(1 .. Right.Current_Length);
         return Result;
      end "&";
      ...
   end Vstrings;

Strictly speaking, Result is not limited at the point of the return, but it is limited at the point of the function call, so all the usual problems arise.

Here is another example, to control access rights to a file within different parts of a large program:

   type Restricted_File is
      record
         File_Part : File_Type;
         Restrictions_Part : Restrictions_Type;
      end record;
   ...
   function New_Restricted_File
     (Name: String; Restrictions: Restrictions_Type) return Restricted_File is
      Result : Restricted_File;
   begin
      Open (Result.File_Part, In_File, Name);
      Result.Restrictions_Part := Restrictions;
      return Result;  -- a local limited object
   end New_Restricted_File;

Here, even if Restricted_File were a limited private type, it would be limited within the scope of the full type declaration because it has a limited component.

-------------

!topic LSN on Limited Function Results in Ada 9X
!from Ted Baker 1992-10-26 12:36:23 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference 92-1626.a
!reference LSN-1033
!reference LSN-1043
!discussion

Norm Cohen cites two examples of Ada 83 paradigms that make heavy use of the ability to return a local limited object.  One is a heterogeneous-length string type, and another is a restricted-access file type.  I have some questions on whether limited types are crucial to these paradigms, or just artifacts introduced in the examples.
| package VStrings is
|    type VString (Max_Size: Natural) is limited private;
|    ...
|    function "&" (Left, Right: VString) return VString;
|    ...
| private
|    type VString (Max_Size: Natural) is
|       record
|          Current_Length: Natural;
|          Data: String (1 .. Max_Size);
|       end record;
| end VStrings;

What is gained by making VString be limited private in this package? I.e., why should it not be simply private, or fully exposed?

| type Restricted_File is
|    record
|       File_Part : File_Type;
|       Restrictions_Part : Restrictions_Type;
|    end record;
| ...
| function New_Restricted_File
|   (Name: String; Restrictions: Restrictions_Type) return Restricted_File is
|    Result : Restricted_File;
| begin
|    Open (Result.File_Part, In_File, Name);
|    Result.Restrictions_Part := Restrictions;
|    return Result;  -- a local limited object
| end New_Restricted_File;

What can one do with the value returned by New_Restricted_File? It does not seem useful.

--Ted

-------------
!topic LSN on Limited Function Results in Ada 9X
!from Ted Baker 1992-10-26 13:31:02 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference 92-1626.a
!reference 92-1683.a
!reference LSN-1033
!reference LSN-1043
!discussion

> Norm Cohen cites two examples of Ada 83 paradigms that make heavy
> use of the ability to return a local limited object.  One is a
> heterogeneous-length string type, and another is a
> restricted-access file type.  I have some questions on whether
> limited types are crucial to these paradigms, or just artifacts
> introduced in the examples.
>
> | package VStrings is
> |    type VString (Max_Size: Natural) is limited private;
> |    ...
> |    function "&" (Left, Right: VString) return VString;
> |    ...
> | private
> |    type VString (Max_Size: Natural) is
> |       record
> |          Current_Length: Natural;
> |          Data: String (1 .. Max_Size);
> |       end record;
> | end VStrings;
>
> What is gained by making VString be limited private in this
> package?  I.e., why should it not be simply private, or fully
> exposed?
It should be made limited private for the usual reasons:

1. Predefined "=" doesn't work as desired: It compares Max_Size components, so that VString objects with identical contents but different maximum sizes never compare equal. Furthermore, it compares Left.Data(1 .. Left.Max_Size) with Right.Data(1 .. Right.Max_Size) instead of comparing Left.Data(1 .. Left.Current_Length) with Right.Data(1 .. Right.Current_Length). (That is, garbage bytes are taken into account, causing representations of the same abstract VString value to compare unequal.)

2. Predefined assignment doesn't work as desired: If you try to assign a VString value with one maximum size to a container with a different maximum size, the violation of the discriminant constraint raises Constraint_Error, even if the abstract value being assigned would fit in the target container. Even when the source and target have the same maximum size, predefined assignment unnecessarily copies garbage bytes along with the meaningful ones.

> | type Restricted_File is
> |    record
> |       File_Part : File_Type;
> |       Restrictions_Part : Restrictions_Type;
> |    end record;
> | ...
> | function New_Restricted_File
> |   (Name: String; Restrictions: Restrictions_Type) return Restricted_File is
> |    Result : Restricted_File;
> | begin
> |    Open (Result.File_Part, In_File, Name);
> |    Result.Restrictions_Part := Restrictions;
> |    return Result;  -- a local limited object
> | end New_Restricted_File;
>
> What can one do with the value returned by New_Restricted_File?
> It does not seem useful.

Presumably there would be other operations, e.g. Put, Get, Set_Input, Set_Output, etc., acting like their Text_IO analogs, but on type Restricted_File instead of File_Type. Thus a plausible use would be:

   Set_Output (New_Restricted_File ("BAKER.REPLY"), ASCII_Only);

This example was meant as a prototype for any limited type whose values have certain properties stored in record components.
It is not unusual to have a constructor function taking such properties as arguments and returning a value of the limited type having those properties. A reasonable way to implement such a function is to declare a local record object, set its components appropriately, and return the local object.

Remember, the issue is not whether there is a way to write programs without returning local limited objects. The issue is whether reasonable Ada 83 programmers may have done so. It is clear that there are many reasonable occasions to return local limited objects in Ada 83. To make this usage illegal in Ada 9X would be an unacceptable incompatibility.

Norman

-------------
!topic LSN on Limited Function Results in Ada 9X
!from Ted Baker 1992-10-27 09:49:04 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference 92-1626.a
!reference 92-1683.a
!reference 92-1684.a
!reference LSN-1033
!reference LSN-1043
!discussion

Thanks for the clarification of intended usage. It seems these two examples are a bit different semantically. In the case of the VStrings package you would be satisfied if the function returned a copy of the local object. That is, it would be OK to "finalize" and deallocate the local object so long as a copy of it can be returned. There is no reason to retain the storage originally occupied by the object. With the Restricted_File example, it seems you would not want to allow returning a copy of the limited object, and releasing its storage before the return, if there could still be dangling references to it from OS data structures. Presumably, finalization of such an object may be needed, to close the connection to the OS. I still don't see how this programming paradigm is useful, since it seems that without assignment you are forced to throw away each file as soon as you perform one operation on it.
You give an example:

| Set_Output (New_Restricted_File("BAKER.REPLY"), ASCII_Only);

In this case, the new restricted file would go away immediately after the operation. What am I missing?

| This example was meant as a prototype for any limited type whose values
| have certain properties stored in record components.  It is not unusual
| to have a constructor function taking such properties as arguments
| and returning a value of the limited type having those properties.
| A reasonable way to implement such a function is to declare a local
| record object, set its components appropriately, and return the local
| object.

I cannot see this with a function, since you have no way of preserving the value. (It would make sense for a procedure with an out-parameter, but that's not what we are talking about.)

--Ted

-------------
!topic LSN on Limited Function Results in Ada 9X
!from Norman Cohen 1992-10-27 12:14:52 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference 92-1626.a
!reference 92-1683.a
!reference 92-1684.a
!reference 92-1702.a
!reference LSN-1033
!reference LSN-1043
!discussion

> I still don't see how this programming paradigm is useful, since
> it seems that without assignment you are forced to throw away each
> file as soon as you perform one operation on it.
>
> You give an example:
>
> | Set_Output (New_Restricted_File("BAKER.REPLY"), ASCII_Only);
>
> In this case, the new restricted file would go away immediately
> after the operation.  What am I missing?

This is analogous to Text_IO facilities that exist in Ada 83 and have been found to be useful. Among these facilities are:

   * A limited private type File_Type.
   * Functions like Standard_Output that return File_Type values.
   * A procedure Set_Output that takes a File_Type parameter and makes
     that file the current output file.
   * Versions of Put, etc., that do not take a File_Type parameter but
     operate instead on the current input or output file.
The effect of a call on Text_IO.Set_Output persists after the end of the call. Perhaps the body of Text_IO has a hidden variable in which it stores the File_Type value passed to it, so it knows which file to use when Put is called without a File_Type parameter. The body of Set_Output can assign the File_Type parameter to this variable if the full declaration of File_Type is such that the type is not limited within the Text_IO body.

My example was of a limited private type Restricted_File with the following full declaration:

   type Restricted_File is
      record
         File_Part : File_Type;
         Restrictions_Part : Restrictions_Type;
      end record;

The version of Set_Output that works on restricted files could be implemented within the corresponding package body as follows:

   Default_Output_Restrictions: Restrictions_Type;
      -- variable declared in package body

   procedure Set_Output (RF: in Restricted_File) is
   begin
      Text_IO.Set_Output (RF.File_Part);
      Default_Output_Restrictions := RF.Restrictions_Part;
   end Set_Output;

Then the Restricted_File version of Put that does not take a Restricted_File parameter looks at the variable Default_Output_Restrictions to determine which restrictions to check, and calls the corresponding version of Text_IO.Put that acts upon the current default output file.

> | This example was meant as a prototype for any limited type whose values
> | have certain properties stored in record components.  It is not unusual
> | to have a constructor function taking such properties as arguments
> | and returning a value of the limited type having those properties.
> | A reasonable way to implement such a function is to declare a local
> | record object, set its components appropriately, and return the local
> | object.
>
> I cannot see this with a function, since you have no way of
> preserving the value.  (It would make sense for a procedure with
> an out-parameter, but that's not what we are talking about.)
The point is that limited function results can be passed as parameters to procedures, including procedures declared in the same package as the limited private type. These procedures may preserve the value of the limited parameter, either as part of the state of the package, as part of the collective state of a number of packages (as in the Restricted_File example), or in an out or in out parameter. The most common of such procedures is a user-defined copy procedure, typically named Assign or Copy or Set. Within the body of the package, such procedures may well be implemented with assignment statements; or, if the limited private type is implemented as a record type with limited components, by invoking user-defined copy procedures for the component types.

Limited function results can also be passed as parameters to functions, as in:

   Put_Line ("The current output file is " & Name (Current_Output));

which calls the following functions declared in Text_IO:

   function Name (File: File_Type) return String;
   function Current_Output return File_Type;

Norman

-------------
!topic LSN on Limited Function Results in Ada 9X
!from Ted Baker 1992-10-28 07:20:49 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference 92-1626.a
!reference 92-1683.a
!reference 92-1684.a
!reference 92-1702.a
!reference 92-1704.a
!reference LSN-1033
!reference LSN-1043
!discussion

| This is analogous to Text_IO facilities that exist in Ada 83 and
| have been found to be useful....  ....The effect of a call on
| Text_IO.Set_Output persists after the end of the call.  Perhaps
| the body of Text_IO has a hidden variable in which it stores the
| File_Type value passed to it, so it knows which file to use when
| Put is called without a File_Type parameter....

There is some miscommunication here. I thought we were discussing the need for returning limited values outside the scope of their original declaration?
The usage of Text_IO does not have this property. The only reasons I can see for File_Type ever being limited in Ada 83 are related to possible errors caused by copying an object of File_Type, based on the assumption that it is actually some kind of file control block. In this case:

(1) The File_Type object might include storage for information, such as I/O buffer contents, current file position, and current access mode.

(2) The File_Type object might include references to OS objects (e.g. open file descriptors) which need to be finalized when the file state changes (e.g. is closed).

(3) The operating system may contain references to the File_Type object, which require finalization when the object disappears.

The provision of functions that return a value of File_Type for the standard input and output files in Ada 83 is inconsistent with this rationale for limitedness, since it forces the File_Type value returned to be a reference to a file control block rather than the data structure itself. Given this, the only remaining reason for making it limited is to prevent passing such a value out of scope, creating a dangling reference problem. If we are going to allow (expect) people to do exactly this, then there was no good reason for not providing meaningful predefined assignment and equality operations on file types in the first place. (It seems the situation is analogous to that with POSIX open file descriptors.)

-------------
!topic LSN on Limited Function Results in Ada 9X
!from Norman Cohen 1992-10-28 11:33:33 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1621.a
!reference 92-1626.a
!reference 92-1683.a
!reference 92-1684.a
!reference 92-1702.a
!reference 92-1704.a
!reference 92-1706.a
!reference LSN-1033
!reference LSN-1043
!discussion

> There is some miscommunication here.  I thought we were discussing
> the need for returning limited values outside the scope of their
> original declaration?
> Your original example had this property, which I questioned.  The
> usage of Text_IO does not have this property.

Certainly we began by discussing returning limited values outside the scope of their original declaration, but I gave some examples of functions that do this, and you seemed to question why it would ever be useful to return a limited value, since that value immediately disappears. I pointed out that function calls with limited results can appear as actual parameters in calls on other subprograms, and that these other subprograms may have ways of preserving the values passed to them.

My position can be summarized as follows: It is sometimes useful to return the value of a local limited object, and reasonable Ada 83 programmers may have done so. Therefore, we must not make this illegal in Ada 9X.

Another (schematic) example:

   package Alpha_Package is
      type Alpha is limited private;
      procedure Copy (From: in Alpha; To: out Alpha);
      ...
   private
      ...
   end Alpha_Package;

   package Beta_Package is
      type Beta is limited private;
      procedure Copy (From: in Beta; To: out Beta);
      ...
   private
      ...
   end Beta_Package;

   -- Given Alpha and Beta, we are to implement Alpha_Beta:

   with Alpha_Package, Beta_Package;
   use Alpha_Package, Beta_Package;
   package Alpha_Beta_Package is
      type Alpha_Beta is limited private;
      function New_Alpha_Beta
        (Alpha_Value: Alpha; Beta_Value: Beta) return Alpha_Beta;
      ...
   private
      type Alpha_Beta is
         record
            Alpha_Part : Alpha;
            Beta_Part  : Beta;
         end record;
   end Alpha_Beta_Package;

   package body Alpha_Beta_Package is
      function New_Alpha_Beta
        (Alpha_Value: Alpha; Beta_Value: Beta) return Alpha_Beta is
         Result : Alpha_Beta;
      begin
         Alpha_Package.Copy (From => Alpha_Value, To => Result.Alpha_Part);
         Beta_Package.Copy  (From => Beta_Value,  To => Result.Beta_Part);
         return Result;  -- local limited object
      end New_Alpha_Beta;
      ...
   end Alpha_Beta_Package;

-------------
!topic LSN on Limited Function Results in Ada 9X
!from 1992-10-29 09:25:32 <>
!reference MS-7.4.5();4.6
!reference 92-1594.a
!reference 92-1609.a
!reference 92-1621.a
!reference 92-1626.a
!reference 92-1683.a
!reference 92-1684.a
!reference 92-1702.a
!reference 92-1704.a
!reference 92-1706.a
!reference 92-1708.a
!reference LSN-1033
!reference LSN-1043
!discussion

I agree entirely with Norman's position in (1708), and I have been personally involved in a project that contains lots of cases similar to Alpha_Beta. With this experience in mind, I would like to respond to 92-1706.a and 92-1609.a above:

> Given this, the only remaining reason for making Text_Io.File_Type
> limited is to prevent passing such a value out of scope, creating a
> dangling reference problem.

File_Type is indeed most probably a reference to a file control block, and therefore the lifetime of the designated object is independent of the lifetime of the reference. Passing the reference out of scope is not a problem; compare schematically:

   package TEXT_IO is
      type FILE_TYPE is access FCB;

      procedure OPEN ( FILE : in out FILE_TYPE ) is
      begin
         FILE := new FCB;
         -- Using dynamic memory, but an index into a fixed table (POSIX)
         -- is of course just the same case.
      end;

      procedure CLOSE ( FILE : in out FILE_TYPE ) is
         procedure FREE is new UNCHECKED_DEALLOCATION ( FCB, FILE_TYPE );
      begin
         FREE ( FILE );
      end;
   end TEXT_IO;

   declare
      OUTER : TEXT_IO.FILE_TYPE;
      function YOUR_CONCERN return TEXT_IO.FILE_TYPE is
         LOCAL : TEXT_IO.FILE_TYPE;
      begin
         OPEN ( LOCAL );
         return LOCAL;
      end;
   begin
      OUTER := YOUR_CONCERN;
      ...
      CLOSE ( OUTER );
   end;

Clearly this is no concern; the real concern is:

   declare
      FIRST, SECOND : TEXT_IO.FILE_TYPE;
   begin
      OPEN ( FIRST );
      SECOND := FIRST;
      CLOSE ( FIRST );
      TEXT_IO.PUT ( SECOND, ... );  -- Dangling reference access.
   end;

> If we are going to allow (expect) people to do exactly this, then
> there was no good reason for not providing meaningful predefined
> assignment and equality operations on file types in the first place.
> (It seems the situation is analogous to that with POSIX open file
> descriptors.)

That's right, there is no *good* reason, but there is another: the POSIX_IO.FILE_DESCRIPTOR is non-limited, and you can assign descriptors freely and get unexpected behaviour in return, e.g.:

   declare
      FIRST, SECOND, THIRD : POSIX_IO.FILE_DESCRIPTOR;
   begin
      OPEN ( FIRST );
      SECOND := FIRST;
      CLOSE ( FIRST );
      OPEN ( THIRD );
      TEXT_IO.PUT ( SECOND, ... );
      -- This would normally write to THIRD, as it re-used the
      -- FILE_DESCRIPTOR used by FIRST, and freed by CLOSE.
   end;

The "sound" way of sharing a file descriptor is the DUPLICATE function:

   declare
      FIRST, SECOND, THIRD : POSIX_IO.FILE_DESCRIPTOR;
   begin
      OPEN ( FIRST );
      SECOND := DUPLICATE ( FIRST );
      CLOSE ( FIRST );
      OPEN ( THIRD );
      TEXT_IO.PUT ( SECOND, ... );
      -- This is guaranteed to write to the same file FIRST was opened on.
      CLOSE ( SECOND );
      -- Only now does that file get closed.
   end;

This is of course not compatible with good Ada philosophy. If there is a "sound" way and a "tricky" way, and you can illustrate that all use of the "tricky" way can easily be re-written using the "sound" way, then the language should make the "tricky" way illegal (in the absence of "UNCHECKED" stuff). Therefore, a predefined assignment that would keep a reference count and CLOSE only when the count drops to zero would be O.K. If references that go out of scope before being passed to CLOSE did so automatically, that would be even better. Sadly enough, there is only one thing that prevents you (or the implementor of TEXT_IO) from making this happen: ":=" is a bit-move in a nice gift-wrapping, and you can't make it "do" anything else. You know I believe this to be a major gap in the language.
It's interesting to note how the above applies equally well to a comment of Robert Eachus a few weeks ago (92-1609.a):

> Maybe a better way to resolve the incompatibility is to recognize
> that in all current Ada 83 implementations there is no particular
> reason for task objects to be limited objects.  (As far as I know all
> implementations actually use pointers to task control blocks to
> implement task objects.)  If task objects in Ada 9X were not limited,
> there would be no real incompatibility, and there would no longer be
> any inherently limited objects in Ada 83.

Perhaps even more so, as tasks are required to be implemented by references. (Assuming that was meant by LRM 9.1(1): "The value of an object of a task type *designates* a task".) Similarly to the above, one could say tasks are no longer limited, finalize as in Ada 83, but the TCB cannot be re-used until all references to it (created by assignment) are gone (by finalization). Until then, the TCB is there for 'TERMINATED and TASKING_ERROR.

Perhaps one more thing. In your original post, you mentioned:

> First, the problem is avoided if functions are written initially
> using access types,

That in fact raises the exact same concern about dangling references. You cannot seriously consider an access type a "solution" unless you cope with the dangling reference problem, and hence with finalization. But then again, if you want your access type to be (indirectly) finalized, you need to encapsulate it in a record that is either derived from CONTROLLED or contains a mix-in. No matter what choice you take, you are back at square one, having an inherently limited, self-referential local that you have to return. Alternatively, you can deal with the dangling reference problem the classical way: shift the problem upward, and simply make the type limited. In that case you are back to functions returning limited types as well (albeit a less useful but simpler case).

Stef.
------------- !topic LSN on Limited Function Return and User-defined Clone !from Bob Duff $Date: 1992-11-05 13:33:29 <> !reference MS-7.4.5();4.6 !reference LSN-1043 !discussion LSN-1043 proposed a semantics for function return of inherently limited types that some have found unpalatable. For example, some people are uncomfortable with the idea of "deferred finalization", while others are concerned about the implementation of lazy stack cutting. On the other hand, we can't avoid the fact that for some kinds of types (e.g. finalizable types), a bit-wise copy simply won't work. Disallowing function return altogether for these kinds of types would be intolerable. In this LSN, we propose instead a fix-up operation, called Clone, that happens after the bit-wise copy. The user can define Clone so that it does whatever is necessary to perform a correct copy -- allocation of storage, incrementing of reference counts, or whatever. We refer to the bit-wise copy as the "physical copy". The Clone operation is defined to happen whenever a physical copy is performed -- function return, assignment, explicit initialization, etc. Thus, the user-defined assignment that some have been hoping for is part of this proposal. The Clone always happens after the physical copy. This proposal has the advantage of removing one of the three categories of limited types. LSN-1043 proposed "limited", "inherently limited", and "inherently limited uncopyable". In this LSN, however, inherently limited and uncopyable are the same thing. If something is inherently limited, it is a run-time error to attempt to copy it (by returning a local object from a function, for example). We still think we need "limited" and "inherently limited" to be different, for upward compatibility. CONTROLLED TYPES: Change the name of type Controlled to Limited_Controlled. Add a new non-limited type called Controlled. 
Thus, we have:

   type Controlled is tagged private;
   procedure Initialize (Object : in out Controlled);
   procedure Finalize   (Object : in out Controlled) is <>;
   procedure Clone      (Object : in out Controlled) is <>;

   type Limited_Controlled is tagged limited private;
   procedure Initialize (Object : in out Limited_Controlled);
   procedure Finalize   (Object : in out Limited_Controlled) is <>;

A descendant of Controlled or Limited_Controlled is called a "controlled type". We say "limited controlled type" or "non-limited controlled type" when the difference matters.

The basic idea is that after a value is physically copied into an object, the Clone operation is automatically applied to that object. Any subcomponents of the object are also Clone-d. The Clone operations are always performed in bottom-up order -- the components first, then the whole object.

Initialize and Finalize behave as explained elsewhere. In particular, Initialize is done bottom-up (first components, then the whole object). Finalize is done top-down (first the whole object, then the components).

PARAMETER PASSING:

All inherently limited types are passed by reference. All controlled types are passed by reference.

FUNCTION RETURN:

The semantics of function return is:

For an inherently limited type (recall that objects and values go together for these types, so it makes sense to talk about the "result object"):
   If the result object is local, raise Program_Error.
   If the result object is global, return it by reference.

For a type that is not inherently limited:
   Create a new object (the "result temp").
   Physically copy the function result into the result temp.
   Perform the Clone operation on any parts of the result temp that are controlled.

(After this, the function will be left, which will cause the normal finalization of locals, including, perhaps, the object from which the result originated.)

As stated above, returning a local inherently limited object will raise Program_Error.
This check is not hard to perform in most implementations -- if the stack is a contiguous block of memory (the usual implementation), then the code simply tests whether the address of the result is between the old and new stack pointers. For a secondary stack model, it might be necessary to check both stacks. For a model that allocates stacks in chunks, a more complicated check is necessary.

The result temp created in the non-inherently-limited case is finalized at some point after the caller is finished with it -- no later than the end of the declaration or statement containing the function call.

Note that Limited_Controlled is inherently limited; Controlled is not.

ASSIGNMENT STATEMENT:

The semantics of the assignment statement are:
   Evaluate the left-hand-side name and the right-hand-side expression, and perform constraint checks, all as in Ada 83.
   Finalize all controlled parts of the left-hand side.
   Physically copy the right-hand side into the left-hand side.
   Perform the Clone operation on any parts of the left-hand-side object that are controlled.

The implementation is allowed to go through an intermediate temp object, doing Clone-s and Finalize-s as necessary. It must use the intermediate temp object approach if any parts of the left- and right-hand sides overlap, as in "X.all := Y.all;" (where X and Y point to the same object) or "S(1..10) := S(2..11);".

INITIALIZATION:

Suppose we have:

   X: T;            -- default initialization
   Y: T := F(...);  -- explicit initialization

The semantics of the default initialization of X are:
   Create the object, as in Ada 83.
   Perform default initialization, as in Ada 83.
   Perform the Initialize operation on any controlled parts.

The semantics of the explicit initialization of Y are:
   Evaluate the initialization expression, create the object, and perform constraint checks, all as in Ada 83.
   Physically copy the value of the expression into the new object.
   Perform the Clone operation on any parts of the object that are controlled.
OTHER COPY OPERATIONS:

The other operations that involve copying are:

   - array catenation (the "&" operator)
   - default initialization of record components
   - passing of a generic formal parameter of mode 'in', either explicitly or by default
   - an extension aggregate

(Note that aggregate formation is not listed above -- aggregates are illegal for controlled types, since they are extensions of a private root type.)

In each of the above operations, a new object (or group of objects) is being created. Thus, the rules can follow those defined above under INITIALIZATION. For example, for array catenation, each component of the result is Clone-d. For an extension aggregate, the parent part is Clone-d, and each controlled extension component is Clone-d; the extension aggregate as a whole is not Clone-d, since it hasn't been physically copied.

The Clone operations always follow the same pattern -- first make the physical copy, then do Clone. The assignment statement is the only operation involving copying that needs to first Finalize the target object; in all the other cases, the target object is uninitialized.

Note that we can't disallow any of the above operations without causing generic contract model problems, because non-limited controlled types can be passed to a generic formal (non-limited) private type.

ABORT AND EXCEPTIONS:

Abort is deferred during the groups of Finalize/Clone operations mentioned above. Abort is also deferred during each Initialize operation. This helps ensure that the operations do not happen too few or too many times. For example, for the assignment statement, if there are any controlled parts, abort is deferred from the first Finalize through the last Clone. The expression evaluations should happen outside the abort-deferred region.

Similarly, if an exception occurs during a Finalize or Clone, other pending Finalize or Clone operations are done anyway. When they are all done, Program_Error is raised.
This is just like in previous proposals for finalization.

EFFICIENCY:

For efficiency, we give special permission to the implementation to remove groups of Clone, Finalize, and Physical-Copy operations that are in some sense redundant. The exact rules are not shown here. This requires the programmer to take care that these operations still work properly when such groups are removed. For example, if Finalize deallocates memory, and Clone allocates memory, this will be true. Storage_Error might be avoided in certain situations where it would otherwise be raised, but Storage_Error is unpredictable anyway. Similarly, if the Clone and Finalize operations keep reference counts, then (barring overflow) the reference counts will be correct, even if the compiler decides to remove certain groups of operations.

For example, the statement "X.all := Y.all;" doesn't need to do anything if X = Y. Here's another example:

   X, Y: T := ...;

   function F return T is
   begin
      return X;
   end F;

   ...
   Y := F;

The assignment statement "Y := F;" involves several physical copies, clones, and finalizations. However, if function F is inlined, the compiler might notice that most are redundant. It might generate code identical to that of "Y := X;".

FINALIZATION OF TEMPORARY OBJECTS:

The above assumes that temporary objects are created at certain points. These temps are finalized after they are no longer needed, and before the end of the current declaration or statement. However, the temp and its finalization are avoided if the compiler takes advantage of the permission to avoid redundant operations. For example:

   X := (A, Controlled_Object);

For the assignment statement, the normal semantics would be to build the aggregate in a temporary object, Clone the controlled component, finalize the left-hand side, physically copy the aggregate into X, and then Clone X. Alternatively (assuming Controlled_Object is not part of X), the compiler is allowed to finalize X, build the aggregate in X, and Clone the controlled component of X.
INTERACTIONS WITH VOLATILE AND ATOMIC: One issue that has been open for some time is the handling of volatile and atomic parameters. For example, suppose a composite object is declared Volatile (via the pragma, of course). If it is passed as a parameter, is the parameter still volatile? The subprogram doesn't know that the actual is volatile. If the parameter is passed by copy, there's no problem -- just the copy in and copy out operations, which generally happen at the call site, need to know about the volatility. But what if it's passed by reference? Furthermore, for a volatile object, the programmer needs to know when reads and writes are happening -- the maybe-by-copy, maybe-by-reference semantics doesn't work very well for such low-level programming. Given the above proposal, we add the following rules about atomic and volatile: Pragma Atomic is only allowed for an elementary object (either declared by an object_declaration, or by a component_declaration). If the object is a component, the containing type must be inherently limited. Pragma Volatile is allowed for objects as for Atomic. In addition, pragma Volatile is allowed on the declaration of an inherently limited type. (See below for a possible relaxation of this restriction to inherently limited types.) Thus, when passing an atomic or volatile object, the programmer has full knowledge of the parameter passing mechanism. If the parameter is elementary, it is passed by copy; the called subprogram works with a separate object. If the parameter is not elementary, the above rules ensure that it is inherently limited, and thus passed and returned by reference. In that case, the compiler can know that a particular component of the parameter is atomic or volatile, or that the parameter as a whole is volatile. Task and protected types are always volatile -- no need for pragma Volatile on them (although it is presumably legal). 
INTERACTIONS WITH ALIASED:

We are considering requiring that composite objects with aliased components be passed by reference. We might do that by making aliased components cause the containing type to become inherently limited, or we might disallow aliased components of non-inherently limited types. Or, we might simply say that they are passed by reference, without requiring inherent limitedness. If we choose the latter approach, we might consider choosing a similar approach for the volatile case: Allow pragma Volatile on any declaration of a type that is not required to be passed by copy. Once marked volatile, all parameters of the type (or with a subcomponent of the type) must be passed by reference.

SUMMARY:

This LSN proposes what we believe to be a semantically simpler and more powerful approach to limited types than proposed by LSN-1043. Function return of finalizable types is possible. There is no need for deferred finalization or lazy stack cutting. User-defined assignment and initialization are supported.

-------------

!topic LSN on Limited Function Return and User-defined Clone
!from R.R. Software (Randy Brukardt) 1992-11-09 19:52:04 <>
!reference MS-7.4.5();4.6
!reference LSN-1043
!reference LSN-1056
!discussion

This LSN proposes an interesting solution to various earlier problems in Ada 9x, and at the same time adds greater functionality. I have a couple of questions and comments.

When I first heard about 'Clone', it didn't make much sense. Even the first reading of the LSN did not help. The problem mainly boiled down to not quite understanding (in a language lawyer sense) what the 'physical' copy was. Finally, I realized that the 'physical' copy is exactly the same assignment as exists in Ada 83; the confusion was that in earlier versions we had been talking about limited types, and that is no longer true.
Aggregates:

Bob says: "(Note that aggregate formation is not listed above -- aggregates are illegal for controlled types, since they are extensions of a private root type.)". I assume he does not mean that there is anything wrong with the use of a non-limited controlled type as an aggregate component of some other type. That case should be on his list, but it does not bring any new problems to the party.

Run-time check for returning a local object:

Why is this a RUN-TIME check? Isn't it possible to detect this at compile-time? Didn't we once have rules for doing so in an earlier version of Ada 9x? Given that the general philosophy of Ada 9x is to perform as many checks as possible at compile-time, it seems odd to be adding a new run-time check.

Result temps:

Finalization of 'result temps' created by function returns is mentioned as being defined as occurring at some later point, not later than the end of the statement or declaration. A couple of questions here: I assume that is the innermost statement or declaration? (An expression can be in several compound statements at once due to nesting.) I expect that this rule will also extend to the 'temps' created to hold the result of extension aggregate creation and catenation operators. Otherwise, they would fall through the cracks.

The LSN did not really go into what problems it solves, and which it leaves. I have therefore tried to make a list of issues that have been brought up in this regard.

Pros:

Full composition of user-defined assignment.

Finalization does not require limited types.

Function return rules do not require 'deferred finalization'. Bevin Brett's problems with deferred finalization of tasks cannot happen.

Cons:

Not as strong as a full user-defined assignment; constraint checks are still defined by the language. (For example, a string package cannot allow objects with different discriminants to be assigned.)
All finalizable types must be defined at the root of the hierarchy, and therefore at the library level.

Finalization is still fine-grained, thus making a PC-map approach difficult.

All in all, this seems like a net win. I don't see any particular implementation problems, except for catenation. Catenation currently does not have to call a user-defined routine, and I know that, the way it is generated in our compiler, it cannot call one. So that might cause some problems -- but nothing really severe. (The only language solution to this problem is to disallow the use of catenation inside of generics when the component type is a generic formal, but this would cause a compatibility problem -- and in this case I know that it has been used, as we have had a bug report on that particular construct.)

Randy.

-------------

!topic LSN on Limited Function Return and User-defined Clone
!from Bob Duff 1992-11-09 20:09:15 <>
!reference MS-7.4.5();4.6
!reference 92-1760.a
!reference LSN-1043
!reference LSN-1056
!discussion

> Run-time check for returning a local object:
>
> Why is this a RUN-TIME check? Isn't it possible to detect this at
> compile-time?

Unfortunately not. Consider, for example, a function F that says, "return G(Local);", where Local is local to F. G might return the parameter, thus causing the run-time error, or not.

> Didn't we once have rules for doing so in an earlier version of Ada 9x?

Yes, but I think those checks were too pessimistic. They were not upward compatible.

> Given that the general philosophy of Ada 9x is to perform as many checks as
> possible at compile-time, it seems odd to be adding a new run-time check.
>
> Result temps:
>
> Finalization of 'result temps' created by function returns is mentioned as
> being defined as occurring at some later point, not later than the end of the
> statement or declaration. A couple of questions here: I assume that is the
> innermost statement or declaration?
> (An expression can be in several
> compound statements at once due to nesting).

Yes, the intent was the innermost one.

> I expect that this rule will also extend to the 'temps' created to hold the
> result of extension aggregate creation and catenation operators. Otherwise,
> they would fall through the cracks.

Yes.

> The LSN did not really go into what problems it solves, and which it leaves.
> I have therefore tried to make a list of issues that have been brought up in
> this regard.
>
> Pros:
> Full composition of user-defined assignment.
>
> Finalization does not require limited types.
>
> Function return rules do not require 'deferred finalization'.
> Bevin Brett's problems with deferred finalization of tasks cannot happen.
>
> Cons:
> Not as strong as a full user-defined assignment; constraint checks are
> still defined by the language. (For example, a string package cannot
> allow objects with different discriminants to be assigned).
>
> All finalizable types must be defined at the root of the hierarchy, and
> therefore at the library level.
>
> Finalization is still fine-grained, thus making a PC-map approach difficult.

All correct. I think the PC-map based approach is feasible, but the linked approach is simpler. I recommend the linked approach, but there are those who prefer saving the cost of the links. Either approach is allowed, of course.

> All in all, this seems like a net win. I don't see any particular
> implementation problems, except for catenation. Catenation currently does not
> have to call a user-defined routine, and I know the way it is generated in our
> compiler, it cannot call one. So that might cause some problems - but nothing
> real severe.
> (The only language solution to this problem is to disallow the
> use of catenation inside of generics when the component type is a generic
> formal, but this would cause a compatibility problem -- and in this case I
> know that it has been used, as we have had a bug report on that particular
> construct).
> Randy.

- Bob

-------------

!topic LSN on User-defined Initialization/Finalization/Assignment
!from Bob Duff 1992-12-16 15:56:22 <>
!reference MS-7.4.5();4.6
!reference LSN-1056
!reference LSN-1059
!discussion

This Language Study Note is an update to LSN-1056 on Limited Function Return and User-defined Clone. At the November ISO/WG9 meeting in Salem, the approach of LSN-1056 was tentatively approved (unanimously). Since the idea was so new, the MRT was directed (by resolution number 10-6) to study its impact on the rest of the language, and on implementations. This LSN is in response to that resolution.

Our study of the implementation difficulty of the (non-limited) Controlled type produced no surprises. The feature is definitely non-trivial to implement, but, as explained in LSN-1056, that difficulty is somewhat offset by the reduced complexity of limited function return. From a semantic point of view, the language as a whole is simpler without the special limited function return semantics previously proposed. Finally, the small net increase in implementation difficulty buys us a great deal of benefit in terms of usability.

We also requested that the implementer halves of the U/I teams study the implementation issues. RR Software studied the issue in detail. No great implementation difficulties were reported. RR Software has stated that they believe the proposal to be a net win.

On the subject of limited function return, one glaring error in LSN-1056 should be pointed out: It states that the implementation can easily detect, as a run-time error, the return of a local object from a function with an inherently limited result type. This is incorrect. (Thanks to Brian Dobbing for pointing this out.)
Consider, for example, a function that returns X.all, where X might point into a user-defined storage pool. The compiler cannot tell which storage pool, in the general case, and storage pools are not required to allocate their storage from any particular place. Therefore, the compiler cannot determine accessibility by inspecting the address of the returned object.

Therefore, we have decided to change the rule to this: For a function whose result type is inherently limited, the expression of the return statement must be the name of a global object, a call to a global function, a dereference of an access parameter, or a parenthesized expression, qualified expression, or type conversion of one of these things. This detects most of the problems at compile time. In addition, a run-time check is made that the returned object is dynamically accessible from where the function is declared. (The run-time check is needed for access parameters and to avoid contract model problems in generics.) These rules allow the caller of a function with an inherently limited result type to assume that the result will be accessible from where the called function is declared.

As stated in the LSN, inherently limited objects are always passed by reference, and returned from functions by reference. The above rules ensure that this is possible without the "lazy stack cutting" of previous proposals. It has the side benefit of turning Ada 83's erroneous (by ARG fiat) local task return into a compile-time error in most cases, and a run-time error in the others.

This rule is not quite as simple as was (incorrectly) suggested by LSN-1056. However, it is reasonably simple, and does not require the special semantics of inherently limited functions that had been proposed previously.
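The proposed rule can be sketched with a task type as the inherently limited type. The names here are invented for illustration:

    package Workers is
        task type Worker is
            entry Start;
        end Worker;

        Pool : Worker;  -- a library-level ("global") task object

        function Get_Worker return Worker;
        function Make_Worker return Worker;
    end Workers;

    package body Workers is
        task body Worker is
        begin
            accept Start;
        end Worker;

        function Get_Worker return Worker is
        begin
            return Pool;    -- legal: the name of a global object
        end Get_Worker;

        function Make_Worker return Worker is
            Local : Worker;
        begin
            return Local;   -- illegal under the proposed rule: not the
                            -- name of a global object (this is the local
                            -- task return that was erroneous in Ada 83)
        end Make_Worker;
    end Workers;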
The reasoning of LSN-1056 still applies: Since Controlled is non-limited, the ability to return local inherently limited values is much less important, so we can live with the above easily-implemented rule for limited function return. The implementation cost then moves to the support for user-defined Clone, which is necessary to make a non-limited Controlled type work properly.

We have not discovered any major new semantic interactions between the Clone proposal and other features of the language. The interaction with Volatile, Atomic, and aliased objects is as stated in LSN-1056. The Root_Storage_Pool type, which supports user-defined storage pools, needs to be derived from Limited_Controlled, not from Controlled. (To allow copying of storage pools would wreak havoc; that type definitely wants to be limited!)

One issue raised at the Salem meeting was the effect on the efficiency of generic code sharing. As we stated at the meeting:

- If the compiler implements universal sharing, then it already needs to pass some sort of thunk for assignment, which knows about copy sizes and necessary constraint checks. The efficiency of existing actual thunks will not be affected by the possibility of new thunks containing some user-defined code.

- If the compiler implements partial sharing, it is already making some decision about which instances are similar enough to share. (For example, if there is a formal integer type, it might create one copy for each different integer size supported by the hardware.) The implementation will probably want to enhance the decision, so that if the actual has a user-defined Clone, a different copy of the code is used than if the actual does not.

One other issue that was raised at the Salem meeting is this: Now that we have this fancy new capability, how should it be used in language-defined library units?
Our plan is as follows:

- We decided not to make type Text_IO.File_Type a non-limited controlled type, due to possible implementation complexity. However, implementations might find it useful to derive it (internally) from Limited_Controlled in order to clean up, as allowed by AI-00546.

- The string handling package (which used to be part of the IS Annex, but is being moved to the core, as part of the predefined environment) will not support a true dynamic string data type based on Controlled. This decision is based on schedule concerns; we don't see any compelling reason to have such support, and deciding exactly which operations to support and so on is not easy. Instead, it will continue to support the previously-proposed VString type that can vary in length up to a maximum size. Users can define such packages if they so desire. If we finish everything else early, we can always change our mind. :-)

- In the interface-to-C package, there will be a nul-terminated string data type, with suitable conversion operations. We considered basing this on Controlled, in order to ease the heap management. However, we rejected that idea, because the C string data type needs to be at a lower level of abstraction. Its representation needs to match that of C, not that chosen for type Controlled (a tagged type). Users should have full control over storage management for this type.

- It has been pointed out that type Task_ID could be a controlled type in order to facilitate checking for dangling Task_ID's. (See also LSN-1059.) This is true. However, it is really an implementation technique, and not a language issue. Therefore, we see no need to mention it in the new RM. Other techniques are also possible, of course.

SUMMARY:

Our proposal for user-defined assignment, based on a non-limited Controlled type with a primitive Clone operation, stands. We have found no serious problems with it.
The limited function return proposal of LSN-1056 was incorrect, and is modified as outlined above.

The current proposals on these subjects are not trivial to implement, but neither are they extremely difficult. We believe the cost-benefit tradeoff makes these proposals preferable to any previous ones. Since many users have been clamouring for user-defined assignment for some time, and since user-defined assignment is fully supported by some of our competitor languages, it is a relief to be able to provide it with a semantically clean and simple proposal.

-------------

!topic LSN-1064 on User-defined Initialization/Finalization/Assignment
!from Gary Dismukes 1992-12-17 21:16:48 <>
!reference MS-7.4.5();4.6
!reference LSN-1046
!reference LSN-1056
!reference LSN-1064
!discussion

Here are some comments on the issues discussed in the "clone-related" LSNs based on the TeleSoft Implementer Team's review. These comments are primarily supportive of the MRT's conclusions, but also include some questions on intended semantics. Apologies for the delay in providing this feedback.

First off, there seems to be general agreement that the implementation of support for non-limited controlled types will be a large effort, although a fair amount of the complexity was already present for the limited controlled type proposal. This seems to be a case, though, where the user desire is great enough (and loud enough) that we are willing to pay the high implementation cost.

Now some specific questions and comments:

The Clone Procedure
-------------------

First off, I'm not sure I have ever seen a stated rationale for preferring to have a user-defined "clone" procedure rather than a user-defined "copy" function. LSN-1046, which discussed alternative models for finalization, talked in terms of user-defined copy, and LSN-1056 introduced clone for the first time, I believe (apart from some spoken references to clone in various meetings).
It seems that clone might be preferred because it avoids questions of whether to perform the operation in contexts such as parameter passing, and of how to define the semantics of the return statement for a user-defined copy function. I guess there is also an efficiency issue, since clone can operate on its argument in place, whereas a copy function will typically require an extra copy of its argument (in the absence of fancy optimizations). Sorry if I have forgotten or just missed a discussion of the rationale somewhere along the line, but I would welcome a (re)statement of this.

One comment on the choice of the name "clone": I don't really care for it, because it seems kind of slangy. Unfortunately I haven't been able to come up with a better name. Names like "copy" or "replicate" all give the wrong impression, since the copy has already been made and what we are doing is generating a new version of the object based on the original value (yeah, yeah, kind of like a clone ;-). Were any other names considered?

In LSN-1046, in the last paragraph under discussion of Issue 3 ("Is User-defined Copy Allowed?"), it was stated that:

    "If user-defined copy is allowed, it should not be applied when
    inside the immediate scope of the type itself. Or perhaps for
    operations that are declared in that immediate scope. Otherwise,
    how can one define the copy operation in terms of assignment?"

Is this restriction also intended to apply to the semantics of Clone (and Initialize/Finalize)? That is, if Clone performs an assignment to its argument as a whole, is this performed using "predefined" assignment only, or does it invoke the Clone operation recursively? Perhaps this is seen as a non-issue for Clone, etc., and was only considered a problem for a user-defined copy function because of the need to use a return statement.
It seems to me that even within a package defining the controlled type it's important for any operations on objects of the type to behave according to the normal defined semantics for Clone/Init/Finalize. With this viewpoint, the designer of the controlled type abstraction must take responsibility for avoiding any use of the type within one of the special controlled operations that would result in unbounded recursive invocation, which seems acceptable.

Controlled Operations Applied to Parent Parts
---------------------------------------------

Another case where the controlled type designer must be careful is when providing overriding implementations of the special operations. In this case I think that the Mapping Team has already made an explicit statement that it's the programmer's responsibility to explicitly invoke the parent type's corresponding operation in the implementation of the overriding version. This design choice concerns me a little, even though it clearly allows greater flexibility, because I think it will be a common pitfall (i.e., forgetting to invoke the parent operation when needed). This mistake can potentially lead to subtle breakages of the parent abstraction that may be difficult to track down (e.g., the parent part of an object being default initialized but not fully initialized). However, the designer of a controlled type abstraction must clearly be reasonably sophisticated, so maybe this is not a serious concern.

Inherently Limited Function Return
----------------------------------

We're glad to see that the rules have been changed (in LSN-1064) regarding the check for returning of a local inherently limited object. I had noticed that, for LSN-1056, implementation of the run-time check would be a problem for the case of local access collections where the objects are freely allocated on the heap rather than in a contiguous storage area. I'm a little concerned, though, that the new legality rules may be too strict.
For example, it would now be illegal to have a function that takes an array of inherently limited components and returns one of the components. There are probably reasonable workarounds in most cases, though, and given that this only applies to inherently limited types maybe it's not too bad. (But it wouldn't surprise me if there are Ada 83 programs using tasks that could run afoul of this incompatible restriction.)

Assignment Statement
--------------------

I'm not sure I like the implementation-defined nature of the use of compiler temporaries for assignment and the like. With the implicit invocations of various controlled operations for copying and initialization operations we already seem to be moving away from WYSIWYG, and the possibility of multiple invocations of user-defined operations depending on differences in compilers is starting to feel distinctly uncomfortable. For some cases, such as potential identity assignments (e.g., X.all := Y.all), it would seem preferable to require compilers to check for identity and do nothing if the objects are the same, or alternatively to require the use of a temporary. For the case of overlapping slice assignments, which many compilers already treat specially to avoid extra temporary copies, you could require that compilers perform assignments involving finalization/clone as a component-by-component assignment (in the appropriate direction), doing clones and finalizes as appropriate on each component. I realize that there are some interactions here with optimization, and it may be considered overspecification to require specific behavior in this area, but the statements in LSN-1056 that "The exact rules are not shown here" and "This requires the programmer to take care that these operations work properly under these terms" seem to indicate a certain vagueness of semantics. I'd be interested in seeing how the "exact rules" will be specified.
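The identity-assignment concern can be made concrete by spelling out the canonical operation sequence for a controlled type. This is a sketch based on the semantics described in LSN-1056 (the numbering is ours, not the LSN's), written as Ada commentary:

    -- Canonical semantics of "X.all := Y.all;" for a controlled type:
    --
    --   1. Build the source value in an (implicit) temporary T,
    --      and Clone (T);
    --   2. Finalize (X.all);
    --   3. Physically copy T into X.all, and Clone (X.all);
    --   4. Finalize (T).
    --
    -- If an implementation elides the temporary and X = Y, then step 2
    -- finalizes the source before it is copied -- which is why one might
    -- want either a required identity check or a required temporary.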
Initialization
--------------

If a constant of a type with user-defined Clone is declared, then Clone will effectively be passed the constant as if it were a variable (since its parameter is of mode in out). I don't see any problem with this from an operational point of view, but it's a little odd from a semantic consistency perspective. I guess you can somehow describe the constant as being treated as a variable until it has been fully initialized, at which point it becomes a full-fledged constant. There are of course a few other cases where some semantic finessing is also needed, such as aggregate component initialization, where (in Ada 83) there is not even the notion of an object existing.

Overall, we think the proposal is workable and implementable, but it's definitely not going to be easy to get all of this working right. The need to support composition of controlled operations and the interactions with abort and exceptions will all involve considerable work. However, given that we have already effectively bought into the complexity of finalization, the extra capability afforded by non-limited controlled types seems worthwhile.

-------------

!topic LSN-1064 on User-defined Initialization/Finalization/Assignment
!from Bob Duff 1992-12-30 17:20:15 <>
!reference MS-7.4.5();4.6
!reference 92-1889.a
!reference LSN-1046
!reference LSN-1056
!reference LSN-1064
!discussion

Gary,

> Here are some comments on the issues discussed in the "clone-related"
> LSNs based on the TeleSoft Implementer Team's review. These comments
> are primarily supportive of the MRT's conclusions, but also include
> some questions on intended semantics. Apologies for the delay in
> providing this feedback.
>
> First off, there seems to be general agreement that the implementation
> of support for non-limited controlled types will be a large effort,
> although a fair amount of the complexity was already present for the
> limited controlled type proposal.
> This seems to be a case, though,
> where the user desire is great enough (and loud enough) that we are
> willing to pay the high implementation cost.
>
> Now some specific questions and comments:
>
> The Clone Procedure
> -------------------
>
> First off, I'm not sure I have ever seen a stated rationale for
> preferring to have a user-defined "clone" procedure rather than a
> user-defined "copy" function. LSN-1046, which discussed alternative
> models for finalization, talked in terms of user-defined copy and
> LSN-1056 introduced clone for the first time I believe (apart from
> some spoken references to clone in various meetings). It seems that
> clone might be preferred because it avoids questions of whether to
> perform the operation in contexts such as parameter passing and of how
> to define the semantics of the return statement for a user-defined
> copy function. I guess there is also an efficiency issue since clone
> can operate on its argument in place, whereas a copy function will
> typically require an extra copy of its argument (in the
> absence of fancy optimizations). Sorry if I have forgotten or just
> missed a discussion of the rationale somewhere along the line, but
> I would welcome a (re)statement of this.

The semantics of Clone just seem simpler than the semantics of a Copy function. As you mention, if function return calls Copy, then the semantics of the return statement inside Copy itself causes trouble.

> One comment on the choice of the name "clone": I don't really care
> for it, because it seems kind of slangy. Unfortunately I haven't been
> able to come up with a better name. Names like "copy" or "replicate"
> all give the wrong impression since the copy has already been made
> and what we are doing is generating a new version of the object based
> on the original value (yeah, yeah, kind of like a clone ;-). Were
> any other names considered?

The name "Clone" was Tucker's idea, I think, and I never questioned it.
I don't really see a problem with it, and I can't think of anything else that's as good. Suggestions are welcome, of course.

> In LSN-1046 in the last paragraph under discussion of Issue 3
> ("Is User-defined Copy Allowed?") it was stated that:
>
>     "If user-defined copy is allowed, it should not be applied when
>     inside the immediate scope of the type itself. Or perhaps for
>     operations that are declared in that immediate scope. Otherwise,
>     how can one define the copy operation in terms of assignment?"
>
> Is this restriction also intended to apply to the semantics of Clone
> (and Initialize/Finalize)?

No. The notion that certain things don't happen within the immediate scope of the type was really necessary when we were thinking about finalization and user-defined assignment of all types. Limiting it to composite types removes the need for that complexity.

> ...That is, if Clone performs an assignment
> to its argument as a whole, is this performed using "predefined"
> assignment only, or does it invoke the Clone operation recursively?

Clone is invoked recursively in such a case. This is similar to the fact that if Finalize declares a variable of the type in question, it will implicitly call itself recursively before returning. The programmer has to be aware of this. Since all controlled types are composite, one can do component-by-component operations in Initialize, Finalize, and Clone. Note also that aggregates are illegal, since Controlled is private.

> Perhaps this is seen as a non-issue for Clone, etc. and was only
> considered a problem for a user-defined copy function because
> of the need to use a return statement. It seems to me that
> even within a package defining the controlled type it's important
> for any operations on objects of the type to behave according to
> the normal defined semantics for Clone/Init/Finalize.
> With this
> viewpoint, the designer of the controlled type abstraction must
> take responsibility for avoiding any use of the type within
> one of the special controlled operations that would result in
> unbounded recursive invocation, which seems acceptable.

Right.

> Controlled Operations Applied to Parent Parts
> ---------------------------------------------
>
> Another case where the controlled type designer must be careful is
> when providing overriding implementations of the special operations.
> In this case I think that the Mapping Team has already made an
> explicit statement that it's the programmer's responsibility to
> explicitly invoke the parent type's corresponding operation in the
> implementation of the overriding version. This design choice concerns
> me a little, even though it clearly allows greater flexibility,
> because I think it will be a common pitfall (i.e., forgetting to
> invoke the parent operation when needed). This mistake can
> potentially lead to subtle breakages of the parent abstraction that
> may be difficult to track down (e.g., the parent part of an object
> being default initialized but not fully initialized). However, the
> designer of a controlled type abstraction must clearly be reasonably
> sophisticated, so maybe this is not a serious concern.

This is a possible pitfall. I don't see any way around it. Anyway, it's really a general problem with OOP in most languages -- there are many times when an overriding operation must invoke the parent operation in order to work properly. If the programmer forgets to do it, it's a bug. Lisp Flavors and CLOS partly solve the problem by providing several different ways of overriding: pre- and post- methods, wrappers, whoppers, &c. But most OOP languages ignore the problem, and go for a simple model. I think that's best for Ada 9X, too.
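The pitfall in question can be sketched as follows. The types and the Lock_Handle/Buffer_Handle components are hypothetical; under the proposal, an overriding Finalize that forgets to call its parent's Finalize silently skips the parent's cleanup:

    type Parent is new Controlled with record
        Lock : Lock_Handle;
    end record;
    procedure Finalize (Object : in out Parent);  -- releases Object.Lock

    type Child is new Parent with record
        Buffer : Buffer_Handle;
    end record;

    procedure Finalize (Object : in out Child) is
    begin
        Release (Object.Buffer);
        -- Pitfall: without the following explicit conversion and call,
        -- the parent part's lock is never released:
        Finalize (Parent (Object));
    end Finalize;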
> Inherently Limited Function Return
> ----------------------------------
>
> We're glad to see that the rules have been changed (in LSN-1064)
> regarding the check for returning of a local inherently limited
> object. I had noticed that, for LSN-1056, implementation of the
> run-time check would be a problem for the case of local access
> collections where the objects are freely allocated on the heap rather
> than in a contiguous storage area. I'm a little concerned, though,
> that the new legality rules may be too strict. For example, it would
> now be illegal to have a function that takes an array of inherently
> limited components and returns one of the components. There are
> probably reasonable workarounds in most cases though, and given that
> this only applies to inherently limited types maybe it's not too bad.
> (But it wouldn't surprise me if there are Ada 83 programs using tasks
> that could run afoul of this incompatible restriction.)

We think such cases won't come up much in practice. The alternative is to have run-time checks. The only "simple" method we could come up with involved passing the frame pointer of inherently limited things around, and doing a search up the stacks of the current task, plus containing tasks, for that frame, on each return statement. That doesn't qualify as simple in my book, and it is clearly unacceptably inefficient (it's not even a bounded-time check). If you can think of a simple and efficient algorithm for doing the check at run time, let us know.

> Assignment Statement
> --------------------
>
> I'm not sure I like the implementation-defined nature of the use of
> compiler temporaries for assignment and the like. With the implicit
> invocations of various controlled operations for copying and
> initialization operations we already seem to be moving away from
> WYSIWYG and the possibility of multiple invocations of user-defined
> operations depending on differences in compilers is starting to feel
> distinctly uncomfortable.
For some cases, such as potential identity assignments > (e.g., X.all := Y.all) it would seem preferable to require compilers > to check for identity and do nothing if the objects are the same, or > alternatively require the use of a temporary. For the case of > overlapping slice assignments, which many compilers already treat > specially to avoid extra temporary copies, you could require that > compilers perform assignments involving finalization/clone as a > component-by-component assignment (in the appropriate direction) doing > clones and finalizes as appropriate on each component. I realize that > there are some interactions here with optimization, and it may be > considered overspecification to require specific behavior in this > area, but the statements in LSN-1056 that "The exact rules are not > shown here" and "This requires the programmer to take care that these > operations work properly under these terms" seem to indicate a certain > vagueness of semantics. I'd be interested in seeing how the "exact > rules" will be specified. OK, we'll consider your suggestions. The ILS will, of course, have to have the exact semantics. (I'm not sure it made it into version 1.2, though.) I agree that the implementation-definedness is a concern. But, on the other hand, it may yield better efficiency. And it's not so bad for the programmer. Most of the algorithms we're talking about involve resource management -- reference counts, grabbing and releasing database locks, and so on. It is usually easy for the programmer to prove that extra (properly-matched) pairs of operations do no harm. For example, if I'm doing reference counting, why should I care if there's an extra increment/decrement pair (or a missing pair)? Anyway, we're working on this issue. > Initialization > -------------- > > If a constant of a type with user-defined Clone is declared, then > Clone will effectively be passed the constant as if it were a variable > (since its parameter is of mode in out). 
I don't see any problem > with this from an operational point of view, but it's a little odd > from a semantic consistency perspective. I guess you can somehow > describe the constant as being treated as a variable until it has been > fully initialized, at which point it becomes a full-fledged constant. > There are of course a few other cases where some semantic finessing is > also needed, such as aggregate component initialization where (in Ada > 83) there is not even the notion of an object existing. In Ada 83 a constant isn't really constant until (default) initialization has happened. In Ada 9X, we need to extend this to the Initialize and Finalize operations. So, a constant is unchanging from just after it is initialized until just before it is finalized. I see no semantic difficulty with that. > Overall, we think the proposal is workable and implementable, but > it's definitely not going to be easy to get all of this working right. > The need to support composition of controlled operations and the > interactions with abort and exceptions will all involve considerable > work. However, given that we have already effectively bought into the > complexity of finalization, the extra capability afforded by > non-limited controlled types seems worthwhile. Agreed. - Bob ------------- !topic Access discriminants during returns of limited values !from Bevin Brett 1992-06-03 16:05:38 <> !reference MS-7.4.6();4.6 !keywords Limited, Access, !discussion I assume the restriction of TYPE_NAME'ACCESS to tagged extensions is to avoid fun during pass-by-copy. Nevertheless the trouble has not been completely suppressed... [ps: it is unclear why the restriction to tagged EXTENSIONS as opposed to any tagged types. The following example shows a silly way around this restriction]. package PKG is type T_DONT_USE_ME is limited tagged record null; end record; type T; type SECOND(D : access T) is new System.Controlled with record ... 
end record; type T is new T_DONT_USE_ME with record C : SECOND(T'ACCESS); end record; function F return T; end; package body PKG is function F return T is begin loop declare Ts : array(1..1000000) of T; begin if Random then return Ts(Random); end if; end; end loop; end; end; It is going to be VERY hard to find all the access-discriminants inside Ts(Random) and make them point at the appropriate new copy of T that has been moved outside F's lifetime into the place where its lifetime is long enough... ------------- !topic Extensions of SYSTEM.CONTROLLED are subject to scope level checks !from Bevin Brett 1992-06-04 10:27:14 <> !reference MS-7.4.6();4.6 !keywords finalize, initialize, controlled !discussion It is VERY unfortunate that the scope level rules mean that no finalized type can be declared in a task or subprogram. [My earlier suggestion to stop using derivation for extension, and allow derivation at different scope levels would get nicely around this]. ------------- !topic Finalization Statements !from Brian Dobbing 1992-08-03 15:01:51 <> !reference MS-7.4.6();4.6 !discussion There does not appear to be any restriction on the content of the statements in a FINALIZE procedure; in particular, in the case of finalization being invoked as part of task abort, the abnormal task is still allowed to perform all tasking operations, unlike RM 9.10(6), including task creation, PR and normal entry calls, selects, delays, abort etc. Is this really intended? If so, it means that: (a) The aborted task must execute its own finalization code, rather than it being possible for the aborter to do the cleanups on its behalf. This seems to run against the notion that an aborted task is a rogue task and should not execute any more (i.e. the immediate abort requirement) or at the very least should not be allowed to alter the general state of the program (by task creation/abort/rendezvous). 
Indeed it may not be possible for the aborted task to run anymore because it has been aborted for the very reason that it has destroyed part of its execution environment. (b) The aborted task may "hang around" indefinitely, e.g. suppose it creates a dependent task which also becomes rogue. Since the aborted task is protected from abort during finalization, there is no way for the rest of the program to issue another "bang-bang you're REALLY dead this time" to kill the aborted task AND all its dependents like one got in Ada 83. Given that the aborter cannot execute the calls to the finalization routines, can the MRT explain what the simplification is in the note at the end of MS7.4;4.6 with respect to all the finalization routines being at the outermost scope level? -- Brian ------------- !topic Finalization and Limited Function Results !from Stef Van Vlierberghe 1992-10-15 23:02:20 <> !reference MS-7.4.6();4.6 !reference LSN-1033 !reference LSN-1043 !discussion > Proposal A is far too complex, and introduces too many semantic > anomalies. (What, for example, happens if a discriminant is of a > finalizable type? Discriminants are not passed by reference, > surely.) This is not fair. We are supposed to accept the restriction that finalization can be used only for derivatives of a specific limited tagged type, while the alternatives are implicitly supposed to support any type, including scalar types (which *even* C++ doesn't allow you to fiddle with). Proposal A is user-defined assignment and finalization, definitely restricted to non-scalar types. This is again supporting the (IMHO false) assumption that the MS;2.0 proposal cannot be tailored down to some reasonable complexity size. Being pragmatic, I wouldn't really mind if it were restricted to non-tagged private types completed with non-limited constrained record types. Non-limited meaning user-defined assignment isn't a tool to remove the property of limitedness. > Some inherently limited objects cannot be moved. 
An object cannot be > moved if it might contain references to itself, or to other objects > in the same declarative region. It is a bounded error to move such > objects. User-defined assignment could do the copy correctly, permit the stack to be popped on function return, and make this non-movable inherently limited object (poor sod) a true (movable, non-limited) first-class citizen. > We should make it as painless as possible to modify one's program, > for example, to add finalization, or to add some other property > that necessitates inherent limitedness. Requiring the programmer > to change all function calls into procedure calls would be an > onerous burden. I consider this a good argument against coupling finalization with limited- and tagged-ness, which is an onerous burden too. But as you said, non-limited finalization implies, yes... > Thus, we believe that inherently limited objects should be > first-class citizens -- in particular, they should be allowed as > function result types. We believe we have achieved a reasonably > simple and implementable semantics for them, as described above. When facing second-class citizens (such as unconstrained types, class-wide types, and inherently limited types) the application programmer can always turn them into first-class citizens, using some kind of reference, typically an access type. Currently this is a false argument: the reference must get user-defined finalization, and therefore it gets limited and tagged, and is no longer first-class itself. Alternatively, the reference could get user-defined assignment and finalization, not become limited or tagged, and be first-class. In that case most programmers would stop asking for varying strings, class-wide objects that can change class, garbage collection, and the magical function return of unmovable objects. Sorry to be so persistent about it, but each time I see some unsafe design out of fear of limitedness, I hope things could be different. 
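[Editorial note: the non-limited alternative argued for above is essentially a reference-counted handle. Here is a sketch written in terms of the user-defined Adjust/Finalize model that was eventually adopted in Ada 95 as Ada.Finalization.Controlled -- not part of MS;4.6, so treat it purely as an illustration of the idea; all other names are invented:]

```ada
with Ada.Finalization;
package Counted_Strings is
   -- Non-limited: clients may assign it and return it from functions.
   type Handle is private;
private
   type String_Ptr is access String;
   type Node is record
      Count : Natural := 1;   -- shared reference count
      Text  : String_Ptr;
   end record;
   type Node_Ptr is access Node;
   type Handle is new Ada.Finalization.Controlled with record
      Ref : Node_Ptr;
   end record;
   procedure Adjust   (H : in out Handle);  -- after copy: Count := Count + 1
   procedure Finalize (H : in out Handle);  -- scope exit: decrement, free at zero
end Counted_Strings;
```

[With user-defined copy and finalization available on an ordinary non-limited type, the handle is a first-class citizen -- no limitedness or taggedness visible to clients -- and varying-length strings fall out for free, as the message above predicts.]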
------------- !topic Finalization via a generic instantiation !from Philippe Kruchten Rational 1992-08-12 14:02:05 <> !reference MS-7.4.6();4.6 !keywords controlled types, finalization !discussion As expressed during the last DR meeting, I do not like very much the current way to define finalization. First, it calls for multiple inheritance, because there will be cases where you would like to derive from a user-defined type _and_ make the new type controlled. There is no simple way to do this. Second, it is very sensitive to typos: if instead of declaring a procedure FINALIZE, I call it (by mistake) FINALISE (I am a silly Frenchman), it will never be called; not detecting this kind of mistake at compile time is really the worst a programming language can do: chances are that the problem will go undetected (even after code reviews and QA), until the program is actually put to intensive use and dies because of Storage_Error. If you absolutely do not want to go back to the procedure T'Finalize of an earlier version of the mapping document (version 2.?) my colleague Pascal Leroy suggests using a predefined generic (a la Unchecked_Conversion): generic type T is limited private; with procedure INITIALIZE (OBJECT : in out T); with procedure FINALIZE (OBJECT : in out T); package CONTROLLED is end CONTROLLED; Instantiating this generic in the same declarative part as the type itself would make the type controlled, and register both procedures to be called during initialization/finalization of objects of that type. This would combine nicely with inheritance, and would allow for detection of typing errors. ------------- !topic Finalization via a generic instantiation !from Norman Cohen 1992-08-31 10:23:33 <> !reference MS-7.4.6();4.6 !reference 92-1293.a !keywords controlled types, finalization !discussion Philippe writes: > If you absolutely do not want to go back to the > procedure T'Finalize > of an earlier version of the mapping document (version 2.?) 
my colleague > Pascal Leroy suggests using a predefined generic (a la Unchecked_Conversion): > > generic > type T is limited private; > with procedure INITIALIZE (OBJECT : in out T); > with procedure FINALIZE (OBJECT : in out T); > package CONTROLLED is > end CONTROLLED; > > Instantiating this generic in the same declarative part as the type > itself would make the type controlled, and register both procedures > to be called during initialization/finalization of objects of that > type. This would combine nicely with inheritance, and would > allow for detection of typing errors. These are compelling arguments, but the idea of a type being "made" controlled by its appearance as a generic actual parameter strikes me as a strange form of magic. And what is the effect of instantiating CONTROLLED more than once with the same type? The following small variation on Pascal Leroy's idea answers these objections: generic type T is tagged limited private; with procedure INITIALIZE (OBJECT : in out T); with procedure FINALIZE (OBJECT : in out T); package CONTROLLED is type CONTROLLED_T is new T with private; private -- implementation-defined link information end CONTROLLED; Each instantiation produces a new type CONTROLLED_T which is derived from T, but controlled. If T itself is controlled, the effect is to replace the INITIALIZE and FINALIZE of T in CONTROLLED_T. Norman ------------- !topic Finalization via a generic instantiation !from Bevin Brett 1992-09-01 09:18:11 <> !reference MS-7.4.6();4.6 !reference 92-1293.a !reference 92-1331.a !keywords controlled types, finalization !discussion I think it would be simpler to have a representation clause rather than a magic generic instantiation. In particular, the generic instantiation idea requires an initialisation subprogram, which leads to the amusing problem described below... Simple Something: ----------------- package P is type T is ... 
procedure T_Init(X : in out T); for T'init use T_Init; procedure T_Fini(X : in out T); for T'fini use T_Fini; ... end; Note that "T_Init(X : in out T)" is correct, because you want discrims, access components, and record component := exp initialisations to be carried out before T_Init is called. Amusing ABE problem: -------------------- Note that, regardless of how you specify the Initialiser and Finaliser, there is an amusing ABE problem if you specify any objects between the type and the body of the Initialiser. This is going to severely cramp the declaration of finalised objects in package specifications. Especially since they will usually be record types, and these are the class of types least supported by generic formals [the only way of getting a body into a package spec being by instantiation]. /Bevin ------------- !topic Finalization via a generic instantiation !from Robert Eachus 1992-09-01 10:54:59 <> !reference MS-7.4.6();4.6 !reference 92-1293.a !reference 92-1331.a !keywords controlled types, finalization !discussion Norm writes: > The following small variation on Pascal Leroy's idea answers these > objections: > > generic > type T is tagged limited private; > with procedure INITIALIZE (OBJECT : in out T); > with procedure FINALIZE (OBJECT : in out T); > package CONTROLLED is > type CONTROLLED_T is new T with private; > private > -- implementation-defined link information > end CONTROLLED; > > Each instantiation produces a new type CONTROLLED_T which is derived > from T, but controlled. If T itself is controlled, the effect is to > replace the INITIALIZE and FINALIZE of T in CONTROLLED_T. I like it a lot... There is one question I have though, which might require a slight change. If T is a non-limited type, is CONTROLLED_T limited? I don't see it as being required by this definition, but an implementation could add fields of limited types. 
So provide this as a predefined package, and either require that instances of CONTROLLED_T always be limited, or that implementations not force this. I don't see any reason why link information has to be part of the visible object; the compiler could store it in a separate structure. If so, the specification should require that the user-defined FINALIZE operation be invoked on the target of an assignment. The more I think about it, the more I like the idea of being able to create non-limited types with finalization. Using reference counts it becomes quite trivial to create a fully general structure with full garbage collection. (To clarify by oversimplifying, the reference counts would track the number of references from outside to the managed heap. It is then possible to collect the heap by treating all objects with references from outside as marked. Relocation can be done using a linked list.) Robert I. Eachus with STANDARD_DISCLAIMER; use STANDARD_DISCLAIMER; function MESSAGE (TEXT: in CLEVER_IDEAS) return BETTER_IDEAS is... ------------- !topic Finalization via a generic instantiation !from Robert Eachus 1992-09-02 09:34:04 <> !reference MS-7.4.6();4.6 !reference 92-1293.a !reference 92-1331.a !reference 92-1338.a !keywords controlled types, finalization !discussion Bevin said: > Note that, regardless of how you specify the Initialiser and > Finaliser, there is an amusing ABE problem if you specify any > objects between the type and the body of the Initialiser. > This is going to severely cramp the declaration of finalised > objects in package specifications. Especially since they will > usually be record types, and these are the class of types least > supported by generic formals [the only way of getting a body into > a package spec being by instantiation]. But this is not true with the generic approach! The operations passed in are on the base type, not the extended type, so the user can write his portion of the controlled type without such worries. 
The only limitation is that the initialize and finalize operations of the component type (T) cannot directly depend on operations on the controlled type (CONTROLLED_T). (If you are a glutton for punishment it should be possible with the generic approach to write an initialization routine which indirectly references objects of type CONTROLLED_T and avoids recursion, but it is going to be tricky to make such code non-erroneous.) ------------- !topic Finalization via a generic instantiation; Abstract Types !from Tucker Taft 1992-08-31 11:55:35 <> !reference MS-7.4.6();4.6 !reference 92-1293.a !reference 92-1307.a !reference 92-1308.a !reference 92-1331.a !keywords controlled types, finalization, abstract !discussion > Philippe writes: > > > If you absolutely do not want to go back to the > > procedure T'Finalize > > of an earlier version of the mapping document (version 2.?) my colleague > > Pascal Leroy suggests using a predefined generic . . . > > . . . Norm writes (in 92-1331.a): > These are compelling arguments, but the idea of a type being "made" > controlled by its appearance as a generic actual parameter strikes me as > a strange form of magic. And what is the effect of instantiating > CONTROLLED more than once with the same type? > > The following small variation on Pascal Leroy's idea answers these > objections: > > generic > type T is tagged limited private; > with procedure INITIALIZE (OBJECT : in out T); > with procedure FINALIZE (OBJECT : in out T); > package CONTROLLED is > type CONTROLLED_T is new T with private; > private > -- implementation-defined link information > end CONTROLLED; > > Each instantiation produces a new type CONTROLLED_T which is derived > from T, but controlled. If T itself is controlled, the effect is to > replace the INITIALIZE and FINALIZE of T in CONTROLLED_T. If you will allow me, here is the implementation of this package in Ada 9X: generic . . . 
-- as above private type MIX_IN(OBJ : access CONTROLLED_T) is new SYSTEM.CONTROLLED with null; -- A mix-in to provide initialization/finalization; -- The access discriminant is initialized automatically (below) procedure INITIALIZE(MIX : in out MIX_IN); procedure FINALIZE (MIX : in out MIX_IN); type CONTROLLED_T is new T with record MIX : MIX_IN(CONTROLLED_T'ACCESS); -- the initialization/finalization of this component -- will call the user-provided INITIALIZE/FINALIZE ops. end record; end CONTROLLED; package body CONTROLLED is procedure INITIALIZE(MIX : in out MIX_IN) is begin INITIALIZE(T(MIX.OBJ.all)); end INITIALIZE; procedure FINALIZE(MIX : in out MIX_IN) is begin FINALIZE(T(MIX.OBJ.all)); end FINALIZE; end CONTROLLED; You can actually make this a little more flexible if you want by restructuring it a bit, as follows: generic type T(<>) is tagged limited private; -- The (<>) allows discriminated types as well; -- this could be added to the earlier package also. package MAKE_CONTROLLED is type CONTROLLED_T is new T with private; procedure INITIALIZE(OBJECT : in out CONTROLLED_T) is <>; procedure FINALIZE (OBJECT : in out CONTROLLED_T) is <>; private type MIX_IN(OBJ : access CONTROLLED_T'CLASS) is new SYSTEM.CONTROLLED with null; procedure INITIALIZE(MIX : in out MIX_IN); procedure FINALIZE (MIX : in out MIX_IN); type CONTROLLED_T is new T with record MIX : MIX_IN(CONTROLLED_T'ACCESS); -- the initialization/finalization of this component -- will dispatch to the appropriate INITIALIZE/FINALIZE ops. 
end record; end MAKE_CONTROLLED; package body MAKE_CONTROLLED is procedure INITIALIZE(MIX : in out MIX_IN) is begin INITIALIZE(MIX.OBJ.all); -- dispatching call end INITIALIZE; procedure FINALIZE(MIX : in out MIX_IN) is begin FINALIZE(MIX.OBJ.all); -- dispatching call end FINALIZE; end MAKE_CONTROLLED; Now the instantiation of MAKE_CONTROLLED produces an abstract root type that can be further derived to add INITIALIZE/FINALIZE operations, without requiring additional instantiations to add further levels to the hierarchy. On a related topic... One simple approach to the misspelling concern that Philippe mentioned is to make CONTROLLED an abstract type, with at least the FINALIZE operation abstract (since that is the one that is of most interest). We felt that INITIALIZE should not be abstract, given how often default initialization would be adequate, and we made FINALIZE non-abstract out of symmetry arguments. However, given Philippe's concern, it seems to make sense to break the symmetry a bit, and make FINALIZE abstract. By the way, as we have refined the definition of private extensions and abstract types, they now work quite nicely together and in conjunction with generic mix-ins. Some of these ideas came from a suggestion by Ed Seidewitz that "type X is new Y with private" should simply say that X is *some* derivative of Y, direct or indirect. It does not say that X is a direct derivative of Y. We have also adopted, in part, a suggestion by Jim Hassett of Paramax, that not all of the abstract operations have to be respecified if the derived type is also going to be abstract, so long as at least one of the explicitly declared operations of the derived type is abstract. Finally, as we mentioned in an earlier message, whether an operation is going to be overridden need not be exposed in the visible part, but can instead be postponed until after the full definition of a private extension. 
Taken together, these make it quite easy to use System.Controlled as a visible ancestor type when declaring a private tagged limited type, without committing the private type to being a direct descendant of System.Controlled. For example, one can generally say: type T is new System.Controlled with private; . . . private type T is new Q with record ... This works so long as Q is itself visibly derived from System.Controlled. This means that "new System.Controlled with private" is essentially similar to introducing a new reserved word "controlled" so one could say "type T is tagged controlled private;" which would mean that the full type is tagged, limited, and has Initialize and Finalize dispatching operations. By the way, abstract subprograms can help a little with spelling problems, but in general, if one is going to want to override a non-abstract subprogram, one will have to get used to spelling the overriding declaration carefully. This is inherent in the original Ada 83 model of derivable subprograms, and is shared by most other OOPs, except perhaps those that require an explicit "OVERRIDES" reserved word on overriding declarations. (Of course, if one both misspells and forgets the OVERRIDES you are back in the same soup...) As a final shot on yet another related topic... Norm writes in 92-1308.a: > While we're on the subject, I want to reiterate in writing that I find > the term "abstract type" unfortunate because of the potential for > confusion with the software-engineering term "abstract data type" > (which typically corresponds in Ada to a private type). I see no reason > why we should be reluctant to borrow a term from Eiffel ("deferred > type") or C++ ("virtual type"). The term "abstract class" was invented (as far as I know) by the designers of SmallTalk, and now pervades the general OOP literature. It *is* the term used in C++ for the same concept. 
"Virtual type," on the other hand, is not a C++ term, though "virtual base type" is, and means something completely different (controls whether the components of a multiply inherited parent type appear once overall or once per inheritance in the derivative). The Eiffel-inspired term "deferred type" seems inconsistent with the existing Ada use for "deferred" (as in deferred constant) where the definition is not omitted, but rather simply deferred to the private part. An Eiffel "deferred class" is really never further defined, although subclasses are allowed, which presumably are considered the collective target of the "deferral." That makes some sense in Eiffel since "classes" inherently include their subclasses. In Ada 9X, a type does not include its derivatives. One might plausibly call an abstract type a "root of a deferred class" but that hardly seems to be an improvement. Ada has often been accused of inventing its own terminology. In the case of "abstract type" we seem to be following the tradition established in the OOP literature, which seems like a good idea. And in any case, "abstract type" and "abstract data type" are not antonyms. If you really want do define a "pure" abstract data type, with no single concrete implementation, then an Ada 9X "abstract type" is the ideal way to do that. An abstract type T is an abstract data type with no single implementation, but with many possible implementations collectively accessible via T'CLASS. 
-Tuck ------------- !topic Initialization via a generic instantiation; Abstract Types !from Anthony Gargaro 1992-09-10 12:23:10 <> !reference MS-7.4.6();4.6 !reference 92-1293.a !reference 92-1307.a !reference 92-1308.a !reference 92-1331.a !reference 92-1334.a !keywords controlled types, initialization, abstract !discussion STT> You can actually make this a little more flexible if you want STT> by restructuring it a bit, as follows: STT> generic STT> type T(<>) is tagged limited private; STT> -- The (<>) allows discriminated types as well; STT> -- this could be added to the earlier package also. STT> package MAKE_CONTROLLED is STT> type CONTROLLED_T is new T with private; STT> procedure INITIALIZE(OBJECT : in out CONTROLLED_T) is <>; STT> procedure FINALIZE (OBJECT : in out CONTROLLED_T) is <>; STT> private STT> type MIX_IN(OBJ : access CONTROLLED_T'CLASS) is STT> new SYSTEM.CONTROLLED with null; STT> procedure INITIALIZE(MIX : in out MIX_IN); STT> procedure FINALIZE (MIX : in out MIX_IN); STT> type CONTROLLED_T is new T with record STT> MIX : MIX_IN(CONTROLLED_T'ACCESS); STT> -- the initialization/finalization of this component STT> -- will dispatch to the appropriate INITIALIZE/FINALIZE ops. STT> end record; STT> end MAKE_CONTROLLED; STT> package body MAKE_CONTROLLED is STT> procedure INITIALIZE(MIX : in out MIX_IN) is STT> begin STT> INITIALIZE(MIX.OBJ.all); -- dispatching call STT> end INITIALIZE; This is quite an elegant and persuasive exposition of the utility of access discriminants. However, the above example is difficult to reconcile with the wording of MS-3.8.2(2);4.6 that states (capitalization added) "Within the extension part, the value of an access discriminant (see 3.6.1) of a component may be specified by an access attribute whose prefix is the simple name of the derived type being declared. When used in this way, the attribute designates the enclosing object being INITIALIZED." 
In the above example, the default initialization operation for CONTROLLED_T has been made abstract with the intent that its derivatives will add the operation. It seems there is a questionable dependency: in order to dispatch to this initialization operation, a default initialization operation must exist. ------------- !topic LSN on Finalization in Ada 9X !from Bob Duff $Date: 1992-10-10 14:05:12 <> !reference MS-7.4.6();4.6 !reference MI-OO03 !discussion (I also want to say, "reference MS-7.4.6;2.0", but our tool doesn't like version 2.0.) This Language Study Note discusses User-defined Finalization. At the Frankfurt WG9 meeting, the consensus was that finalization should be kept in the language, and should be moved to the core language. (It was at that time in the SP Annex.) However, two countries voted that the MRT should study the issue of removing a certain restriction from the proposal, and we heard the same concern from several DRs. This LSN addresses that concern. The restriction in question is that the user must decide whether or not to have finalization at the root type of a hierarchy. Finalization cannot be added to a derived type, unless the parent type has finalization. The MRT has already made several proposals that address the concern (see the MI and old version of the MS referenced above). This LSN summarizes some of that information. We need to decide whether the removal of the above restriction is worth the cost. ISSUES ADDRESSED BY ALL PROPOSALS: First, we discuss some issues that must be addressed by any finalization proposal. ISSUE 1: SYNTAX. Three different proposals for the syntax have existed at one point or another. MI-OO03 proposed this: procedure My_Finalize_Operation(X: T); for T'Finalize use My_Finalize_Operation; MS;2.0 proposed this: procedure T'Finalize(X: in out T); MS;4.6 proposes no extra syntax for finalization. Instead, the user indicates finalization by deriving from a special tagged type. 
ISSUE 2: INITIALIZE AND FINALIZE ARE ONE-FOR-ONE. It is obviously a bad idea to finalize uninitialized variables. Since finalization is done automatically, the user has no control over such erroneousness. Therefore, the rules must ensure that if a variable is not initialized, it is not finalized. It is also a bad idea to miss any finalizations. If an object IS initialized, then it should be finalized also. The conclusion is that finalization should be done if and only if initialization is done. This implies that initialization and finalization should be abort-deferred regions. ISSUE 3: IS USER-DEFINED COPY ALLOWED? Finalization has certain interactions with copying. Ada performs a copy of an object in the following cases: - Default initialization of components, for object_declarations and allocators. - Explicit initialization of stand-alone objects, and of objects allocated by allocators. - Assignment statements. - Parameter passing and function return. - Input-Output, 'READ/'WRITE, unchecked_conversion, interface to other languages, and the like. The language-defined copy (which just copies the bits) is fundamentally incompatible with user-defined finalization. When the programmer defines finalization for a type, it is generally because the type contains a reference to some resource that needs to be deallocated, unlocked, or otherwise manipulated. Such references won't work if clients of the abstraction are making copies willy-nilly. For example, suppose I have a data type that contains a varying-length text string, so I use an access type. (In this case the "resource" is simply the heap.) package P is type T is [tagged?] [limited?] private; ... private type String_Acc is access String; type T is [tagged?] [limited?] record Name: String_Acc; ... -- other components. end record; end P; In order for the abstraction to be abstract, I must define a finalization operation that does something to the Name component whenever an object of type T disappears. 
For example, I might deallocate it. Or I might manipulate a reference count. In any case, if a client is allowed to make copies of such objects without giving me control, my abstraction won't work right -- Names will get deallocated twice, or reference counts will not be correctly maintained. In a tasking application, a lock might be unlocked twice.

All of this means that any finalization proposal must either prevent copies, or must allow the user to define the semantics of the copy. In the above example, the user's copy operation would do a deep copy (i.e. allocate a new Name), or increment a reference count, or some such thing. Previous Mapping proposals have allowed the user to define a T'COPY operation. As explained above, the interesting operation is user-defined copy, not just user-defined assignment. In any case, a finalization proposal that allows copy, but not user-defined copy, won't fly.

If user-defined copy is allowed, it should not be applied inside the immediate scope of the type itself (or perhaps not for operations declared in that immediate scope); otherwise, how could one define the copy operation in terms of assignment?

ISSUE 4: INTENDED IMPLEMENTATION STRATEGY.

The MRT believes that the only feasible implementation strategy is to add objects to a per-task list when they are initialized, and use the list to find them when they need to be finalized. In order to avoid the inefficiency and uncertainty of heap allocation, the list must be threaded through links that are in the object itself. Implementations are, of course, free to choose any strategy that they can get to work. However, we believe that a PC-map-based strategy would be quite complicated. Consider, for example, an array of finalizable components. Suppose that while initializing the array, an exception occurs during initialization of the seventeenth component. According to ISSUE 2, we must ensure that exactly those components that were initialized are finalized. How do we remember which those were? Recall that initialization of the array need not take place in any particular order. With the per-task list strategy, it's easy -- if it's on the list, then it got initialized.

Which types contain link fields, and at what offset are they allocated? One alternative is to allocate them in every object that might have finalization. (The different proposals differ as to how many such types there are, and therefore differ as to how tolerable it is to have overhead in every such object.) If they are always present, then they can always be allocated at the same offset, thus simplifying the process of finding them.

Another alternative is to allocate them only in types that actually have finalization. But if finalization can be added anywhere in the hierarchy (rather than just at the root type), they cannot be allocated at the same offset in all objects. This raises the same run-time issues as multiple inheritance -- fields are at different offsets, so extra data structures are needed in order to find them at run time.

Another alternative is to allocate them only in types that actually have finalization, but to put them at a negative offset from the beginning of the object. There are some drawbacks to this approach:

  - Some compilers don't know how to deal with negative record offsets.
  - It makes the implementation more complex to be allocating record fields in two directions.
  - On some machines, negative offsets are rather inefficient.
  - Negative offsets may already be needed for other purposes. For example, some have noted that allocating discriminants at negative offsets can make tagged types more efficient.

In addition to the links, each object must contain some method of finding the finalization operation. The same issues -- whether and where to allocate this field -- arise.

ISSUE 5: FINALIZATION OF CONSTANTS.

It makes sense for the finalization operation to take an 'in out' parameter of the type.
However, this won't work if there might be constants of the type.

ISSUE 6: ALLOWED FOR NESTED TYPES?

If finalization is allowed for a type that is nested in a subprogram or task body, then when finalization occurs, it is necessary to set up the appropriate static link or display to represent the correct static context.

ISSUE 7: COMPOSABILITY.

If a composite type has components with finalization, does the composite type have a finalization operation composed from those of the components? If it also has its own finalization, when is that done? If a parent type has finalization, does a derived type inherit it? If the derived type has its own finalization, does this finalization override that of the parent, or is it in addition to that of the parent?

Note that for tagged types, if composition is automatic on derivation, the compiler needs to construct the composite finalization operations. Consider, for example, what happens on finalization of an object with an unknown tag:

    type A is access T'CLASS;
    X: A;
    ... -- make X point to an object somewhere in the class
    FREE(X); -- X.all is finalized at this point.

A tagged type may be both derived from something with finalization, and have additional components with finalization, and also have its own finalization. If these things compose, then in what order?

Similar questions arise for the copy operation, if it is user-definable. Whatever the answers are, are they uniform with the way user-defined equality works? Is it necessary to change equality in an upward incompatible manner?

ISSUE 8: DECISIONS MADE AT ROOT TYPE?

Does the user have to decide whether or not to have finalization at the root of a hierarchy of derived types, or can finalization be added in later? This is the concern that this LSN is trying to address -- making decisions at the root type is restrictive.
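To make ISSUE 3 concrete before turning to the proposals, here is a sketch of a user-defined deep copy for the String_Acc example above. The "for T'Copy use" attribute syntax is hypothetical (modeled on the earlier Mapping proposals, and never standardized), and Deep_Copy is an invented name:

```ada
package P is
   type T is private;
   ...
private
   type String_Acc is access String;
   type T is record
      Name : String_Acc;
   end record;

   -- Hypothetical draft syntax, in the style of the earlier
   -- Mapping proposals:
   procedure Deep_Copy (Target : in out T; Source : in T);
   for T'Copy use Deep_Copy;
end P;

package body P is
   procedure Deep_Copy (Target : in out T; Source : in T) is
   begin
      -- Allocate a fresh Name so the two objects never share storage;
      -- a reference-counted design would instead bump a count here.
      Target.Name := new String'(Source.Name.all);
   end Deep_Copy;
end P;
```

With such a user-defined copy, clients may copy T objects freely without causing double deallocation of Name.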
----------------

Now we turn to the individual proposals, and discuss how they address each of the above issues:

  - PROPOSAL A: FULL-BLOWN FINALIZATION
  - PROPOSAL B: COPY ONLY FOR LIMITED TYPES
  - PROPOSAL C: NO USER-DEFINED COPY
  - PROPOSAL D: RESTRICTION TO ROOT TYPE
  - PROPOSAL E: DERIVATION FROM CONTROLLED

----------------------------------------------------------------

PROPOSAL A: FULL-BLOWN FINALIZATION

This is similar to the MI-OO03 proposal. Finalization is allowed for any type.

ISSUE 1: SYNTAX. We'll use this syntax, since people seem to like it better than the MS;2.0 proposal:

    procedure My_Finalize_Operation(X: T);
    for T'Finalize use My_Finalize_Operation;

Open issue: Is the fact that finalization exists for a given type an abstract property of the type? Should there be rules about the placement of "for T'FINALIZE use..." that ensure it is exported from a package (for a private type)? In other words, should the programmer of a derived type be able to know whether the parent type has finalization?

ISSUE 2: INITIALIZE AND FINALIZE ARE ONE-FOR-ONE. By definition.

ISSUE 3: IS USER-DEFINED COPY ALLOWED? Yes. Some syntax such as "for T'COPY use ..." could be used. In addition, we prevent copy for parameter passing and function return by saying that all finalizable types are passed by reference. Possible problem: the compiler doesn't always know whether an object is finalizable, since a derived type might have added finalization. Possible problem: elementary types are now sometimes required to be passed by reference. Possible problem: user-defined copy on non-limited types is a bag of worms.

ISSUE 4: INTENDED IMPLEMENTATION STRATEGY. It's not clear how to optimize away the links for (the majority of) objects that have no finalization. A similar comment applies to the pointer (or whatever) to the finalize operation.

ISSUE 5: FINALIZATION OF CONSTANTS. Presumably, finalization must take an 'in' parameter, since constants are generally allowed. This means that if the finalization operation wants to pass a component to an instance of UNCHECKED_DEALLOCATION, it must make a copy of that component into a temp.

ISSUE 6: ALLOWED FOR NESTED TYPES? Open issue. The simplest semantically is to say Yes. But there could be an explicit restriction in the language.

ISSUE 7: COMPOSABILITY. Finalization of components happens after finalization of the composite object (because finalization of the object might access components). Finalization of a parent type happens after finalization of the derived type (because the derived type's finalization might access components of the parent). I don't know what the right answer is in the case where there are both components and a parent. The copy operation of a type composes in the expected way from user-defined copy. This makes it non-uniform with user-defined equality -- but that's better than upward incompatibility.

ISSUE 8: DECISIONS MADE AT ROOT TYPE? Finalization may be added at any point in the type hierarchy.

Note the relationship of Issues 7 and 8. If finalization can be added anywhere, it is important that finalization compose on derivation. Suppose T2 is derived from T1, and that T2 has finalization, but T1 does not. Suppose somebody wants to add finalization for T1. That somebody can't know about the existence of T2, in general. Therefore, it had better not be the responsibility of T2 to call T1's finalization. On the other hand, if finalization can only be added for a root type, then this scenario cannot occur. The programmer of T2 will always know whether T1 has finalization, and can therefore call it explicitly.

----------------------------------------------------------------

PROPOSAL B: COPY ONLY FOR LIMITED TYPES

This is the same as Proposal A, except that the user-defined copy operation is only allowed on a limited type. For a type that becomes non-limited after its declaration, the copy operation is not invoked in certain regions.
It's not clear whether defining a copy operation automatically makes a limited type become non-limited. In any case, it should be legal for a type extension of a non-limited type to contain a limited component, so long as the user defines copy and "=" operations. Or perhaps not doing so would make the extension abstract. (Recall that an important aspect of our OOP proposal is that operations cannot be subtracted on type extension. When doing a dispatching call, it is not necessarily known at compile time what body will be executed, but it IS known that there is a body to execute.)

If one needs to finalize an access type (say, for doing reference counting), one must put the access type inside a limited record, and finalize the record.

ISSUE 1: SYNTAX. As for Proposal A.

ISSUE 2: INITIALIZE AND FINALIZE ARE ONE-FOR-ONE. By definition.

ISSUE 3: IS USER-DEFINED COPY ALLOWED? Yes, but only for limited types.

ISSUE 4: INTENDED IMPLEMENTATION STRATEGY. Now only limited types need the overhead of the link fields, and the pointer to the finalize operation.

ISSUE 5: FINALIZATION OF CONSTANTS. No problem -- constants never exist. If the type becomes non-limited, then finalization is not done in that region.

ISSUE 6: ALLOWED FOR NESTED TYPES? Same as Proposal A.

ISSUE 7: COMPOSABILITY. As for Proposal A.

ISSUE 8: DECISIONS MADE AT ROOT TYPE? Finalization may be added at any point in the type hierarchy.

----------------------------------------------------------------

PROPOSAL C: NO USER-DEFINED COPY

Finalization is allowed for any limited type. User-defined copy is not allowed. Any type with finalization is passed by reference, even if it turns out to be an elementary type.

ISSUE 1: SYNTAX. Same as Proposals A and B.

ISSUE 2: INITIALIZE AND FINALIZE ARE ONE-FOR-ONE. By definition.

ISSUE 3: IS USER-DEFINED COPY ALLOWED? No. But most finalizable types are not copied a lot anyway. Parameter passing and function return are by reference.

ISSUE 4: INTENDED IMPLEMENTATION STRATEGY. Same as Proposal B.

ISSUE 5: FINALIZATION OF CONSTANTS. No problem -- same as Proposal B.

ISSUE 6: ALLOWED FOR NESTED TYPES? Same as Proposals A and B.

ISSUE 7: COMPOSABILITY. As for Proposals A and B.

ISSUE 8: DECISIONS MADE AT ROOT TYPE? Finalization may be added at any point in the type hierarchy. However, limitedness must be decided at the root type, since the user cannot add a copy operation to a derived type. Therefore, if the user wants to add finalization to a derived type, he will often have to go back and change the root type to make it limited.

----------------------------------------------------------------

PROPOSAL D: RESTRICTION TO ROOT TYPE

Same as Proposal C, except that when you say "for T'FINALIZE use...", T must be a non-derived type, or it must be a derived type whose parent has finalization.

ISSUE 1: SYNTAX. Same as Proposals A, B, and C.

ISSUE 2: INITIALIZE AND FINALIZE ARE ONE-FOR-ONE. By definition.

ISSUE 3: IS USER-DEFINED COPY ALLOWED? No.

ISSUE 4: INTENDED IMPLEMENTATION STRATEGY. The link fields, plus the pointer to the finalization operation, can be allocated at the same place in each object, and only when actually needed.

ISSUE 5: FINALIZATION OF CONSTANTS. No problem -- same as Proposals B and C.

ISSUE 6: ALLOWED FOR NESTED TYPES? Same as Proposals A, B, and C.

ISSUE 7: COMPOSABILITY. As for Proposals A, B, and C.

ISSUE 8: DECISIONS MADE AT ROOT TYPE? Yes, both limitedness and finalization are decided upon at the root type, and cannot be changed for a derived type.

----------------------------------------------------------------

PROPOSAL E: DERIVATION FROM CONTROLLED

This is essentially the proposal documented in MS;4.6. The user makes a type finalizable by deriving from a special tagged type called CONTROLLED, which has INITIALIZE and FINALIZE operations. One change from MS;4.6 is that FINALIZE is an abstract operation, which means that spelling errors will be caught at compile time.
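As a sketch of the Proposal E style (System.Controlled, INITIALIZE, and FINALIZE are the names used in this discussion; the package and component names are invented for illustration):

```ada
with System;
package Name_Holder is
   type T is new System.Controlled with private;
private
   type String_Acc is access String;
   type T is new System.Controlled with record
      Name : String_Acc;
   end record;

   -- FINALIZE is abstract in CONTROLLED, so forgetting (or misspelling)
   -- this overriding declaration is caught at compile time:
   procedure Finalize (X : in out T);
end Name_Holder;
```

No attribute-definition clause is needed; deriving from CONTROLLED is itself the indication that objects of T require finalization.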
One advantage is that many of the semantic issues are handled for free -- there is no need for a lot of special rules about finalizable objects, since most of those rules follow from the rules about type extension. Similarly, many of the implementation issues go away -- see, for example, below where we describe how the link fields are allocated.

ISSUE 1: SYNTAX. No syntax.

ISSUE 2: INITIALIZE AND FINALIZE ARE ONE-FOR-ONE. By definition.

ISSUE 3: IS USER-DEFINED COPY ALLOWED? No. CONTROLLED is a limited type, so all of its descendants are limited. This prevents most copy operations (e.g. assignment). Parameter passing and function return are by reference. Controlled types are "inherently limited", which means that all views of the type are limited. Inherently limited types do not become non-limited. This means that there is no region in which copy is allowed; therefore there is no region in which finalization should be turned off. (Note that the "inherently limited" concept is needed in any case; it would be disastrous, for example, to pass a protected object by copy.)

ISSUE 4: INTENDED IMPLEMENTATION STRATEGY. The link fields are simply declared in type CONTROLLED, and are inherited by all controlled types. In the normal OOP implementation strategy, the component offsets do not change. There is no need for a special pointer from each object to the finalization operation. Instead, the tag field is used to find the finalization operation in the normal way -- by doing a dispatching call.

In the other proposals, the link fields were hidden dope added by the implementation. In this proposal, however, the user is aware of their existence -- the type is explicitly derived from CONTROLLED, so it makes sense that whatever fields CONTROLLED has (even if private) also exist in the user's type.

Finalization of an object of unknown tag is automatic, since FINALIZE is a dispatching operation, and the user has control over composition with the parent type.

ISSUE 5: FINALIZATION OF CONSTANTS.
No problem -- since controlled types are inherently limited, the finalize operation can simply take an 'in out' parameter.

ISSUE 6: ALLOWED FOR NESTED TYPES? No. Type CONTROLLED is declared at library level, and the normal rules of type extension prevent controlled types from being more deeply nested.

ISSUE 7: COMPOSABILITY. If a component needs finalization, it can be controlled, too. Component finalization happens after that of the composite object. Finalization for multiple components is done in the reverse order of initialization (which is done in an arbitrary order). A derived type's finalization overrides that of the parent. If the user wishes to make both happen, then the derived finalization should call the parent's, in the usual "pass-the-buck" OOP style. This could be viewed as an advantage or a disadvantage: it is less automatic, but the user has control over when (and whether) it happens. Note that it is the answer to Issue 8 that allows this simplification to be safe.

ISSUE 8: DECISIONS MADE AT ROOT TYPE? Yes, both limitedness and finalization are decided upon at the root type, and cannot be changed for a derived type. However, in most examples we have worked with, the purpose of adding finalization to a type extension is to finalize a new component, and simply making that component controlled solves the problem in most cases. In some cases, the new finalization is really for the object as a whole, in which case one must define a type for the component that has an access discriminant. The access discriminant is then initialized to point to the containing object. This method is outlined in LSN-1033.

The add-a-controlled-component mechanism can be encapsulated in a generic package, which takes a finalization operation as a formal procedure. This method has been documented elsewhere. Another workaround is to have a coding convention: make all types (or all "interesting" types) derived from CONTROLLED, just in case someone wants to add finalization later.
There are probably some applications where such a coding convention makes sense. But applications where the per-object overhead is intolerable will be happy that the "coding convention" has not been codified as a language rule.

----------------------------------------------------------------

COMPARISON OF PROPOSALS:

Proposal A is far too complex, and introduces too many semantic anomalies. (What, for example, happens if a discriminant is of a finalizable type? Discriminants are not passed by reference, surely.)

Proposal B eliminates the restriction that people are concerned about. However, it requires the user-defined copy feature.

Proposal C eliminates the restriction, but there is still the restriction that you can't add limitedness to a derived type. The ability to add finalization is of small comfort to the programmer who has to go back to the root type to make it limited. Removing the restriction to limited types requires user-defined copy, turning this into Proposal B.

Proposal D doesn't even address the concerns that were raised, although there is a workaround. It is mostly the same as Proposal E, except that restrictions must be stated explicitly in the RM, instead of following logically from the existing restrictions on type extensions. Proposal D also raises the composability issue -- how to define it.

Proposal E doesn't address the concern either, although there is a workaround. Its advantage is ease of description and ease of implementation.

Proposal B is the simplest one that fully addresses the concern. Proposal E is the simplest one that does not address the concern. Therefore, I think the choice boils down to whether the benefits of Proposal B are worth the cost.
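The access-discriminant workaround mentioned under Proposal E (ISSUE 8) might be sketched as follows. The names here are invented, and the full technique is in LSN-1033; the idea is a controlled component whose access discriminant designates the enclosing object, so that finalizing the component can finalize the whole:

```ada
with System;
package Whole_Object_Cleanup is
   type Resource;  -- a type whose finalization concerns the whole object

   -- A controlled component type; its access discriminant points back
   -- at the object that contains it:
   type Guard (Enclosing : access Resource) is
      new System.Controlled with null record;
   procedure Finalize (G : in out Guard);
   -- Finalize operates on G.Enclosing.all, i.e. the containing object.

   type Resource is limited record
      -- ... the real components ...
      Cleanup : Guard (Resource'Access);
      -- "Resource" here denotes the current instance, so each object's
      -- Cleanup component designates that object itself.
   end record;
end Whole_Object_Cleanup;
```

When a Resource object goes away, its Cleanup component is finalized, and Finalize reaches the enclosing object through the discriminant.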
-------------

!topic LSN on Finalization in Ada 9X
!from Stef Van Vlierberghe 1992-10-15 23:02:16 <>
!reference MS-7.4.6();4.6
!reference 92-1334.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

Could someone correct the following interpretation of both LSNs and the generic finalization, in case it is a misunderstanding?

The example I chose is an abstract data type T that is implemented as an access type A designating a type D. A needs finalization. D contains just a NEXT pointer.

The finalization-list implementation model :
--------------------------------------------

I instantiate package MAKE_CONTROLLED (see 92-1334.a) to obtain a controlled type holding an A value.

    type D;
    type A is access D;

    type T is new System.Controlled with record
       PTR : A;
    end record;

    procedure INITIALIZE (OBJECT : in out T);
    procedure FINALIZE (OBJECT : in out T);

    package I is new CONTROLLED ( T, INITIALIZE, FINALIZE );

    type D is record
       NEXT : I.CONTROLLED_T;
       -- Normally VALUE would be called NEXT and we would find
       -- some DATA here...
    end record;

Then the memory map of the designated object looks like below (not counting additional internal descriptors).

           *--------------------------*
   A --->  | Next_Controlled_Object   |  -> double linked on the heap
           *--------------------------*  <-
           | Storage_Pool_Follower    |
           *--------------------------*
      ---- | First_Component          |     single linked components
      |    *--------------------------*
      |    | Controlled_A'TAG         |  <---- must become a tagged type
      |    *--------------------------*
      |    | NEXT                     |     the real information
      |    *--------------------------*
      -->  | Next_Controlled_Object   |     single link to next component
           *--------------------------*
           | Storage_Pool_Follower    |     used in the heap only
           *--------------------------*
           | Access_discriminant      |  ----- to know what to finalize
           *--------------------------*

There may be more logical orderings, but I believe all these components will be there.
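For orientation, the generic instantiated above might be declared along the following lines. This is only a guess at its shape (the actual specification is in 92-1334.a, which calls the package MAKE_CONTROLLED although the instantiation above names it CONTROLLED), and the access-discriminant plumbing implied by the diagram is omitted:

```ada
with System;
generic
   type Data is limited private;
   with procedure Initialize (Object : in out Data);
   with procedure Finalize   (Object : in out Data);
package Controlled is
   -- Wraps a Data value in a controlled type, so finalization can be
   -- attached to a component whose enclosing type is not controlled.
   type Controlled_T is new System.Controlled with private;
private
   type Controlled_T is new System.Controlled with record
      Value : Data;
   end record;
   procedure Initialize (Object : in out Controlled_T);
   procedure Finalize   (Object : in out Controlled_T);
end Controlled;
```

The Access_discriminant entry in the diagram would be the mechanism by which such a wrapper finds the value it must pass to the formal Finalize.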
The PC based model :
--------------------

    type D;
    type T is access D;

    procedure T'FINALIZE (OBJECT : in out T);

    type D is record
       NEXT : T;
       -- Normally VALUE would be called NEXT and we would find
       -- some DATA here...
    end record;

And the memory map of D would look like :

           *--------------------------*
   A --->  | Controlled_A_Value       |
           *--------------------------*

-------------

!topic LSN on Finalization in Ada 9X
!from Stef Van Vlierberghe 1992-10-15 23:02:22 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference LSN-1046
!keywords controlled types, finalization
!discussion

I would like to have another serious look at the problem of implementing the PC based model. Let's take the example given in the LSN:

> Consider, for example, an array of finalizable components. Suppose
> that while initializing the array, an exception occurs during
> initialization of the seventeenth component. According to ISSUE 2, we
> must ensure that exactly those components that were initialized must
> be finalized. How do we remember which those were?

As this partial elaboration problem can only arise during elaboration, let's consider generating (at compile time), for each type, an (internal) operation T'ELABORATE which will either completely elaborate, or finalize what has been elaborated:

    type COMP_T is ...;
    -- For this type we already generated :
    procedure ELABORATE ( COMP : in out COMP_T ) is ...;

    type ARRAY_T is array ( INDEX_T range <> ) of COMP_T;

    -- For this type we generate :
    procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is
    begin
       UNTIL_EXCEPTION :
       for I in OBJ'RANGE loop
          begin
             COMP_T'ELABORATE ( OBJ(I) );
          exception
             when others =>
                if I /= OBJ'FIRST then
                   -- Finalize only the components already elaborated.
                   for J in OBJ'FIRST .. INDEX_T'PRED(I) loop
                      COMP_T'FINALIZE( OBJ(J) );
                   end loop;
                end if;
                raise;
          end;
       end loop UNTIL_EXCEPTION;
    end ARRAY_T'ELABORATE;

Doesn't look that complex to me; let's look at a record type:

    type REC_T ( DISCRIM : DISCRIM_T ) is record
       case DISCRIM is
          when VAL_1 =>
             COMP_1a, COMP_1b, ... : COMP_T;
          when VAL_2 =>
             COMP_2 : ARRAY_T (1..5);
       end case;
    end record;

    -- For this type we generate :
    procedure REC_T'ELABORATE ( OBJ : in out REC_T ) is
    begin
       case OBJ.DISCRIM is
          when VAL_1 =>
             COMP_T'ELABORATE ( OBJ.COMP_1a );
             begin
                COMP_T'ELABORATE ( OBJ.COMP_1b );
                begin
                   ...
                exception
                   when others =>
                      COMP_T'FINALIZE( OBJ.COMP_1b );
                      raise;
                end;
             exception
                when others =>
                   COMP_T'FINALIZE( OBJ.COMP_1a );
                   raise;
             end;
          when VAL_2 =>
             ARRAY_T'ELABORATE ( OBJ.COMP_2 );
       end case;
    end REC_T'ELABORATE;

Perhaps a little bit more difficult, but it basically still is a somewhat decorated copy of the type descriptor. This assumes that finalization support is *not* provided for elementary types (see 92-1165.a if you are unconvinced that this is a reasonable restriction, or if you think it is a more severe restriction than taggedness and limitedness).

So, perhaps the problem is a declarative part?

    declare
       OBJ_1a, OBJ_1b : COMP_T;
       COMP_2 : ARRAY_T (1..5);
    begin
       SOME_CODE;
    exception
       ...
    end;

But then again, this is largely equivalent to the case for records, and the finalization handlers required to undo partial elaboration can be used to finalize after exceptions in the regular case too:

    declare
       OBJ_1 : COMP_T;
       OBJ_2 : ARRAY_T (1..5);
    begin
       COMP_T'ELABORATE(OBJ_1);
       begin
          ARRAY_T'ELABORATE(OBJ_2);
          begin
             begin
                SOME_CODE;
             exception
                ...
             end;
          exception
             when others =>
                ARRAY_T'FINALIZE (OBJ_2);
                raise;
          end;
       exception
          when others =>
             COMP_T'FINALIZE (OBJ_1);
             raise;
       end;
    end;

Of course, this would never work if done by a preprocessor, but surely it outlines well the implementation strategy to be used by the code generator.

> Recall that initialization of the array need not take place in any
> particular order.

Not really a restriction on the implementation above, is it?

> With the per-task list strategy, it's easy -- if it's on the list,
> then it got initialized.

Acknowledged.
But the impact on the user of imposing limitedness and taggedness, especially when you can't take limitedness away, is terrible.

-------------

!topic LSN on Finalization in Ada 9X
!from Tucker Taft 1992-10-16 16:48:52 <>
!reference MS-7.4.6();4.6
!reference 92-1334.a
!reference 92-1628.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

> Could someone correct the following interpretation of both LSN's and the
> generic finalization, in case it is a misunderstanding ?
>
> The example I chose is an abstract data type T that is implemented as
> an access type A designating a type D. A needs finalization. D contains
> just a NEXT pointer.
>
> The finalization-list implementation model :
> --------------------------------------------
>
> I instantiate package MAKE_CONTROLLED (see 92-1334.a) to obtain a controlled
> type holding an A value.

There is no need to use the generic if the type is an extension of System.Controlled. The generic was a way to demonstrate how to add finalization as part of a type extension, when the parent type was not itself a "controlled" type.

>     type D;
>     type A is access D;
>
>     type T is new System.Controlled with record
>        PTR : A;
>     end record;
>
>     procedure INITIALIZE (OBJECT : in out T);
>     procedure FINALIZE (OBJECT : in out T);
>
>     package I is new CONTROLLED ( T, INITIALIZE, FINALIZE );

This instantiation is redundant. Type T already has finalization.

>     type D is record
>        NEXT : I.CONTROLLED_T;

    NEXT : T;

is all you need.

>        -- Normally VALUE would be called NEXT and we would find
>        -- some DATA here...
>     end record;
>
> Then the memory map of the designated object looks like below (not counting
> on additional internal descriptors).
>            *--------------------------*
>    A --->  | Next_Controlled_Object   |  -> double linked on the heap
>            *--------------------------*  <-
>            | Storage_Pool_Follower    |
>            *--------------------------*
>       ---- | First_Component          |     single linked components
>       |    *--------------------------*
>       |    | Controlled_A'TAG         |  <---- must become a tagged type
>       |    *--------------------------*
>       |    | NEXT                     |     the real information
>       |    *--------------------------*
>       -->  | Next_Controlled_Object   |     single link to next component
>            *--------------------------*
>            | Storage_Pool_Follower    |     used in the heap only
>            *--------------------------*
>            | Access_discriminant      |  ----- to know what to finalize
>            *--------------------------*
>
> There may be more logical orderings, but I believe all these components
> will be there.

Once you get rid of the redundant generic, some of the above goes away. Also, the above doesn't correspond exactly to what was proposed in LSN-021, though of course LSN-021 was just suggesting one of presumably many different implementation approaches.

What follows is what I would expect for a component declared as "Next : T", using the implementation model described in LSN-021. Note that this is not the required implementation model. We certainly allow a "PC-map"-based approach, though for objects in the heap some kind of linked list seems inevitable, since the PC can't tell you which heap objects have and have not been Unchecked-Deallocated. An alternative approach would only incur the overhead of the "Storage_Pool_Follower" back-pointer for objects allocated in the heap.

    *--------------------------*
    | T'TAG                    |
    *--------------------------*
    | Next_Controlled_Object   |  -> double linked on the heap
    *--------------------------*  <-
    | Storage_Pool_Follower    |
    *--------------------------*
    | NEXT                     |     the real information
    *--------------------------*

The Tag points to a descriptor which identifies the finalization operation to be performed. The Next_Controlled_Object is the link for use with a per-task finalization list, and it plus the Storage_Pool_Follower is for use with a per-storage-pool finalization list.

> The PC based model :
> --------------------
>
>     type D;
>     type T is access D;
>
>     procedure T'FINALIZE (OBJECT : in out T);
>
>     type D is record
>        NEXT : T;
>        -- Normally VALUE would be called NEXT and we would find
>        -- some DATA here...
>     end record;
>
> And the memory map of D would look like :
>
>            *--------------------------*
>    A --->  | Controlled_A_Value       |
>            *--------------------------*

It is not clear how this would work on a heap if Unchecked_Deallocation is being used, since it must be possible to find all of the non-deallocated objects for finalization at the end of the scope of the collection/storage-pool.

In any case, we don't mean to preclude the use of a PC-map-based approach, but we believe it will be much more implementation effort, requiring the insertion by the compiler of implicit exception and abortion/ATC handlers in the middle of expressions, declarative parts, subprogram calls (remember that access values must be passed by copy), assignment statements, aggregates, etc.

From the very beginning, we have been worried about the implementation burden of Ada 9X in general, and finalization in particular. Our current approach allows a straightforward implementation using a linked-list approach, but it doesn't preclude a PC-map-based approach, or something in between. It does require that all such types be extensions of System.Controlled, implying at least a tag as space overhead.
However, given that a generic like "Make_Controlled" can be implemented using System.Controlled, an implementor could special-case such a generic to provide a zero-overhead "controlled" type if they supported complete PC-based finalization, as follows:

    generic
        type Inner is limited private;
        with procedure Finalize(X : in out Inner);
    package Cheap_Finalization is
        type Outer is limited private;  -- This type is finalizable.
    private
        -- Possible implementation, using a "magic" pragma, rather
        -- than deriving from System.Controlled
        type Outer is record
            X : Inner;
        end record;
        pragma PC_Map_Based_Finalization(Outer, Finalize);
    end Cheap_Finalization;

You are still stuck with a limited type in this case. Supporting finalization on non-limited types is yet more work, requiring user-defined assignment and/or copy.

In any case, please believe that we are not against your developing a groundswell in favor of full finalization and user-defined assignment. Our earlier attempts were cut out due to scope concerns long ago, and your proposal of several months ago has had no official comments, either pro or con, from any other reviewer. We simply haven't had the time to pursue it further ourselves, and we are reluctant to devote our resources to it until we see significant support for it from multiple reviewers, given our own sense that it is more implementation effort than most Ada vendors are willing to accept, particularly if you are going to require something like a PC-map-based approach.

Note that even supporting function return of finalizable types has stirred protests from some reviewers. Going all the way to user-defined assignment will be harder. We would rather have the reviewers battle this one out themselves before getting us into the middle of the debate.
-Tuck

-------------

!topic LSN on Finalization in Ada 9X
!from 1992-10-22 23:03:51 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

> There is no need to use the generic if the type is an extension of
> System.Controlled. The generic was a way to demonstrate how to add
> finalization as part of a type extension

Thank you for correcting this mistake. I wanted to look at the implementation model when using the generic, so the example should be:

    type D;
    type A is access D;

    procedure INITIALIZE (OBJECT : in out A);
    procedure FINALIZE (OBJECT : in out A);

    package I is new CONTROLLED ( T, INITIALIZE, FINALIZE );

    type D is record
        NEXT : I.CONTROLLED_T;
        -- Normally VALUE would be called NEXT and we would find
        -- some DATA here...
    end record;

> Once you get rid of the redundant generic, some of the above goes away.

I expect only the Access_discriminant would go away.

> though for objects in the heap some kind of linked list seems
> inevitable, since the PC can't tell you which heap objects have and
> have not been Unchecked-Deallocated.

True in the "regular" case.

However, as we have finalization capability for the access types, we may know that all objects in a pool will get unchecked deallocated before the pool finalizes. Although it may be hard for a compiler to infer this property (as it can hardly "see" that the finalization is going to ensure deallocation), a user may definitely write his own storage pool that does a null finalization. Or even better, it could verify that all objects actually were deallocated to validate the assumption above (checking a count of allocated objects).

I would therefore insist that the double links are a property of the pool implementation and not really part of the type itself.
Adding comments for your clarification, in the canonical LSN-021 model, type D would look like:

       *--------------------------*
A ---> | Next_Controlled_Object   | ->   double linked on the heap
       *--------------------------* <-
       | Storage_Pool_Follower    |
       *--------------------------*
       -- Possible implementation of storage pool finalization

       *--------------------------*
  ---- | First_Component          |      single linked components
  |    *--------------------------*
  |    -- The canonical LSN-021 implementation of a dynamic object.
  |
  |    *--------------------------*
  |    | Controlled_A'TAG         | <---- must become a tagged type
  |    *--------------------------*    |
  -->  | Next_Controlled_Object   | -|  | single link to next component
       *--------------------------*  |
       | Storage_Pool_Follower    | -|  | Used in the heap only
       *--------------------------*  |
       -- The canonical LSN-021 implementation of a controlled component

       *--------------------------*
       | NEXT                     |      the real information
       *--------------------------*

       *--------------------------*  |
       | Access_discriminant      | ----- To know what to finalize.
       *--------------------------*
       -- The contribution of the generic solution

Against the PC based model:

       *--------------------------*
A ---> | Next_Controlled_Object   | ->   double linked on the heap
       *--------------------------* <-
       | Storage_Pool_Follower    |
       *--------------------------*
       -- Possible implementation of storage pool finalization

       *--------------------------*
       | NEXT                     |      the real information
       *--------------------------*

> requiring the insertion by the compiler of implicit exception and
> abortion/ATC handlers ...

Regarding abortion/ATC I would like to echo a private comment from Brian Dobbing, saying that he envisaged a change in his runtime implementation where he needed to wake up the task being aborted to make it execute the finalization code, as this code may reference local context (perhaps rare, but can be done).
So, finalization has already impacted this part of the job, and I imagine that treating abortion very similarly to an exception would take care of both problems.

> ... requiring the insertion by the compiler of implicit exception and
> abortion/ATC handlers ... in the middle of expressions, declarative
> parts, ... aggregates, etc.

Yes, this requires the ability to insert frames in expression contexts. In fact, I believe some compilers do this already when inlining function calls.

    declare
        function F ( ) return CONTROLLED_T is ...;
    begin
        ...
        P ( F ( X ) );
        ...
    end;

would need to be transformed into something like:

    declare
        ...
    begin
        ...
        declare
            TMP : CONTROLLED_T;
        begin
            T'INITIALIZE ( TMP );

            TMP := F ( X );
            -- Call F and "move" not-finalized result in TMP; can't work for
            -- self-referencing stuff..

            F ( TMP, X )
            -- Call F and let it use user-defined assignment to copy result to
            -- TMP before finalizing the locals. Works for self-referencing
            -- stuff, and simplifies the code of F.

            -- I'm not really in favor of the 3rd alternative that implies
            -- holding F's stack frame until after the call to P, which is even
            -- more complex than the 2 above, and must be done at a late stage of
            -- code generation.

            P ( TMP );
            T'FINALIZE ( TMP );
        exception
            when others =>
                T'FINALIZE ( TMP );
                -- The exception handler to be generated in the middle of ...
        end;
        ...
    end;

I don't think that a transformation as shown above is that difficult to do (when compared to the function return of self-referential objects).

> ... subprogram calls (remember that access values must be passed by
> copy)...

Although the above applies here too, the alternative I'm thinking about would restrict the use of user-defined assignment and finalization to non-scalar types, and require them to be passed by reference (as inherently limited types). (92-1165.a).
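The first alternative in the transformation above can be rendered as running code. This C++ sketch is purely illustrative (none of these names come from the MRT proposal): the controlled temporary created for F(X) inside the call P(F(X)) must be finalized exactly once, whether P returns normally or raises.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records each initialize/finalize so the once-and-only-once property
// of temporary cleanup can be observed.
static std::vector<std::string> trace;

struct Tmp { int v = 0; };
void initialize(Tmp& t) { trace.push_back("init"); t.v = 0; }
void finalize(Tmp&)     { trace.push_back("final"); }

int f(int x) { return x + 1; }          // plays the role of F
void p(const Tmp& t) {                  // plays the role of P; may raise
    if (t.v < 0) throw std::runtime_error("p failed");
}

// The "compiler output" for P ( F ( X ) ) with a controlled temporary.
void call_p_of_f(int x) {
    Tmp tmp;
    initialize(tmp);                    // T'INITIALIZE ( TMP )
    try {
        tmp.v = f(x);                   // "move" F's result into TMP
        p(tmp);
        finalize(tmp);                  // T'FINALIZE ( TMP )
    } catch (...) {
        finalize(tmp);                  // the handler inserted mid-expression
        throw;                          // re-raise
    }
}
```

Whether P raises or not, the trace always shows one "init" paired with one "final" -- which is exactly the invariant the inserted handler exists to preserve.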
> assignment statements

Here one could *require* user-defined assignment together with user-defined finalization (such as LSN-1046 strongly suggests); then the code generated for

    A := B;

becomes

    T'ASSIGN ( A, B )  -- Both passed by reference.

without any need for finalization (other than the one discussed above in case B is an expression).

> You are still stuck with a limited type in this case.

For me that's a bigger concern than the "greedy" canonical model.

> In any case please believe that we are not against your developing a
> ground swell in favor of full finalization and user-defined
> assignment.

Thanks.

> Given our own sense that it is more implementation effort than most
> Ada vendors are willing to accept,

If that sense becomes fact, then it's an entirely different matter. My concern is only that the various implementation strategies have been insufficiently explored, so that we cannot yet estimate the cost of the cheapest one.

> We would rather have the reviewers battle this one out themselves
> before getting us into the middle of the debate.

Fair enough. I hope it is still O.K. to use the ada9x-mrt mailing list to try to find someone who is willing to oppose my viewpoint.

I believe the "full" support is the only acceptable solution. Unless someone is willing to point out what exactly is too costly with it (not counting vague notions such as cans of worms), I will give this all the weight I can.

In particular, I would appreciate some hints why 92-1630.a would be infeasible or hard to implement.

-------------

!topic LSN on Finalization in Ada 9X
!from R.R. Software (Randy Brukardt) 1992-10-23 19:24:41 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

I have made it no secret that I would like to see a user-defined assignment capability.
I have also supported investigating other approaches to finalization. However, I have not seen one that is better than the current MRT proposal, and it can be improved by adding user-defined assignment and removing the weird function rules (eliminating the real problems of limitedness), without needing additional changes.

> Although the above applies here too, the alternative I'm thinking
> about would restrict the use of user-defined assignment and
> finalization to non-scalar types, and require them to be passed by
> reference (as inherently limited types). (92-1165.a).

I agree. But earlier you said...

> However, as we have finalization capability for the access types...

No, we don't. Access types are a kind of scalar type. (OK, I know that's not strictly true. However, finalization should be defined only on 'composite' types; 'elementary' types are very similar for a compiler, and should all be treated the same. Separating access types from their natural brethren would require a lot of changes in existing compilers. [Certainly in our compiler!])

> In particular, I would appreciate some hints why 92-1630.a would be
> infeasible or hard to implement.

As I like to say, there is almost nothing that is impossible to implement, but there are a lot of things that are too hard to implement. So let's discuss reasons why 1630 would be too hard to implement.

(I assume in my analysis that finalization will be commonly used, and will often occur in components of composite types. I think this assumption is safe, since even simple things like the File_Types of Ada 83 will undoubtedly be finalized.)

1) It assumes that compilers are prepared to handle hundreds or even thousands of exception handlers in a single subprogram. I would expect that many compilers have upper limits on the number of handlers that can occur in a subprogram. Even those that don't may have algorithms tuned for a few handlers.
Suddenly dumping hundreds of handlers on them will cause them to break down, thus requiring recoding. The effect is that compilers will have to be redesigned even if the basic algorithms stay the same.

2) It assumes that a PC-map implementation of exceptions is used. A non-PC-map implementation of exceptions probably uses a stack frame of some kind for each handler. The run-time cost of hundreds of handlers would be prohibitive.

The problem with the PC-map implementation is that it assumes that the PC of the exact point where the exception occurred is available. However, that is not true for many real machines when hardware traps are considered. It is even more of a problem for implementations that run on top of an operating system. For instance, a portable UNIX runtime cannot determine anything about the location of a hardware trap (as UNIX provides no information about the location of signals). Therefore, a PC-map implementation cannot be used.

3) It requires the compiler to expand all initialization and finalization actions inline. The need to ensure that components are finalized even if the entire object is not initialized means that all of the actions must be expanded inline. This prevents a compiler from using 'thunks' to do these actions. Using 'thunks' has some advantages: the code size is less; the complexity is localized better (both in the compiler and in the object code); and the amount of stuff which needs to be stored in the symbol table is less (because some things are not needed after the 'thunks' are built).

4) The net memory usage from the implementation suggested in 1630 is much higher than in the MRT's proposal. In saving 2 pointers per object (or component), you are willing to expend hundreds of bytes of code (and possibly some data) for each of those objects (or components). That appears to me to be a lousy trade-off in most circumstances.

Essentially, there appears to be no reason to expect a PC-map implementation to be the preferred one.
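For readers unfamiliar with the scheme point 2) is arguing about, here is a deliberately simplified sketch. The table and its fields are invented for illustration: instead of pushing a handler frame at run time, the compiler emits a static table mapping code ranges to the cleanup required if an exception occurs there. The whole scheme presumes that the faulting PC is available, which, as noted above, a portable UNIX runtime cannot guarantee for hardware traps.

```cpp
#include <cstddef>

// Hypothetical PC-map entry: a half-open range of instruction addresses
// and the number of controlled objects live in that range.
struct CleanupEntry {
    unsigned pc_lo, pc_hi;   // code range [pc_lo, pc_hi)
    int live_objects;        // controlled objects to finalize there
};

// At raise time, look up the faulting PC to learn what needs finalizing.
// Zero run-time cost on the normal path: no frames are ever pushed.
inline int objects_to_finalize(const CleanupEntry* table, std::size_t n,
                               unsigned pc) {
    for (std::size_t i = 0; i < n; ++i)
        if (pc >= table[i].pc_lo && pc < table[i].pc_hi)
            return table[i].live_objects;
    return 0;                // nothing controlled is live at this PC
}
```

The trade-off Randy describes is visible here: the normal path is free, but the table grows with every point where the set of initialized objects changes, and the lookup is useless if the signal delivery mechanism loses the faulting address.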
In any event, a pointer-based implementation (like the MRT's) ought to be available as a possible implementation scheme for any proposal. If the proposal also allows a PC-map scheme to be used, that's fine by me.

> Yes, this requires the ability to insert frames in expression
> contexts. In fact, I believe some compilers do this already when
> inlining function calls.

I certainly hope that this does not become necessary. A central tenet of our intermediate form is that branching in expressions is very limited. The restrictions make optimization much easier (and the results better) [and also were one of the driving forces behind using 'thunks' to do record operations]. Allowing compilers to do this is one thing, but forcing them to do it is another. It is likely to have far-reaching effects both on the optimizers themselves, and on the quality of the optimized code.

> Here one could *require* user-defined assignment together with
> user-defined finalization (such as LSN-1046 strongly suggests), then
> the code generated for
>
>     A := B;
>
> becomes T'ASSIGN ( A, B ) -- Both passed by reference.
>
> without any need for finalization (other than the one discussed above
> in case B is an expression).

This is a good idea, because it also helps clean up the function return case. However, you also have to consider the case of assignments used in initialization expressions. These require somewhat different semantics (since the left-hand side may not have been initialized yet). Still, the use of reference semantics for inherently limited types eliminates one of the problems with user-defined assignment: how do we deal with implicit copies in parameter passing?

I suspect the groundswell for user-defined assignment and better finalization is there. And I, at least, find these capabilities far more important than OOP-style polymorphism.

Randy.
-------------

!topic LSN on Finalization in Ada 9X
!from Tucker Taft 1992-10-26 11:51:04 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

> > There is no need to use the generic if the type is an extension of
> > System.Controlled. The generic was a way to demonstrate how to add
> > finalization as part of a type extension
>
> Thank you for correcting this mistake, I wanted to look at the implementation
> model when using the generic, so the example should be :

We have never proposed a generic, though we have shown how one could be implemented using System.Controlled.

> type D;
> type A is access D;
>
> procedure INITIALIZE (OBJECT : in out A);
> procedure FINALIZE (OBJECT : in out A);
>
> package I is new CONTROLLED ( T, INITIALIZE, FINALIZE );

I presume you mean "(A, INITIALIZE, FINALIZE)".

> type D is record
>    NEXT : I.CONTROLLED_T;
>    -- Normally VALUE would be called NEXT and we would find
>    -- some DATA here...
> end record;

> > Once you get rid of the redundant generic, some of the above goes away.
>
> I expect only the Access_discriminant would go away.

I don't know what your generic does, so it is a little hard to guess. But let's assume that its spec is:

    generic
        type T is limited private;
        with procedure Initialize(X : in out T) is <>;
        with procedure Finalize(X : in out T) is <>;
    package Controlled is
        type Controlled_T is limited private;
    private -- implementation defined
        -- here is a possible implementation using System.Controlled:
        type Controlled_T is new System.Controlled with record
            Obj : T;
        end record;
        -- Note: no access discriminant is required in this
        -- case since neither the formal type T nor the
        -- exported type Controlled_T is tagged.
        -- The purpose of the access discriminant was to reexport a private
        -- extension of the formal type while adding finalization.
        -- Here we are exporting a brand new type, so no
        -- access discriminant is needed.
    end Controlled;

The implementation of System.Controlled is also implementation-defined, but if you follow the suggestion from LSN-021, then it imposes the overhead of a tag and two links.

> > though for objects in the heap some kind of linked list seems
> > inevitable, since the PC can't tell you which heap objects have and
> > have not been Unchecked-Deallocated.
>
> True in the "regular" case.
>
> However, as we have finalization capability for the access types, we
> may know that all objects in a pool will get unchecked deallocated
> before the pool finalizes. Although it may be hard for a compiler
> to infer this property (as it can hardly "see" that the finalization
> is going to ensure deallocation), a user may definitely write his own
> storage pool that does a null finalization.

Finalization is not the responsibility of storage pool managers. Finalization happens before the storage pool manager gets control, both on Unchecked_Deallocation and at access-type scope exit.

> . . . Or even better, it could
> verify that all objects actually were deallocated to validate the
> assumption above (checking a count of allocated objects).
>
> I would therefore insist that the double links are a property of the
> pool implementation and not really part of the type itself.

No. As mentioned above, a storage pool manager is "unaware" of whether any of its pools contain designated objects with one or more controlled "parts," and in any case is not responsible for finalizing the objects in its pools.
> Adding comments for your clarification, in the canonical LSN-021 model,
> type D would look like :
>
>        *--------------------------*
> A ---> | Next_Controlled_Object   | ->   double linked on the heap
>        *--------------------------* <-
>        | Storage_Pool_Follower    |
>        *--------------------------*
>        -- Possible implementation of storage pool finalization
>
>        *--------------------------*
>   ---- | First_Component          |      single linked components
>   |    *--------------------------*
>   |    -- The canonical LSN-021 implementation of a dynamic object.

LSN-021 does not have the components of a given object linked together in their own chain. There is just one chain of controlled objects, independent of whether they are components or stand-alone objects. My earlier response (92-1638.a) showed what the structure would look like, based on LSN-021.

>   |    *--------------------------*
>   |    | Controlled_A'TAG         | <---- must become a tagged type
>   |    *--------------------------*    |
>   -->  | Next_Controlled_Object   | -|  | single link to next component
>        *--------------------------*  |
>        | Storage_Pool_Follower    | -|  | Used in the heap only
>        *--------------------------*  |
>        -- The canonical LSN-021 implementation of a controlled component
>
>        *--------------------------*
>        | NEXT                     |      the real information
>        *--------------------------*
>
>        *--------------------------*  |
>        | Access_discriminant      | ----- To know what to finalize.
>        *--------------------------*
>        -- The contribution of the generic solution

There is no need for an access discriminant in a generic that doesn't take a tagged type and doesn't reexport a tagged type. So, as explained in 92-1638.a, the total size of any object of type Controlled_T would be a tag, two links, and the content (one access value, in this case). Your picture above has about twice as much overhead as is necessary.
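The per-object overhead Tucker describes here (a tag, two links, then the content) can be sketched as a concrete layout. The structs below are illustrative only -- LSN-021 does not mandate any particular layout, and the field names simply follow this thread's vocabulary.

```cpp
#include <cstddef>

// Hypothetical layout for a controlled heap object per the LSN-021
// discussion: a tag identifying the Finalize operation, plus two links.
struct ControlledHeader {
    const void*       tag;                      // identifies Finalize
    ControlledHeader* next_controlled_object;   // finalization chain
    ControlledHeader* storage_pool_follower;    // heap objects only
};

// A Controlled_T wrapping a single access value: header + content,
// matching the "tag, two links, and the content" total stated above.
struct ControlledAccess {
    ControlledHeader header;
    void*            content;                   // the one access value
};
```

On conventional ABIs where all these members are pointer-sized, the totals come out to three words of overhead plus one word of content, which is the comparison made against the two-word PC-based picture in the next message.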
> Against the PC based model :
>
>        *--------------------------*
> A ---> | Next_Controlled_Object   | ->   double linked on the heap
>        *--------------------------* <-
>        | Storage_Pool_Follower    |
>        *--------------------------*
>        -- Possible implementation of storage pool finalization
>
>        *--------------------------*
>        | NEXT                     |      the real information
>        *--------------------------*

> > requiring the insertion by the compiler of implicit exception and
> > abortion/ATC handlers ...

Now we see that for objects in a heap, we are talking about 2 words of overhead versus 3. Also, you are presuming that each kind of finalizable object is on its own list, since you provide no type tag to identify the finalizable object's finalization operation. With our current proposal, all controlled heap objects of any type that are associated with access types that go out of scope at the same point may be on the same list, since each controlled object has a tag that identifies its Finalize operation.

> Regarding abortion/ATC I would like to echo a private comment from
> Brian Dobbing, saying that he envisaged a change in his runtime
> implementation where he needed to wake up the task being aborted to
> make it execute the finalization code, as this code may reference
> local context (perhaps rare, but can be done). So, finalization has
> already impacted this part of the job, and I imagine that treating
> abortion very similar to an exception would take care of both
> problems.

One of the great advantages of the linked list approach is that finalization can be done without doing a stack walkback. It also makes it very cheap to determine whether or not any finalization needs to be performed, by checking whether the linked list is null (or the same as it was when the async-select started). This is what allows ATC and finalization to work together with minimal distributed overhead. You should be careful not to presume that one implementor's approach is satisfactory for all.
We worked hard to address the concerns of those mostly worried about ATC/abort responsiveness when designing the finalization approach. Our current approach allows an implementation to bring the distributed overhead of finalization down to a negligible level, using the linked list and making a single pointer comparison to determine that no finalization needs to be done prior to performing an ATC/abort.

> > ... requiring the insertion by the compiler of implicit exception and
> > abortion/ATC handlers ... in the middle of expressions, declarative
> > parts, ... aggregates, etc.
>
> Yes, this requires the ability to insert frames in expression
> contexts. In fact, I believe some compilers do this already when
> inlining function calls.
> . . .
> I don't think that a transformation as shown above is that difficult
> to do (when comparing to the function return of self-referential
> objects).

Again, it is very hard to assess implementation difficulty in the abstract. It would be important to have real implementors commenting on this. What looks like a "straightforward transformation" on paper may simply not fit at all into the normal implementation strategy. The generation of code for aggregates in particular is already quite complex. To create an aggregate for a discriminated type with one or more finalizable components in a "careful order" that ensures that anything that gets initialized will also get finalized, and nothing more, based on a PC-map, could be very difficult in certain compiler implementations.

> . . .
> > Given our own sense that it is more implementation effort than most
> > Ada vendors are willing to accept,
>
> If that sense becomes fact, then it's an entirely different matter.
> My concern is only that the various implementation strategies have
> been insufficiently explored, such that we can't speculate on the cost
> of the cheapest yet.

There generally is no single "cheapest" approach.
It depends on the structure of the implementor's compiler and their run-time model. The approach based on System.Controlled allows the implementor to use a PC-map approach if they want to, but doesn't preclude an approach based on per-object links. If we allow finalization for most non-limited private (record) types, we are effectively precluding per-object overhead.

> > We would rather have the reviewers battle this one out themselves
> > before getting us into the middle of the debate.
>
> Fair enough. I hope it is still O.K. to use the ada9x-mrt mailing list
> to try to find someone who is willing to oppose my viewpoint.
>
> I believe the "full" support is the only acceptable solution. Unless
> someone is willing to point out what exactly is too costly with it
> (not counting vague notions such as cans of worms), I will give this
> all the weight I can.
>
> In particular, I would appreciate some hints why 92-1630.a would be
> infeasible or hard to implement.

Here is one of the most costly and complex situations, namely one involving an unconstrained record object with finalizable components. Although in your original comment on user-defined equality and assignment you chose to disallow user-defined assignment on discriminated types, you can't (without creating lots of contract model problems) preclude it from being a component of a discriminated type, which might have defaults for its discriminants, as follows:

    type Dyn_Str is private;  -- Presume this has user-defined assignment
                              -- and finalization.

    type Str_Array is array(Positive range <>) of Dyn_Str;

    subtype Arr_Length is Integer range 0..30;

    type Str_Seq(Len : Arr_Length := 0) is record
        Arr : Str_Array(1..Len);
    end record;

    X, Y : Str_Seq;
    begin
    . . .
    X := Y;

In this assignment, presume that X.Len /= Y.Len before the assignment, and hence some new Dyn_Str components of X are being created or destroyed.
Implementing the assignment "X := Y" is going to be very difficult using a PC-map-based approach while ensuring that all initialized components get finalized, and no others. It is also difficult to implement using just a T'ASSIGN operation, since the number of Dyn_Str components might not be the same on the left and the right, so you have to finalize at least those components of X that are going away (if initially X has more components than Y), or default-initialize the new components of X (if initially X has fewer components than Y), prior to assigning to them.

The number of different states that X goes through during this assignment is relatively large, making it quite painful to capture all of this faithfully just using the PC-map. One can of course create even worse nightmares, involving further composition of variant records, nested arrays, and unconstrained defaulted discriminants. However, presumably once the implementor figures out how to do it for a few of the nastiest cases, the rest will follow by composition. Nevertheless, this is one of the "cans of worms" we were talking about, and was one of the reasons we have restricted our attention to limited types, where composability of assignment is not required.

ASSIGNMENT FOR LIMITED TYPES?

Despite this doom and gloom about user-defined assignment for non-limited types, we have been thinking about some way to allow user-defined assignment for limited types. This would also allow initialization of such limited types, and might simplify the implementation of function return of such types. Our current thoughts are gravitating toward the "CLONE" procedure we suggested a while ago, and we will try to produce an LSN on this and related user-defined assignment issues sometime this week. The big advantage of "CLONE" vs. "ASSIGN" is that it works well when the left-hand side was uninitialized, which would be the case for initialization and function return.

In any case, stay tuned...
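The order of operations Tucker requires for "X := Y" when the discriminant changes can be written out explicitly: finalize the components of X that go away, default-initialize any new ones, then do the component-wise user-defined assignment. In this C++ sketch, Component stands in for Dyn_Str and a vector stands in for the discriminant-dependent array; all names are invented for illustration.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Records the order of init/final/assign so the "careful order" can be
// checked: everything initialized is eventually finalized, nothing more.
static std::vector<std::string> ops;

struct Component { std::string val; };
void initialize(Component& c) { ops.push_back("init");  c.val.clear(); }
void finalize(Component& c)   { ops.push_back("final"); c.val.clear(); }
void assign(Component& l, const Component& r) {
    ops.push_back("assign");
    l.val = r.val;                     // the user-defined T'ASSIGN body
}

// X := Y for a sequence whose length (the "discriminant") may differ.
void assign_seq(std::vector<Component>& x, const std::vector<Component>& y) {
    while (x.size() > y.size()) {      // components of X that are going away
        finalize(x.back());
        x.pop_back();
    }
    while (x.size() < y.size()) {      // new components of X
        x.emplace_back();
        initialize(x.back());
    }
    for (std::size_t i = 0; i < y.size(); ++i)
        assign(x[i], y[i]);            // component-wise assignment
}
```

Even in this toy version, the sequence of intermediate states is visible: X passes through every length between its old and new sizes, which is exactly what makes capturing the process with a static PC-map so painful.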
-Tuck

-------------

!topic LSN on Finalization in Ada 9X
!from 1992-10-26 22:00:50 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

> However, I have not seen one that is better than the current MRT
> proposal, and it can be improved by adding user-defined assignment and
> removing the weird function rules (eliminating the real problems of
> limitedness), without needing additional changes.

For me, the real problem of limitedness is that it takes away a lot of language features from all types that have a limited type in their "closure", and unless this was designed to happen right from the start, it is very disruptive. So, adding user-defined assignment should take away the requirement for limitedness to get finalization.

> Access types are a kind of scalar type...However,
> finalization should be defined only on 'composite' types

I agree; I meant restricted to non-elementary types (in fact 1165.a says record types without discriminants, just to set a marker for acceptability of restrictions).

> > However, as we have finalization capability for the access types...
>
> No, we don't.

I agree we don't have this capability directly. What I meant was that if we encapsulate an access type in a (limited) private type, then we have finalization on the encapsulated access type indirectly, assuming we don't spill memory internally in the package. Therefore, we may want to "optimize" the heap of that access type to do no global heap finalization.

A library-level access type is a cleaner example. Any overhead in by-heap finalization is a waste of resources, as the heap will exist as long as the environment task, and memory resources are normally released automatically when that task terminates. Here we may have another reason to avoid the 2 heap-related pointers.
> I assume in my analysis that finalization will be commonly used.

A bit dangerous; I'm afraid that with the penalty of limitedness, it may not be used in many cases where it should.

> 1) It assumes that compilers are prepared to handle hundreds or even
> thousands of exception handlers in a single subprogram.

I hope not. The typical use of finalization(/assignment) would be protecting the access types that one needs on so many occasions to get rid of the limitations on unconstrained, limited or class-wide types. For a list of declarations of such encapsulated access types, e.g.

    declare
        PTR1 : T;
        PTR2 : T := F;
        PTR3 : T := PTR2;
    begin
        CODE_AFTER_ELABORATION;
        ...
    end;

we could have code generated:

    declare
        PTR1 : T;
        PTR2 : T;
        PTR3 : T;
    begin
        T'INITIALIZE ( PTR1 );
        T'INITIALIZE ( PTR2 );
        T'INITIALIZE ( PTR3 );
        -- Now all may be finalized, logically initialized or not
        F ( PTR2'ACCESS );
        T'ASSIGN ( PTR3'ACCESS, PTR2'ACCESS );
        -- Now logically elaborated
        CODE_AFTER_ELABORATION;
    exception
        when others =>
            T'FINALIZE ( PTR1 );
            T'FINALIZE ( PTR2 );
            T'FINALIZE ( PTR3 );
            raise;
    end;

This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early, or that wouldn't have happened if we used a T'COPY approach (where T'COPY assigns to uninitialized memory, as opposed to T'ASSIGN, which always assigns to initialized memory).

Yes, if F raises an exception then we have one T'INITIALIZE and one T'FINALIZE too many, on the PTR3 that should never have been elaborated. Is that a real problem? From a user's point of view I think not. Calling T'INITIALIZE prior to any explicit *or implicit* assignment is just a matter of defining what T'INITIALIZE means. Moving the invocations around is in the same spirit as 11.6.

> I would expect that many compilers have upper limits on the number of
> handlers that can occur in a subprogram.

Yes. As a result 9X compilers may have upper limits on the use of finalization. I hope we can live with this.
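The single-handler transformation sketched above can be checked with a small executable model. This is an illustrative C++ rendering only (all names invented): every controlled object is initialized up front, so that one handler can finalize them all regardless of how far elaboration actually got before an exception was raised.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records each T'INITIALIZE/T'FINALIZE so the single-handler invariant
// (every init matched by exactly one final) can be observed.
static std::vector<std::string> evs;

struct T { bool set = false; };
void t_initialize(T& t) { evs.push_back("init"); t.set = false; }
void t_finalize(T&)     { evs.push_back("final"); }
void t_assign(T& a, const T& b) { a.set = b.set; }

void f(T& out) {                     // plays the role of F; raises mid-way
    out.set = true;
    throw std::runtime_error("F failed");
}

void elaborate_block() {
    T p1, p2, p3;
    t_initialize(p1);
    t_initialize(p2);
    t_initialize(p3);
    // Now all may be finalized, logically initialized or not.
    try {
        f(p2);                       // raises before PTR3 is elaborated
        t_assign(p3, p2);
        // CODE_AFTER_ELABORATION would go here
        t_finalize(p1); t_finalize(p2); t_finalize(p3);
    } catch (...) {
        // ONE handler covers the whole declarative part.
        t_finalize(p1); t_finalize(p2); t_finalize(p3);
        throw;                       // re-raise
    }
}
```

The extra init/final pair on the never-elaborated third object is visible in the trace, which is exactly the "too many" pair the text above argues is harmless from the user's point of view.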
I agree compilers don't always live up to users' expectations. But if we removed all language features where some compilers have limitations, we would end up with a rather small language. Count on programmers being used to splitting a subprogram into smaller pieces to work around compiler limits (or QA code-size limits). By the way, I checked this limit with Alsys 5.5.0 on an HP9000-400, and there it is around 550. I expect this to be sufficient.

> 2) It assumes that a PC-map implementation of exceptions is used.

If I can make the transformation above by hand, and express the results in semi-pure Ada (simply replace T' by T_ and it is pure Ada), then clearly the only requirement is an Ada 83 compiler that works correctly, no matter how it is implemented.

> 3) It requires the compiler to expand all initialization and finalization
> actions inline.

As illustrated above, I wouldn't mind having any finalizable object or component get INITIALIZED before the enclosing object gets (logically) used. Alternatively, you may want to create T'ELABORATE procedures that elaborate any object of type T, calling INITIALIZE in the absence of default initial values, T'COPY where present, and 1 exception handler per index (for arrays) or component (for records).

> This prevents a compiler from using 'thunks' to do these actions.

Sorry, I don't know what 'thunks' are. Again, if we could model the finalization support in Ada (or close to it), I don't think we have to mind too much about implementation detail.

> 4) The net memory usage from the implementation suggested in 1630
> is much higher than the MRT's proposal.

Again, I think the typical use is with access types. The memory used by exception handlers is a fixed price to pay; the memory consumed by dynamically allocated objects is proportional to their quantity. The systems I know that use significant amounts of memory do this primarily on the heap, and that is therefore the determining factor for memory usage.
Compare to Smalltalk and Lisp applications, which use almost
exclusively dynamic memory.

> In saving 2 pointers per object (or component)...

I think the typical case is not far from the worst case (finalizable
encapsulated access types that designate small composite types, which
again primarily contain those encapsulated access types).

The worst case is the generic finalization using SYSTEM.CONTROLLED
(92-1334.a) against the non-limited finalization on an encapsulated
library-level access type, i.e. an 8:1 overhead per dynamic allocation
(if you agree with the suppressed by-heap links in this case).

> If the proposal also allows a PC-map scheme to be used, that's
> fine by me.

For me too, if you find a way to get rid of the limitedness
restriction.

What I'm most concerned about is the non-uniformity.  Putting it
extremely, Ada 9X would be fine if *all* private and composite types
were limited, such that all these nice compiler features couldn't
tempt anyone to forget about finalization and assignment control when
he really shouldn't, or to declare a generic formal non-limited type
that would pose a barrier to a change to limited types elsewhere.

> >Yes, this requires the ability to insert frames in expression
> >contexts.
>
> I certainly hope that this does not become necessary.
> Allowing compilers to do this is one thing, but forcing them to do it
> is another.  It is likely to have far-reaching effects both on the
> optimizers

OK, it may well be asking for too much.

I think the approach illustrated above, where a compiler can use
T'INITIALIZE early, would solve this problem.  What if you add the
(for me quite acceptable) restriction that exceptions raised from
T'INITIALIZE may cause other only-initialized objects not to become
finalized?
In other words : *don't* raise exceptions from T'INITIALIZE (as in
"*don't* raise exceptions in protected type guards or interrupt
handlers...").

(Again I think of the typical case of access values that are set to
null, and the general case that can be dealt with by a boolean
component marking the object as not logically initialized.)

> >A := B;
> >
> >becomes T'ASSIGN ( A, B ) -- Both passed by reference.
>
> This is a good idea, because it also helps clean up the function return
> case.

Thank you.

> However, you also have to consider the case of assignments used in
> initialization expressions.  These require somewhat different
> semantics (since the left-hand side may not have been initialized
> yet).

How about doing T'INITIALIZE early and T'ASSIGN later (as above)?
Considering the inevitable overhead of FINALIZATION, putting a null or
FALSE value in an object can hardly make the difference.

> I suspect the groundswell for user-defined assignment and better
> finalization is there.  And I, at least, find these capabilities far
> more important than OOP-style polymorphism.

I couldn't agree more.

> Randy.

Stef.

-------------

!topic LSN on Finalization in Ada 9X
!from Bob Duff 1992-10-27 09:21:58 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1697.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

> Sorry, I don't know what 'thunks' are.

Stef,

A "thunk" is a compiler-generated subprogram containing code written
by the user.  I think the term comes from Algol 60 call-by-name, where
if an actual parameter was, say, A[I], the compiler would wrap that up
in a function, and pass the function as the parameter, so the indexing
operation would get re-evaluated on each use.  That function was
called a "thunk" -- I don't know why.
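[Editorial illustration.]  The call-by-name behavior described above can
be modeled in Ada itself; in the following sketch (my own, not from the
discussion -- all names are invented) a generic formal function plays
the role of the compiler-generated thunk:

   --  Sketch: modeling an Algol-style call-by-name "thunk" in Ada.
   --  The formal function Thunk stands for the compiler-generated
   --  subprogram wrapping an actual parameter like A(I).
   generic
      with function Thunk return Integer;   -- re-evaluated on each use
   function Sum_By_Name (Times : Positive) return Integer;

   function Sum_By_Name (Times : Positive) return Integer is
      Total : Integer := 0;
   begin
      for Count in 1 .. Times loop
         Total := Total + Thunk;   -- each use re-runs the wrapped code
      end loop;
      return Total;
   end Sum_By_Name;

An instantiation whose Thunk body returns A(I) re-indexes A on every
use, which is exactly the re-evaluation behavior described above.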
Randy's compiler puts the record initialization code in a thunk.

- Bob

-------------

!topic LSN on Finalization in Ada 9X
!from 1992-10-27 16:01:27 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

Thank you for clarifying my misunderstandings.

> We have never proposed a generic, though we have shown how one could be
> implemented using System.Controlled.

I was referring to your comment 92-1334.a :

1334> If you will allow me, here is the implementation of this package
1334> in Ada 9X:
1334>
1334>     generic
1334>         type T is tagged limited private;
1334>         with procedure INITIALIZE (OBJECT : in out T);
1334>         with procedure FINALIZE (OBJECT : in out T);
1334>     package CONTROLLED is
1334>         type CONTROLLED_T is new T with private;
1334>     private
1334>         type MIX_IN(OBJ : access CONTROLLED_T) is
1334>             new SYSTEM.CONTROLLED with null;
1334>         procedure INITIALIZE(MIX : in out MIX_IN);
1334>         procedure FINALIZE (MIX : in out MIX_IN);
1334>
1334>         type CONTROLLED_T is new T with record
1334>             MIX : MIX_IN(CONTROLLED_T'ACCESS);
1334>         end record;
1334>     end CONTROLLED;

> Finalization is not the responsibility of storage pool managers.
> Finalization happens before the storage pool manager gets control,
> both on Unchecked_Deallocation, and at access-type scope exit.

Sorry about my false assumption.  That's a disappointment, though.  So
even when we know this overhead is never going to serve any useful
purpose (e.g. when we know that the pool will be empty when it is
finalized), we have to pay for it.  I'm afraid this is going to be a
very typical case for library-level access types encapsulated in a
controlled object.
I thought that the user-defined storage pools were added as a feature
in 9X precisely because the default pools always give you all the
functionality with all the overhead.  In this case you should shift a
*maximum* of this functionality into the user's hands; otherwise he
again has good reason to ask for a user-defined storage-pool finalizer
in 0X.

> My earlier response (92-1638.a) showed what the structure would look
> like, based on LSN-021.

Sorry Tucker, I got confused by your reply :

1638> What follows is what I would expect for a component
1638> declared as "Next : T",
                  ^^^^^^^^^

while I was trying to figure out the map of a *dynamic object* that
only contains that component, which I (wrongly) expected to be very
different.

> LSN-021 does not have the components of a given object linked
> together in their own chain.

When I read LSN-021 :

LSN021> Unchecked_Deallocation would finalize the subcomponents and then
LSN021> the object as a whole which is about to be deallocated.

I understood that Unchecked_Deallocation could do this without knowing
the type that is being finalized, by following a link that chains all
to-be-finalized components.  I was just assuming that if you created
these links between components (rather than between objects), you
really wanted to avoid having Unchecked_Deallocation depend on the
availability of type-descriptor information or generated composed
finalization procedures.

LSN021> When allocating a controlled object (A), or one which contains
LSN021> controlled subcomponents (B), the controlled objects would be
LSN021> added to the per-storage-pool finalization list (PSPFL).

So "the controlled objects" means "the controlled object" in case (A)
and "only the controlled subcomponents *without* the object itself" in
case (B).  As the paragraph starts with a clear distinction between
objects and subcomponents, I interpreted this as "the controlled
(dynamic) object, but not the subcomponents" in both cases.
To make absolutely sure that I correctly understand the implementation
model in LSN-021, considering only dynamic objects :

   A := new D;

becomes :

   1. Create a dummy header that forms a doubly linked list on its own.
   2. Allocate the memory.
   3. Use a (composed) initialization procedure that will visit each
      component and :
      3.a  call INITIALIZE for that component;
      3.b  insert the component in the doubly linked list.
   4. If successful, merge the doubly linked list around the dummy
      header into the heap's finalization list.
   5. When an exception is raised, finalize each component in the
      doubly linked list around the dummy header.

   FREE(A);

becomes :

   1. Use a (composed) finalization procedure that will visit each
      component and :
      1.a  call FINALIZE on that component;
      1.b  in case of an exception, raise a flag that marks this event;
      1.c  remove the component from the doubly linked list.
   2. Deallocate the memory.
   3. In case the flag is set, raise an exception.

So, both elaboration and Unchecked_Deallocation must be able to find
the possibly-to-be-finalized components, and the links are only
necessary in the case of finalizing half-elaborated objects.  And
compilers need to generate composed INITIALIZE and FINALIZE
procedures.  Then I think we pay a very heavy price for a very rare
event.  (Considering that most guidelines recommend using exceptions
to flag programming errors, rather than as another control-flow
primitive, I consider partial elaboration a rare event.)

> Your picture above has about twice as much overhead as is necessary.

You're right that the storage-pool overhead can be considered a
separate problem.  You're equally right that there will be only one
pair of links for the component and none for the object itself.  I was
looking for the worst case, i.e. the tagged type that is already
derived from some non-controlled root, such that the generic above has
to be used, such that you need the access discriminant too.
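[Editorial illustration.]  The dummy-header trick in the allocation
steps above can be sketched as a data structure; this is only my
reading of the model (the names Final_Cell, Final_Ptr and Merge are
invented, and a real cell would also need some way to reach the
component's FINALIZE operation):

   --  Sketch of the per-storage-pool finalization list (PSPFL) model.
   type Final_Cell;
   type Final_Ptr is access all Final_Cell;
   type Final_Cell is record
      Prev, Next : Final_Ptr;   --  doubly linked, for O(1) removal
   end record;

   --  Step 1 creates a dummy header that is a circular list on its
   --  own (Header.Next = Header.Prev = Header); step 4 then splices
   --  the cells gathered around it into the pool's own circular
   --  list, headed by Pool_Head, in constant time:
   procedure Merge (Header, Pool_Head : Final_Ptr) is
      First : constant Final_Ptr := Header.Next;
      Last  : constant Final_Ptr := Header.Prev;
   begin
      if First /= Header then            --  anything to merge?
         Last.Next           := Pool_Head.Next;
         Pool_Head.Next.Prev := Last;
         Pool_Head.Next      := First;
         First.Prev          := Pool_Head;
      end if;
   end Merge;

The sentinel headers are what make both the step-4 merge and the
step-1.c removal constant-time, with no special cases for an empty
list.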
> Now we see that for objects in a heap, we are talking about
> 2 words of overhead versus 3.

O.K.  The "functionally visible" damage however stays, i.e. the fact
that the links make the object self-referential, and hence limited and
difficult to move for function return.

> One of the great advantages of the linked list approach is
> that finalization can be done without doing a stack walkback.

I agree, but I just find the final user impact of self-referencing
without user-defined assignment too large a cost to justify this
advantage.

> Our current approach allows an implementation to bring the distributed
> overhead of finalization down to a negligible level, using the linked
> list and making a single pointer comparison to determine that no
> finalization needs to be done prior to performing an ATC/abort.

In the case of the exception-handler analogy you would need one jump
to the handler corresponding to the ATC construct (as there would be
no intermediate handlers).  Is that too costly?  If people find the
idea of creating a separate task for ATC acceptable, surely they would
find a single exception-handling operation acceptable too.

Finally, I thought this performance would no longer be crucial, as the
current approach suggests that the statements following the triggering
statement execute *before* the finalization of the abortable part.  So
whatever is more urgent than finalization can precede it.

> Again, it is very hard to assess implementation difficulty in the
> abstract.  It would be important to have real implementors commenting
> on this.

Yes, in 92-1680.a Randy Brukardt complains about this too.  In
92-1697.a I mentioned an alternative approach :

   procedure T'INITIALIZE ( OBJ : access T );

would become a procedure that the compiler can use at will to mark
uninitialized objects, components or temporaries as "initialized",
such that T'FINALIZE knows there is nothing to do and T'ASSIGN knows
it should act as T'COPY when the left-hand parameter is marked this
way.
So T'INITIALIZE would mark an object as logically nonexistent but
physically initialized.  In a typical case this would set an
access-type component to null, irrespective of further implicit or
explicit initialization.  Raising an exception from T'INITIALIZE would
result in some objects that were passed to T'INITIALIZE (but not to
anything else) not becoming finalized.

   procedure T'ASSIGN ( LEFT, RIGHT : access T );

Copies the value in RIGHT to LEFT, and can count on the fact that LEFT
and RIGHT were first passed to T'INITIALIZE, no matter what the
context of assignment is.

   procedure T'FINALIZE ( OBJ : access T );

Finalizes OBJ if logically existing, and can count on the fact that
OBJ was first passed to T'INITIALIZE, no matter what the context of
finalization is.

In this case T'INITIALIZE should be considered an extension of
reserving memory, rather than a replacement for implicit
initialization.  Implicit or explicit initialization would always
follow T'INITIALIZE.

> Here is one of the most costly and complex situations, namely one
> involving an unconstrained record object with finalizable components.

Yes, 92-1630.a didn't mean to forbid this example.

> type Dyn_Str is private;  -- Presume this has user-defined assignment
>                           -- and finalization.
> type Str_Array is array(Positive range <>) of Dyn_Str;
>
> subtype Arr_Length is Integer range 0..30;
> type Str_Seq(Len : Arr_Length := 0) is record
>    Arr : Str_Array(1..Len);
> end record;

With the approach above, I would generate procedures :

   procedure Str_Array'Initialize ( Obj : access Str_Array ) is
   begin
      for I in Obj'Range loop
         Dyn_Str'Initialize ( Obj(I) );
      end loop;
   end;

   procedure Str_Array'Finalize ( Obj : access Str_Array ) is
   begin
      for I in Obj'Range loop
         Dyn_Str'Finalize ( Obj(I) );
      end loop;
   end;

But these have to be generated in any case, for use by the
Unchecked_Deallocation of dynamic objects with components of type
Str_Array.
   procedure Str_Array'Assign ( Left, Right : access Str_Array ) is
   begin
      if Left /= Right then  -- If not the same object
         if Left'Length /= Right'Length then
            raise Constraint_Error;
         end if;
         if Left < Right then  -- Compiler can compare the addresses.
            -- End of Left may overlap begin of Right.
            for I in Right'Range loop
               Dyn_Str'Assign ( Left(I+Left'First-Right'First), Right(I) );
            end loop;
         else
            -- End of Right may overlap begin of Left.
            for I in reverse Right'Range loop
               Dyn_Str'Assign ( Left(I+Left'First-Right'First), Right(I) );
            end loop;
         end if;
      end if;
   end;

This composed assignment works correctly for overlapping slices of
Str_Array.  One could go further and also generalize the "atomic"
nature of predefined assignment to composed assignment, but that would
make things more complex than they need to be.  If we really want this
property, I think we must forget about user-defined assignment for
self-referencing stuff or other "un-movable" types.

For Str_Seq we generate such procedures too :

   procedure Str_Seq'Initialize ( Obj : access Str_Seq ) is
      -- Should be called after setting the discriminants
   begin
      Str_Array'Initialize ( Obj.Arr );
   end;

   procedure Str_Seq'Finalize ( Obj : access Str_Seq ) is
   begin
      Str_Array'Finalize ( Obj.Arr );
   end;

   procedure Str_Seq'Assign ( Left, Right : access Str_Seq ) is
   begin
      if Left /= Right then  -- If not the same object
         Str_Seq'Finalize ( Left );
         Left.Len := Right.Len;  -- Compiler can mutate by tag
         Str_Array'Initialize ( Left.Arr );
         -- Now Left is as a newly created object with Right's
         -- discriminants.
         Str_Array'Assign ( Left.Arr, Right.Arr );
         -- Assign all components for Right's discriminants.
      end if;
   end;

I think the straightforward implementation of Str_Seq'Assign is
simple, but it could also be implemented more efficiently using the
optimization you suggested :

   procedure Str_Seq'Assign ( Left, Right : access Str_Seq ) is
   begin
      if Left /= Right then  -- If not the same object
         Str_Array'Finalize   ( Left.Arr(Right.Len+1..Left.Len) );
         Str_Array'Initialize ( Left.Arr(Left.Len+1..Right.Len) );
         Left.Len := Right.Len;  -- Compiler can mutate by tag
         Str_Array'Assign ( Left.Arr, Right.Arr );
      end if;
   end;

> X, Y : Str_Seq;
> begin
>    . . .
>    X := Y;

becomes :

      X, Y : Str_Seq;
   begin
      Str_Seq'Initialize(X);
      Str_Seq'Initialize(Y);
      -- No finalization until here
      begin
         ...
         Str_Seq'Assign(X, Y);
      exception
         when others =>
            Str_Seq'Finalize (X);
            Str_Seq'Finalize (Y);
      end;
   end;

When comparing this technique with the linked components one would
otherwise have, it uses less memory and is faster when declaring
objects without explicit or implicit initialization, comparable in the
presence of initialization, and slower when elaboration gets
interrupted by an exception (as T'Finalize is called for components
that would not have been on the to-be-finalized lists).

> using a PC-map based approach while ensuring that all initialized
> components get finalized, and no others.

I think the above greatly simplifies the problem you and Randy care so
much about : that the PC-based approach needs to know the context with
a low granularity.

To conclude your example, I would like to invite you to also consider
the cost and risk of errors if each programmer had to implement your
example by hand, making the discriminant a regular component, and
adding a procedure ASSIGN and a function "=".

> ASSIGNMENT FOR LIMITED TYPES?
>
> Despite this doom and gloom about user-defined assignment for
> non-limited types, we have been thinking about some way to allow
> user-defined assignment for limited types.

I hope the above undooms it a little.
What's the point of paying this (big) price if you get close to
nothing in return?  Given the availability of assignment, the type
should no longer be limited; otherwise we still have 90% of the
problems left.

> Our current thoughts are gravitating toward the "CLONE" procedure we
> suggested a while ago, and we will try to produce an LSN on this and
> related user-defined assignment issues sometime this week.  The big
> advantage of "CLONE" vs. "ASSIGN" is that it works well when the
> left-hand side was uninitialized, which would be the case for
> initialization and function return.

Could this CLONE be used to remove limitedness too?  Is the overhead
of using T'INITIALIZE and T'ASSIGN as introduced above important?  The
typical case would be putting nulls in each controlled component,
while currently each such component must get a doubly-linked-list
insertion done.

> -Tuck

Stef.

-------------

!topic LSN on Finalization in Ada 9X
!from R.R. Software (Randy Brukardt) 1992-10-28 20:22:30 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1697.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

>> However, I have not seen one that is better than the current MRT
>> proposal, and it can be improved by adding user-defined assignment and
>> removing the weird function rules (eliminating the real problems of
>> limitedness), without needing additional changes.
>
>For me, the real problem of limitedness is that it takes away a lot of
>language features from all types that have a limited type in their
>"closure", and unless this was designed to happen right from the
>start, it is very disruptive.  So, adding user-defined assignment
>should take away the requirement for limitedness to get finalization.

My theory has been that many useful types in Ada 9X programs will be
limited.
Therefore, the sorts of situations that you are concerned about
(generic privates not allowing limited, assignments of limited, etc.)
will be much less common.

In addition, proper composition of operations (particularly assignment
and equality) would be a big help.  However, those open up the
semantic problems Tucker mentioned, so they probably will be left for
Ada 0X.  I don't want to get too greedy and end up with nothing.

>> Access types are a kind of scalar type...However,
>> finalization should be defined only on 'composite' types
>
>I agree, I meant restricted to non-elementary types (in fact 1165.a
>says record types without discriminants, just to set a marker for
>acceptability of restrictions).
>
>> >However, as we have finalization capability for the access types...
>
>> No, we don't.
>
>I agree we don't have this capability directly.  What I meant was that
>if we encapsulate an access type in a (limited) private type, then we
>have finalization on the encapsulated access type indirectly, assuming
>we don't spill memory internally in the package.  Therefore, we may
>want to "optimize" the heap of that access type to do no global heap
>finalization.
>
>A library-level access type is a cleaner example.  Any overhead in
>by-heap finalization is a waste of resources, as the heap will exist
>as long as the environment task, and memory resources are normally
>released automatically when that task terminates.  Here we may have
>another reason to avoid the 2 heap-related pointers.

This is true if finalization is only used to clean up memory
resources.  However, if it is used to release other resources (files,
locks, devices), it can be quite necessary for the finalization to
occur.  Consider a type which represents a tty port in UNIX.  If that
port has been set into raw mode, it had better get set back to cooked
mode before the OS regains control.  (UNIX does not reset device
drivers when a program ends.)
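[Editorial illustration.]  The tty example might look as follows under
the finalization scheme being discussed.  This is only a hedged
sketch: SYSTEM.CONTROLLED is the proposed abstract type of 92-1334.a,
and Set_Raw/Set_Cooked are hypothetical OS bindings, not real library
calls:

   with SYSTEM;   --  assuming the proposed SYSTEM.CONTROLLED

   package TTY_Ports is
      type Port is limited private;
      procedure Open_Raw (P : in out Port; Fd : in Integer);
   private
      type Port is new SYSTEM.CONTROLLED with record
         Fd : Integer := -1;           --  -1 = "null" state
      end record;
      procedure FINALIZE (P : in out Port);
   end TTY_Ports;

   package body TTY_Ports is

      procedure Set_Raw    (Fd : Integer) is separate;  -- hypothetical
      procedure Set_Cooked (Fd : Integer) is separate;  -- hypothetical

      procedure Open_Raw (P : in out Port; Fd : in Integer) is
      begin
         Set_Raw (Fd);
         P.Fd := Fd;                   --  now FINALIZE has work to do
      end Open_Raw;

      procedure FINALIZE (P : in out Port) is
      begin
         if P.Fd /= -1 then
            Set_Cooked (P.Fd);         --  runs even when an exception
         end if;                       --  propagates past the object
      end FINALIZE;

   end TTY_Ports;

The "null" state in Fd is what lets FINALIZE be callable on a port
that was declared but never opened.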
Since there can be many tty devices used in a program, using a
finalized object accessed through an access type is quite reasonable.

>> I assume in my analysis that finalization will be commonly used.
>
>A bit dangerous, I'm afraid that with the limited penalty, it may not
>be used in many cases where it should.

If finalization is almost never used, then A) we wasted our time on it
in the first place, and B) the performance of the technique used is
irrelevant.

>> 1) It assumes that compilers are prepared to handle hundreds or even
>> thousands of exception handlers in a single subprogram.
>
>I hope not.
>
>The typical use of finalization(/assignment) would be protecting the
>access types that one needs on so many occasions to get around the
>limitations on unconstrained, limited or class-wide types.

That is a very narrow view of the use of finalization.  I intend to
use it on just about any resource that needs to be freed:

1) On Text_IO.File_Type, so files get closed when their objects go
   away.  (This would eliminate some garbage in our runtime which
   ensures this gets done before the program terminates.)

2) On JWindows.Window, so that windows whose objects cease to exist
   also cease to be displayed on the screen, and so the data
   structures they use can be recovered.

3) On SLock.File_Locks, so that an exception being raised in an
   unanticipated place does not leave a file locked forever, and to
   eliminate many of the exception handlers currently inserted
   specifically to ensure that files get unlocked but do nothing else.

[All of these things already happen to be limited, because predefined
assignment is inappropriate.]

Any of these things could occur in an array.  Since each finalizable
component has to be initialized separately, that can lead to a
statically unknown set of handlers being required.

>For a list of declarations of such encapsulated access types, e.g.
>
>   declare
>      PTR1 : T;
>      PTR2 : T := F;
>      PTR3 : T := PTR2;
>   begin
>      CODE_AFTER_ELABORATION;
>      ...
>   end;
>
>we could have code generated :
>
>   declare
>      PTR1 : T;
>      PTR2 : T;
>      PTR3 : T;
>   begin
>      T'INITIALIZE ( PTR1 );
>      T'INITIALIZE ( PTR2 );
>      T'INITIALIZE ( PTR3 );
>      -- Now all may be finalized, logically initialized or not
>
>      F ( PTR2'ACCESS );
>      T'ASSIGN ( PTR3'ACCESS, PTR2'ACCESS );
>      -- Now logically elaborated
>
>      CODE_AFTER_ELABORATION;
>
>   exception
>      when others =>
>         T'FINALIZE ( PTR1 );
>         T'FINALIZE ( PTR2 );
>         T'FINALIZE ( PTR3 );
>         raise;
>   end;

This code is wrong.  To see why, consider the possibility that the
second call to T'Initialize may have raised Storage_Error, or that the
task containing this block is aborted at that point.  (Aborts can be
thought of as a special kind of exception.)  In that case, only PTR1
can be finalized.  You also forgot to finalize them in the normal
case.  Once these are corrected, your rearrangement no longer works.
Thus the code must be :

   declare
      PTR1 : T;
      PTR2 : T;
      PTR3 : T;
   begin
      T'INITIALIZE ( PTR1 );
      begin
         T'INITIALIZE ( PTR2 );
         F ( PTR2'ACCESS );
         begin
            T'INITIALIZE ( PTR3 );
            T'ASSIGN ( PTR3'ACCESS, PTR2'ACCESS );
            CODE_AFTER_ELABORATION;
            T'FINALIZE ( PTR3 );
         exception
            when others =>
               T'FINALIZE ( PTR3 );
               raise;
         end;
         T'FINALIZE ( PTR2 );
      exception
         when others =>
            T'FINALIZE ( PTR2 );
            raise;
      end;
      T'FINALIZE ( PTR1 );
   exception
      when others =>
         T'FINALIZE ( PTR1 );
         raise;
   end;

>This does of course rely on nobody being shocked by a T'INITIALIZE
>call that happens early, or that wouldn't have happened if we used a
>T'COPY approach (where T'COPY assigns to uninitialized memory as
>opposed to T'ASSIGN that always assigns to initialized memory).
>
>Yes, if the call to F raises an exception then we have one
>T'INITIALIZE and one T'FINALIZE too many on the PTR3 that should
>never have been elaborated.  Is that a real problem?  From a user's
>point of view I think not.

Probably true.  However, the ACVC writers will think otherwise, and
some programs may care about the difference.
And in any event, the real problem is T'INITIALIZE raising an
exception (or being aborted).

>Calling T'INITIALIZE prior to any explicit *or implicit* assignment is
>just a matter of defining what T'INITIALIZE means.  Moving the
>invocations around is in the same spirit as 11.6.

T'INITIALIZE is a user-defined procedure.  It therefore can do
anything, and the language must be prepared to deal with that.

>> I would expect that many compilers have upper limits on the number of
>> handlers that can occur in a subprogram.
>
>Yes.  As a result 9X compilers may have upper limits on the use of
>finalization.  I hope we can live with this.
>
>I agree compilers don't always live up to users' expectations.  But
>if we had to remove all language features where some compilers have
>limitations, we would end up with a rather small language.  Count on
>programmers being used to splitting a subprogram into smaller pieces
>to work around compiler limits (or QA code-size limits).
>
>By the way, I checked this limit on Alsys 5.5.0 on an HP9000-400, and
>there it is around 550.  I expect this to be sufficient.

I don't; if it is, we've failed to define finalization usefully.

>> 2) It assumes that a PC-map implementation of exceptions is used.
>
>If I can make the transformation above by hand, and express the
>results in semi-pure Ada (simply replace T' by T_ and it is pure
>Ada), then clearly the only requirement is an Ada 83 compiler that
>works correctly, no matter how it is implemented.

If the resulting code is too large to run on the target, the fact that
the compiler can compile it is irrelevant.  That would be a common
problem on 16-bit targets if PC-mapping is not used.

>> 3) It requires the compiler to expand all initialization and finalization
>> actions inline.
>
>As illustrated above, I wouldn't mind having any finalizable object
>or component get INITIALIZED before the enclosing object gets
>(logically) used.

If this worked correctly, I would not have a problem.
However, it doesn't: if T'INITIALIZE for a component raises an
exception, only some of the components must get finalized.  This
cannot be done in a thunk [a compiler-generated subroutine], because
some exception handlers would have to be entered, but not left, during
the subroutine.

>Alternatively, you may want to create T'ELABORATE procedures that
>elaborate any object of type T, calling INITIALIZE in the absence of
>default initial values, T'COPY where present, and one exception
>handler per index (for arrays) or component (for records).

Since you don't know how many indexes there will be, such an expansion
is hard.  Particularly because the new 11.6 will probably not define
the values of variables written within handled code (while Ada 83
does), so using a temporary variable to hold the index cannot be done
(the optimizer is allowed to eliminate it).  THAT can be worked around
by using different rules for compiler-generated code, but that makes
writing the optimizer quite a bit harder.

>> This prevents a compiler from using 'thunks' to do these actions.
>
>Sorry, I don't know what 'thunks' are.  Again, if we could model the
>finalization support in Ada (or close to it), I don't think we have
>to worry too much about implementation details.

(See above for thunks.)  I don't think we can quite model it in Ada,
at least not correct Ada 9X.  And the model would be very expensive in
common cases.

>> 4) The net memory usage from the implementation suggested in 1630
>> is much higher than the MRT's proposal.
>
>Again, I think the typical use is with access types.  The memory used
>by exception handlers is a fixed price to pay; the memory consumed by
>dynamically allocated objects is proportional to their quantity.

I understand where you're coming from, but I don't agree.  First, I
think most examples will use fixed finalizable objects (like
File_Type, Window, etc.).  Even most 'access types' will do that --
consider that all of your examples have done so.
Beyond that, your scheme potentially adds a lot of code overhead to a
system -- probably 10 times as much as the MRT scheme.  That means you
don't break even until you have 10 times more finalized objects in the
heap than you have finalization places.  I think that will be rare in
practice, particularly in embedded systems (where heap use is often
forbidden).

>The systems I know that use significant amounts of memory do so
>primarily on the heap, and that is therefore the determining factor
>for memory usage.  Compare to Smalltalk and Lisp applications, which
>use almost exclusively dynamic memory.

>> In saving 2 pointers per object (or component)...
>
>I think the typical case is not far from the worst case (finalizable
>encapsulated access types that designate small composite types which
>again primarily contain those encapsulated access types).
>
>The worst case is the generic finalization using SYSTEM.CONTROLLED
>(92-1334.a) against the non-limited finalization on an encapsulated
>library-level access type, i.e. an 8:1 overhead per dynamic
>allocation (if you agree with the suppressed by-heap links in this
>case).

I understand the worst case here.  However, that case will never
happen in practice.  Why?  Because the overhead of very small heap
allocations (and the fragmentation potential) will eat you alive in
nearly any real situation.  That has nothing to do with finalization,
just the fact that typical heap allocations have some overhead with
each allocated block.

The true worst case is probably under 50%, and for realistic data
types like my File_Type and Window examples, it is under 10%.  On a
hosted system where extensive heap use is allowed, that is unlikely to
be enough to matter.

>> If the proposal also allows a PC-map scheme to be used, that's
>> fine by me.
>
>For me too, if you find a way to get rid of the limitedness
>restriction.
>
>What I'm most concerned about is the non-uniformity.
>Putting it
>extremely, Ada 9X would be fine if *all* private and composite types
>were limited, such that all these nice compiler features couldn't
>tempt anyone to forget about finalization and assignment control when
>he really shouldn't, or to declare a generic formal non-limited type
>that would pose a barrier to a change to limited types elsewhere.

As I said previously, I expect that many users *WILL* treat Ada 9X
like that.  That is especially true since "all those nice compiler
features" really boil down to just composition of equality and
assignment.  Assuming user-defined assignment is available, there will
be no other significant difference (initializations and aggregates
would be available where appropriate, based in some way on the
user-defined assignment).

>> >Yes, this requires the ability to insert frames in expression
>> >contexts.
>>
>> I certainly hope that this does not become necessary.
>> Allowing compilers to do this is one thing, but forcing them to do it
>> is another.  It is likely to have far-reaching effects both on the
>> optimizers
>
>OK, it may well be asking for too much.
>
>I think the approach illustrated above, where a compiler can use
>T'INITIALIZE early, would solve this problem.  What if you add the
>(for me quite acceptable) restriction that exceptions raised from
>T'INITIALIZE may cause other only-initialized objects not to become
>finalized?  In other words : *don't* raise exceptions from
>T'INITIALIZE (as in "*don't* raise exceptions in protected type
>guards or interrupt handlers...").

I think this is a fine idea, but I'm not convinced it works when a
type is sufficiently complicated.  For instance, it is necessary to
evaluate the value of a discriminant before you can know what
components to initialize.  You also would have to define what happens
when T'Initialize is aborted.
>(Again I think of the typical case of access values that are set to
>null, and the general case that can be dealt with by a boolean
>component marking the object as not logically initialized).
>
>> >A := B;
>> >
>> >becomes T'ASSIGN ( A, B ) -- Both passed by reference.
>>
>> This is a good idea, because it also helps clean up the function return
>> case.
>
>Thank you.
>
>> However, you also have to consider the case of assignments used in
>> initialization expressions. These require somewhat different
>> semantics (since the left-hand side may not have been initialized
>> yet).
>
>How about doing T'INITIALIZE early and T'ASSIGN later (as above).
>Considering the inevitable overhead of FINALIZATION, putting a null or
>FALSE value in an object can hardly make the difference.

I tend to agree here. However, I vaguely recall some problem with this idea. Tucker?

>> I suspect the groundswell for user-defined assignment and better
>> finalization is there. And I, at least, find these capabilities far
>> more important than OOP-style polymorphism.
>
>I couldn't agree more.
>
>> Randy.
>
>Stef.

Randy.

-------------
!topic LSN on Finalization in Ada 9X
!from 1992-10-30 22:00:55 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1594.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference 92-1711.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

<>

!discussion

....

> This code is wrong.

I fear most of your reply depends on a misunderstanding I generated. The code you call wrong was based on :

> >How about doing T'INITIALIZE early and T'ASSIGN later (as above).
>
> I tend to agree here.

I tried to convey that idea right after the example by "This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early."
I should have been much more explicit about re-using T'INITIALIZE for different semantics. Sorry. To be absolutely sure we don't have that misunderstanding again :

SUGGESTION FOR PHASED FINALIZATION :
------------------------------------

I would propose to approach the finalization problem in two phases, to give a lot of slack space for compilers that would have difficulty with the fine granularity of the finalization proposed in 92-1630.a.

T'INITIALIZE would be a low-level initialization, completely unrelated to any explicit or implicit initialization. The programmer is informed that the compiler can call T'INITIALIZE without calling T'FINALIZE! Therefore, T'INITIALIZE implementations that require finalization are as "wrong" as forgetting about finalization. The call T'INITIALIZE(OBJ) will guarantee that OBJ will be set to some "null" value, such that it can later be passed to T'ASSIGN and/or T'FINALIZE.

A T'COPY operation (supposedly something used to assign to uninitialized objects) is always replaced by a T'INITIALIZE/T'ASSIGN combination. This is of course less efficient, but it gives the compiler freedom to group elaboration operations together, by first calling T'INITIALIZE on all stuff that needs finalization, and subsequently calling T'ASSIGN on all stuff that needs finalization and (normal, high-level) initialization.

The first phase happens as if all components that need finalization also had implicit initialization, but rather than calling some "thunk", you call T'INITIALIZE. Therefore, you can't possibly be wondering whether you already have your discriminants set, whether access discriminants will be set, how many components there are in arrays, whether implicit heap would be used, etc... all this is known at that time (as a direct consequence of the model). The only thing special about this first phase is that it doesn't need any finalization (by definition), but permits finalizing the completely elaborated objects/subcomponents (by definition).
Both assumptions are the programmer's responsibility. In practice it only means he must be smart enough to set a flag. Of course the size of these groups is limited by dependencies between declarations that need finalization, but that shouldn't be a serious problem, as a large number of such dependencies is very rare.

EXAMPLE :
---------

declare
   OBJ : T := V1;
begin
   ...
   OBJ := V2;
end;

becomes :

declare
   OBJ : T;
begin
   T'INITIALIZE ( OBJ );
   -- Unrelated to the initialization; should be thought of as associated
   -- with reserving the memory for OBJ.
   -- It would typically set OBJ to null.
   -- If this raises an exception nothing is done.
   begin
      -- OBJ will get finalized from now on.
      T'ASSIGN ( OBJ, V1 );  -- This implements the initialization.
      ...
      T'ASSIGN ( OBJ, V2 );  -- This implements the assignment.
      T'FINALIZE ( OBJ );    -- Normal exit.
   exception
      when others =>
         T'FINALIZE ( OBJ );
         raise;  -- Abnormal exit.
   end;
end;

T could be implemented as :

package SIMPLE is
   type T is private;
   ...
private
   type DESIGNATED is ...
   -- self-referential, inherently limited, class-wide and/or
   -- unconstrained, task or protected type, whatever ...
   type PTR_T is access DESIGNATED;
   type T is record
      PTR : PTR_T;
   end record;
end;

procedure T'INITIALIZE ( OBJ : out T ) is
   -- The only one called with an uninitialized actual.
begin
   OBJ.PTR := null;
end;

procedure T'FINALIZE ( OBJ : in out T ) is
   procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, PTR_T );
begin
   FREE ( OBJ.PTR );
end;

procedure T'ASSIGN ( LHS, RHS : in out T ) is
begin
   T'FINALIZE ( LHS );
   LHS.PTR := new DESIGNATED'( RHS.PTR.all );  -- deep copy
end;

SOME ALTERNATIVES :
-------------------

An almost equivalent approach (suggested by Bob Duff as a second, less uniform choice) would be to state that for these "user controlled" types, implicit initialization always happens (first), while any explicit initialization happens later (by calling T'ASSIGN).
Bob gave this as an alternative to T'COPY, but now that (IMHO) the two-phased approach seems to bring great relief to code generation, it is perhaps more attractive than the T'INITIALIZE call. In the example above, you could simply discard T'INITIALIZE as access values get null by default. In case some compilers are intelligent about implicit initializations, this alternative may be cheaper.

Of course, one can feel uneasy about calls to user-defined subprograms that are not easy to predict. As long as T'INITIALIZE simply sets a null value, this can't be a concern. People who define T'INITIALIZEs that do a TEXT_IO.GET_LINE are looking for non-deterministic behavior and will get what they wanted. If this uneasy feeling is yet too strong, one may dare to consider yet another approach : a compiler would be required to "zero" the memory allocated for objects that need finalization. This would give the programmer a low-level, non-portable ability to make T'FINALIZE and T'ASSIGN work with less visible initialization.

REPLY TO 92-1711.a :
--------------------

The rest of the comment is a detailed reply to Randy (92-1711.a).

> This is true if finalization is only used to clean up memory resources.

Remember that the issue was about user-defined storage allocators (92-1697.a). Those that use user-controlled features are willing to take responsibility for the correct behavior. In the first case the programmer must guarantee finalization of all heap objects. In the second case the programmer must guarantee that finalization is not needed for anything but the memory. The essential factor is that I don't expect anything from the compiler. I just assume that user-defined pools are meant for programmers that are willing to take responsibility for correctness in return for more efficiency. I see some responsibility they could take, and how they could get efficiency in return, and note how the current plans refuse them that trade.
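[Editorial aside on the "zero the memory" alternative mentioned above: it works precisely when a type's "null" state is the all-zero-bits state, so that zero-filled storage already satisfies the invariant that T'FINALIZE and T'ASSIGN rely on. A minimal C++ sketch follows; RawHandle, finalize, and assign are invented names, and the sketch additionally assumes a platform where a null pointer is represented as all-zero bits (true on common platforms, not guaranteed by the language) - exactly the kind of low-level, non-portable bargain the text describes.]

```cpp
#include <cstdlib>

// 'RawHandle' is a hypothetical reference-style type whose "null"
// state is all bits zero, so calloc-style zeroed storage is already
// safe to finalize or assign without any visible initialization call.
struct RawHandle {
    int* ptr;  // null state == all-zero bits (platform assumption)
};

inline void finalize(RawHandle& h) {  // safe on the zeroed state:
    std::free(h.ptr);                 // free(nullptr) is a no-op
    h.ptr = nullptr;
}

inline void assign(RawHandle& h, int v) {  // finalize-then-copy, as in T'ASSIGN
    finalize(h);
    h.ptr = static_cast<int*>(std::malloc(sizeof(int)));  // sketch: no OOM check
    *h.ptr = v;
}
```

Under these assumptions a block obtained from `calloc` can be treated as a valid, finalizable RawHandle immediately, with no per-object initialization call at all.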
> If finalization is almost never used, we A) wasted our time on it in
> the first place, and B) the performance of the technique used is
> irrelevant.

If finalization is almost never used, and almost always *needed*, Ada9X has a problem, as the competition can validly claim they don't have this problem. I'm convinced that the current Ada83-C++ balance is to a significant extent due to the attention paid to OO in both languages. I think history may well repeat itself by paying insufficient attention to assignment/finalization control.

> That is a very narrow view of the use of finalization. I intend to
> use it on just about any resource that needs to be freed:
> 1) On Text_IO.File_Type
> 2) On JWindows.Window
> 3) On SLock.File_Locks

Yes, that's exactly what I mean. Text_IO.File_Type is typically implemented using a kind of reference (see 92-1594.a and follow-up). As the number of windows in a decent system is dynamic, I guess you need to use an access type there as well. My point is that if you can finalize an access type, you can finalize anything you want, as there is no longer a penalty in using access types.

> You also forgot to finalize them in the normal case.

Sorry. I was really focusing on the problem of exceptions. I assume this oversight is not a real problem.

> Once these are corrected, your rearrangement no longer works:

No, no. As above, I'm trying to illustrate how we could allow a compiler to do without this nesting by asking a higher price from the user (i.e. that he should be capable of setting a flag without requiring finalization; frankly, I think he'll manage).

> Probably true. However, the ACVC writers will think otherwise

ACVC programs will have to obey the rules like anyone else, no? If the language says you have to expect the compiler to generate calls to T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect that too.

> T'INITIALIZE is a user-defined procedure.
> It therefore can do anything,
> and the language must be prepared to deal with that.

How is the language prepared to deal with a T'WRITE calling (unix) _exit? Or what if it defines an uninitialized local and uses it? Or what about :

procedure T'WRITE ...
is
   SPILL_SOME : constant ACCESS_TYPE := new DESIGNATED_TYPE;
   LOCK_ONE   : TEXT_IO.FILE_TYPE;
begin
   OPEN ( LOCK_ONE, OUT_FILE, NAME => "" );
   GLOBAL_SEMA.LOCK;
end;

It is the same case as a T'INITIALIZE that requires finalization : it doesn't get any and needs some. The language can deal with this in only one way : declare it erroneous or a bounded error, or simply not mention it and rely on common sense not to do the above (as it now does for SPILL_SOME above). That's exactly how I would like to deal with the case of a T'INITIALIZE that requires finalization too. There are hard limits to the protection you can offer by a language. If the programmer wants to make a program fail, he can, in any event, and he can make it virtually as "hard" as he likes, using user-controlled stuff (e.g. UNCHECKED_..., pragma INTERFACE, etc).

> >By the way, I checked this limit on Alsys5.5.0 on HP9000-400, and
> >there it is around 550. I expect this to be sufficient.
>
> I don't; if it is, we've failed to define finalization usefully.

If the approach above is acceptable, then this shouldn't be a problem though...

> If the resulting code is too large to run on the target, the fact that
> the compiler can compile it is irrelevant. That would be a common
> problem on 16-bit targets if PC-mapping is not used.

I assume these comments are still expecting that the nested exception handlers are required. In any event, those targets will always have inherent limitations to what an application can do. Those that want to use them will have to live with these limitations (and hence not declare too many exception handlers either). This really is a separate issue.
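[Editorial aside: the exchange that follows turns on a specific difficulty - when elaboration of an array fails partway through, only the already-initialized prefix of components must be finalized before the exception propagates. In C++ terms (the thread's comparison language) this is the classic "destroy the constructed prefix" loop, sketched below. Elem, fail_at, and elaborate are invented names; this illustrates the general rollback technique, not any specific Ada 9X proposal.]

```cpp
#include <new>
#include <stdexcept>

// A component type whose construction can fail, with counters so the
// rollback behavior is observable. 'fail_at' is a test knob: the index
// at which construction throws (-1 means never).
struct Elem {
    static int live;     // number of currently-elaborated elements
    static int fail_at;
    explicit Elem(int i) {
        if (i == fail_at) throw std::runtime_error("elaboration failed");
        ++live;
    }
    ~Elem() { --live; }
};
int Elem::live = 0;
int Elem::fail_at = -1;

// Elaborate n elements into raw storage; on failure, finalize exactly
// the prefix that was successfully elaborated, then re-raise.
void elaborate(Elem* raw, int n) {
    int i = 0;
    try {
        for (; i < n; ++i) new (&raw[i]) Elem(i);
    } catch (...) {
        for (int j = i - 1; j >= 0; --j) raw[j].~Elem();  // prefix only
        throw;
    }
}
```

The point of contention in the thread is whether this catch-and-rollback structure can live in a compiler-generated thunk or must be open-coded at each elaboration site.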
> >As illustrated above, I wouldn't mind getting any finalizable object
> >or component get INITIALIZED before the enclosing object gets
> >(logically) used.
>
> If this worked correctly, I would not have a problem. However, it
> doesn't (if T'INITIALIZE for a component raises an exception, only
> some components must get finalized). This cannot be done in a thunk [a
> compiler-generated subroutine], because some exception handlers would
> have to be entered, but not left, during the subroutine.

Sorry, I don't see this point. If a compiler wanted to avoid any unnecessary call to T'INITIALIZE, it could generate elaboration "thunks" with the nested exception handlers (assuming thunks can contain handlers). By the way, thanks all of you for sending me all those "thunk" clarifications. Could you give an example of something that can't be done?

> Since you don't know how many indexes there will be

As I said above, T'INITIALIZE is called after memory has been reserved. Surely at that time you know how many indexes there are.

> Particularly because the new 11.6 will probably not define the values
> of variables written within handled code (while Ada 83 does), so using
> a temporary variable to hold the index cannot be done

Could you clarify how that would invalidate the example in 92-1630.a :

procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is
begin
   UNTIL_EXCEPTION :
   for I in OBJ'RANGE loop
      begin
         COMP_T'ELABORATE ( OBJ(I) );
      exception
         when others =>
            -- Surely I can count on I being valid here ???
            if I /= OBJ'FIRST then
               for J in OBJ'FIRST .. INDEX_T'PRED(I) loop
                  COMP_T'FINALIZE ( OBJ(J) );
               end loop;
            end if;
            raise;
      end;
   end loop UNTIL_EXCEPTION;
end ARRAY_T'ELABORATE;

> I understand where you're coming from, but I don't agree. First, I
> think most examples will use fixed finalizable objects (like
> File_Type, Window, etc.) Even most 'access types' will do that --
> consider that all of your examples have done so.
I'm not talking about fixed size, I'm talking about not allocating a fixed amount. E.g. Windows are typically dynamically allocated. A window can have a (linked) list of Subwindows. Per-object overhead counts for each subwindow in the list; source overhead can count only once.

> Beyond that, your scheme potentially adds a lot of code overhead to a
> system - probably 10 times as much as the MRT scheme. That means you
> don't break even until you have 10 times more finalized objects in the
> heap than you have finalization places. I think that will be rare in
> practice, particularly in embedded systems (where heap use is often
> forbidden).

I agree with this potential problem, but I guess this issue no longer applies if you can lower the granularity of finalization as proposed above.

> I understand the worst case here. However, that case will never happen
> in practice. Why? Because the overhead of very small heap allocations
> (and the fragmentation potential) will eat you alive in nearly any real

You argue that you don't *want* it to happen. That's not enough. I know of lots of applications that use linked lists, such as :

   -->NEXT----->NEXT---->
      OBJ-->    OBJ-->

e.g. this is typical when OBJ is designating a limited type (e.g. File_Type) that can't get copied in the list.

> That is especially true since "all those nice compiler features"
> really boil down to just composition of equality and assignment.
>
> Assuming user-defined assignment is available, there will be no other
> significant difference (initializations and aggregates would be
> available where appropriate, based in some way on the user-defined
> assignment).

If you make that assumption, and that all "nice features" that go away with limitedness re-appear by the feature, then of course all would be well. But then again, what would be the difference between private types and de-limited limited types? It would just require you to add exceptions to the 83LRM restrictions on limited types.
I'me just afraid that this finalization model will stand in the way for this assumption to become true. > I think this is a fine idea, but I'm not convinced it works when a type > is sufficiently complicated. For instance, it is necessary to evaluate > the value of a discriminant before you can know what components to > initialize. > You also would have to define what happens then T'Initialize > is aborted. (see above). > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. > However, I vaguely recall some problem with this idea. Tucker? I am keeping my fingers crossed. Stef. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail ------------- !topic LSN on Finalization in Ada 9X !from 1992-10-30 22:00:55 <> !reference MS-7.4.6();4.6 !reference 92-1165.a !reference 92-1334.a !reference 92-1594.a !reference 92-1628.a !reference 92-1630.a !reference 92-1638.a !reference 92-1675.a !reference 92-1680.a !reference 92-1682.a !reference 92-1697.a !reference 92-1711.a !reference LSN-21 !reference LSN-1046 !keywords controlled types, finalization !discussion <> <> !discussion .... > This code is wrong. I fear most of your reply depends on a misunderstanding I generated. The code you call wrong was based on : > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. I tried to convey that idea right after the example by "This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early." I should have been much more explicit about re-using T'INITIALIZE for different semantics. Sorry. To be absolutely sure we don't have that misunderstanding again : SUGGESTION FOR PHASED FINALIZATION : ------------------------------------ I would propose to approach the finalization problem in 2-phases. This to give a lot of slack space for compilers that would have difficulty with the fine granularity of the finalization proposed in 92-1630.a. 
T'INITIALIZE would be a low-level initialization, completely unrelated to any explicit or implicit initialization. The programmer is informed that the compiler can call T'INITIALIZE without calling T'FINALIZE ! Therefor, T'INITIALIZE implementations that require finalization are as "wrong" as forgetting about finalization. The call T'INITIALIZE(OBJ) will guarantee that OBJ will be set to some "null" value, such that it can later be passed to T'ASSIGN and/or T'FINALIZE. A T'COPY operation (supposedly something used to assign to uninitialized objects) is always replaced by a T'INITIALIZE/T'ASSIGN combination. This is of course less efficient, but it gives the compiler freedom to group elaboration operations together, by first calling T'INITIALIZE on all stuff that needs finalization, and subsequently calling T'ASSIGN on all stuff that needs finalization and (normal, high-level) initialization. The first phase happens as if all components that need finalization also had implicit initialization, but rather than calling some "thunk", you call T'INITIALIZE. Therefor, you can't possibly be wondering if you already have your discriminants set, whether access discriminants will be set, how many components there are in arrays, whether implicit heap would be used, etc... all this is known at that time (as a direct consequence of the model). The only thing special about this first phase is that it doesn't need any finalization (by definition), but permits to finalize the complete elaborated objects/subcomponents (by definition). Both assumptions are the programmers responsibility. In practice it only means he must be smart enough to set a flag. Of course the size of these groups are limited by dependencies between declarations that need finalization, but that shouldn't be a serious problem, as a large amount of such dependencies is very rare. EXAMPLE : --------- declare OBJ : T := V1; begin ... 
OBJ := V2; end; becomes : declare OBJ : T; begin T'INITIALIZE ( OBJ ); -- Unrelated to the initialization, should be though of as associating -- with reserving the memory for OBJ. -- It would typically set OBJ to null. -- If this raises an exception nothing is done. begin -- OBJ will get finalized from now on. T'ASSIGN ( OBJ, V1 ); -- This implements the initialization. ... T'ASSIGN ( OBJ, V2 ); -- This implements the assignment. T'FINALIZE ( OBJ ); -- Normal exit exception when others => T'FINALIZE ( OBJ ); raise; -- Abnormal exit. end; end; T could be implemented as : package SIMPLE is type T is private; ... private type DESIGNATED is ... -- self-referential, inherently limited, class-wide and/or -- unconstrained, task or protected type, whatever ... type PTR_T is access DESIGNATED; type T is record PTR : PTR_T; end record; end; procedure T'INITIALIZE ( OBJ : out T ) is -- The only one called with an uninitialized actual. begin OBJ.PTR := null; end; procedure T'FINALIZE ( OBJ : in out T ) is procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, T ); begin FREE ( OBJ.PTR ); end; procedure T'ASSIGN ( LHS, RHS : in out T ) is begin T'FINALIZE ( LHS ); LHS.PTR := new DESIGNATED'( RHS ); end; SOME ALTERNATIVES : ------------------- An almost equivalent approach (suggested by Bob Duff as a second less uniform choice), would be to state that for these "user controlled" types, implicit initialization always happens (first), while any explicit initialization happens later (by calling T'ASSIGN). Bob gave this as an alternative to T'COPY, but now that (IMHO) the two-phased approach seems to bring great relief to code generation, it is perhaps more attractive that the T'INITIALIZE call. In the example above, you could simply discard T'INITIALIZE as access values get null by default. In case some compilers are intelligent about implicit initializations, this alternative may be cheaper. 
Of course, one can feel un-easy about calls to user-defined subprograms that are not easy to predict. As long as T'INITIALIZE simply sets a null value, this can't be a concern. People who define T'INITIALIZEs that do a TEXT_IO.GET_LINE are looking for non-deterministic behavior and will get what they wanted. If this un-easy feeling is yet too strong, one may dare to consider yet another approach : A compiler would be required to "zero" the memory allocated for objects that need finalization. This would give the programmer a low-level, non-portable ability to make T'FINALIZE and T'ASSIGN work with less visible initialization. REPLY TO 92-1711.a : -------------------- The rest of the comment is a detailed reply to Randy (92-1711.a). > This is true if finalization is only used to clean up memory resources. Remember that the issue was about user-defined storage allocators (92-1697.a). Those that use user-controlled features are willing to take responsibility for the correct behavior. In the first case the programmer need guarantee finalization of all heap objects. In the second case the programmer need guarantee finalization is not needed for anything but the memory. The essential factor is that I don't expect anything from the compiler. I just assume that user-defined pools are meant for programmers that are willing to take responsibility for correctness in return for more efficiency. I see some responsibility they could take, and how they could get efficiency in return, and note how the current plans refuse him that trade. > If finalization is almost never used, we A) wasted our time on it in > the first place, and B) the performance of the technique used is > irrelevant. If finalization is almost never used, and almost always *needed*, Ada9X has a problem, as the competition can validly claim they don't have this problem. I'me convinced that the current Ada83-C++ balance is to a significant extent due to the attention paid to OO in both languages. 
I think history may well repeat itself by paying insufficient attention to assignment/finalization control. > That is a very narrow view of the use of finalization. I intend to > use it on just about any resource that needs to be freed: > 1) On Text_IO.File_Type > 2) On JWindows.Window > 3) On SLock.File_Locks Yes, that's exactly what I mean. Text_IO.File_Type is typically implemented using a kind of reference (see 92-1594.a and follow-up). As the number of windows in a decent system is dynamic, I guess you need to use an access type there as well. My point is that if you can finalize an access type, you can finalize anything you want, as there is no longer a penalty in using access types. > You also forgot to finalize them in the normal case. Sorry. I was really focusing on the problem of exceptions. I assume this oversight is not a real problem. > Once these are corrected, your rearrangement no longer works: No, no. As above, I'me trying to illustrate how we could allow a compiler to do without this nesting by asking a higher price from the user (i.e. that he should be capable of setting a flag without requiring finalization, frankly I think he'll manage). > Probably true. However, the ACVC writers will think otherwise ACVC programs will have to obey the rules as anyone else, no ? If the language says you have to expect the compiler to generate calls to T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect that too. > T'INITIALIZE is a user-defined procedure. It therefore can do anything, > and the language must be prepared to deal with that. How is the language prepared to deal with the T'WRITE calling (unix) _exit? Or what if it's defining an uninitialized local and use it ? Or what about : procedure T'WRITE ... 
is SPILL_SOME : constant ACCESS_TYPE := new DESIGATED_TYPE; LOCK_ONE : TEXT_IO.FILE_TYPE; begin OPEN ( LOCK_ONE, OUT_FILE, NAME=>"" ); GLOBAL_SEMA.LOCK; end; It is the same case as a T'INITIALIZE that requires finalization : it doesn't get any and needs some. The language can deal with this in only one way : declare it erroneous or bounded error or simply not mention it and rely on common-sense not to do the above (as it now does for SPILL_SOME above). That's exactly how I would like to deal with the case of T'INITIALIZE that requires finalization too. There are hard limits to the protection you can offer by a language. If the programmer wants to make a program fail, he can, in any event, and he can make it virtually as "hard" as he likes, using user-controlled stuff (e.g. UNCHECKED_..., pragma INTERFACE, etc). > >By the way, I checked this limit on Alsys5.5.0 on HP9000-400, and > >there it is around 550. I expect this to be sufficient. > > I don't; if it is, we've failed to define finalization usefully. If the approach above is acceptable, then this shouldn't be a problem though... > If the resulting code is too large to run on the target, the fact that > the compiler can compile it is irrelevant. That would be a common > problem on 16-bit targets if PC-mapping is not used. I assume these comments are still expecting that the nested exception handlers are required. In any event, those targets will always have inherent limitations to what an application can do. Those that want to use them will have to live with these limitations (and hence not declare too many exception handlers either). This really is a separate issue. > >As illustrated above, I wouldn't mind getting any finalizable object > >or component get INITIALIZED before the enclosing object gets > >(logically) used. > > If this worked correctly, I would not have a problem. However, it > doesn't (if T'INITIALIZE for a component raises an exception, only > some components must get finalized. 
This cannot be done in a thunk [a > compiler-generated subroutine], because some exception handlers would > have to be entered, but not left, during the subroutine. Sorry, I don't see this point. If a compiler would want to avoid any unnecessary call to T'INITIALIZE, he could generate elaboration "thunks" with the nested exception handlers (assuming thunks can contain handlers). By the way, thanks all of you for sending me all those "thunk" clarifications. Could you give an example of something that can't be done ? > Since you don't know how many indexes there will be As I said, above, T'INITIALIZE is called after memory has been reserved. Surely at that time you know how many indexes there are. > Particularly because the new 11.6 will probably not define the values > of variables written within handled code (while Ada 83 does), so using > temporary variable to hold the index cannot be done Could you clarify how that would invalidate the example in 92-1630.a : procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is begin UNTIL_EXCEPTION : for I in OBJ'RANGE loop begin COMP_T'ELABORATE ( OBJ(I) ); exception when others => -- Surely I can count on I begin valid here ??? if I /= OBJ'FIRST then for J in OBJ'FIRST .. INDEX_T'PRED(I) loop COMP_T'FINALIZE( OBJ(I) ); end loop; end if; raise; end loop UNTIL_EXCEPTION; end ARRAY_T'ELABORATE; > I understand where you're coming from, but I don't agree. First, I > think most examples will use fixed finalizable objects (like > File_Type, Window, etc.) Even most 'access types' will do that -- > consider that all of your examples have done so. I'me not talking about fixed size, I'me talking about not allocating a fixed amount. E.g. Windows are typically dynamically allocated. A window can have a (linked) list of Subwindows. Per-object overhead counts for each subwindow in the list, source overhead can count only once. 
> Beyond that, your scheme potentially adds a lot of code overhead to a > system - probably 10 times as much as the MRT scheme. That means you > don't break even until you have 10 times more finalized objects in the > heap than you have finalization places. I think that will be rare in > practice, particularly in embedded systems (where heap use is often > forbidden). I agree with this potential problem, but I guess this issue no longer applies if you can lower the granularity of finalization as proposed above. > I understand the worst case here. However, that case will never happen > in practice. Why? Because the overhead of very small heap allocations > (and the fragmentation potential) will eat you alive in nearly any real You argue that you don't *want* it to happen. That's not enough. I know of lots of applications that use linked lists, such as : -->NEXT----->NEXT----> OBJ--> OBJ--> e.g. this is typical when OBJ is designating a limited type (e.g. File_Type), that can't get copied in the list. > That is especially true since "all those nice compiler features" > really boil down to just composition of equality and assignment. > > Assuming used-defined assignment is available, there will be no other > significant difference (initializations and aggregates would be > available where appropriate, based in some way on the user-defined > assignment). If you make that assumption, and that all "nice features" that go away by limited-ness re-appear by the feature, then of course, all would be well. But then again, what would be the difference between private types and de-limited limited types. It would just require you to add exceptions to the 83LRM restrictions on limited types. I'me just afraid that this finalization model will stand in the way for this assumption to become true. > I think this is a fine idea, but I'm not convinced it works when a type > is sufficiently complicated. 
For instance, it is necessary to evaluate > the value of a discriminant before you can know what components to > initialize. > You also would have to define what happens then T'Initialize > is aborted. (see above). > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. > However, I vaguely recall some problem with this idea. Tucker? I am keeping my fingers crossed. Stef. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail ------------- !topic LSN on Finalization in Ada 9X !from 1992-10-30 22:00:55 <> !reference MS-7.4.6();4.6 !reference 92-1165.a !reference 92-1334.a !reference 92-1594.a !reference 92-1628.a !reference 92-1630.a !reference 92-1638.a !reference 92-1675.a !reference 92-1680.a !reference 92-1682.a !reference 92-1697.a !reference 92-1711.a !reference LSN-21 !reference LSN-1046 !keywords controlled types, finalization !discussion <> <> !discussion .... > This code is wrong. I fear most of your reply depends on a misunderstanding I generated. The code you call wrong was based on : > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. I tried to convey that idea right after the example by "This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early." I should have been much more explicit about re-using T'INITIALIZE for different semantics. Sorry. To be absolutely sure we don't have that misunderstanding again : SUGGESTION FOR PHASED FINALIZATION : ------------------------------------ I would propose to approach the finalization problem in 2-phases. This to give a lot of slack space for compilers that would have difficulty with the fine granularity of the finalization proposed in 92-1630.a. 
T'INITIALIZE would be a low-level initialization, completely unrelated
to any explicit or implicit initialization. The programmer is informed
that the compiler can call T'INITIALIZE without calling T'FINALIZE !
Therefore, T'INITIALIZE implementations that require finalization are
as "wrong" as forgetting about finalization.

The call T'INITIALIZE(OBJ) will guarantee that OBJ will be set to some
"null" value, such that it can later be passed to T'ASSIGN and/or
T'FINALIZE. A T'COPY operation (supposedly something used to assign to
uninitialized objects) is always replaced by a T'INITIALIZE/T'ASSIGN
combination. This is of course less efficient, but it gives the
compiler freedom to group elaboration operations together, by first
calling T'INITIALIZE on all stuff that needs finalization, and
subsequently calling T'ASSIGN on all stuff that needs finalization and
(normal, high-level) initialization.

The first phase happens as if all components that need finalization
also had implicit initialization, but rather than calling some "thunk",
you call T'INITIALIZE. Therefore, you can't possibly be wondering
whether you already have your discriminants set, whether access
discriminants will be set, how many components there are in arrays,
whether implicit heap would be used, etc... all this is known at that
time (as a direct consequence of the model).

The only thing special about this first phase is that it doesn't need
any finalization (by definition), but permits finalizing the completely
elaborated objects/subcomponents (by definition). Both assumptions are
the programmer's responsibility. In practice it only means he must be
smart enough to set a flag.

Of course the sizes of these groups are limited by dependencies between
declarations that need finalization, but that shouldn't be a serious
problem, as a large number of such dependencies is very rare.

EXAMPLE :
---------

   declare
      OBJ : T := V1;
   begin
      ...
      OBJ := V2;
   end;

becomes :

   declare
      OBJ : T;
   begin
      T'INITIALIZE ( OBJ );
      -- Unrelated to the initialization; should be thought of as
      -- associated with reserving the memory for OBJ.
      -- It would typically set OBJ to null.
      -- If this raises an exception nothing is done.
      begin
         -- OBJ will get finalized from now on.
         T'ASSIGN ( OBJ, V1 );  -- This implements the initialization.
         ...
         T'ASSIGN ( OBJ, V2 );  -- This implements the assignment.
         T'FINALIZE ( OBJ );    -- Normal exit.
      exception
         when others =>
            T'FINALIZE ( OBJ ); -- Abnormal exit.
            raise;
      end;
   end;

T could be implemented as :

   package SIMPLE is
      type T is private;
      ...
   private
      type DESIGNATED is ...
      -- self-referential, inherently limited, class-wide and/or
      -- unconstrained, task or protected type, whatever ...
      type PTR_T is access DESIGNATED;
      type T is record
         PTR : PTR_T;
      end record;
   end;

   procedure T'INITIALIZE ( OBJ : out T ) is
      -- The only one called with an uninitialized actual.
   begin
      OBJ.PTR := null;
   end;

   procedure T'FINALIZE ( OBJ : in out T ) is
      procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, PTR_T );
   begin
      FREE ( OBJ.PTR );
   end;

   procedure T'ASSIGN ( LHS, RHS : in out T ) is
   begin
      T'FINALIZE ( LHS );
      LHS.PTR := new DESIGNATED'( RHS.PTR.all );
   end;

SOME ALTERNATIVES :
-------------------

An almost equivalent approach (suggested by Bob Duff as a second, less
uniform choice) would be to state that for these "user controlled"
types, implicit initialization always happens (first), while any
explicit initialization happens later (by calling T'ASSIGN). Bob gave
this as an alternative to T'COPY, but now that (IMHO) the two-phased
approach seems to bring great relief to code generation, it is perhaps
more attractive than the T'INITIALIZE call. In the example above, you
could simply discard T'INITIALIZE, as access values get null by
default. In case some compilers are intelligent about implicit
initializations, this alternative may be cheaper.
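For readers who think in C++ terms (the thread itself compares Ada with C++), here is a minimal sketch of the two-phase protocol above. The names initialize/assign/finalize/demo are hypothetical stand-ins for the proposed T'INITIALIZE/T'ASSIGN/T'FINALIZE attributes; nothing here is part of the actual proposal.

```cpp
#include <cassert>
#include <string>

// Hypothetical analogy of the proposed two-phase protocol:
// "initialize" only nulls the handle (so finalization is always safe after it);
// "assign" finalizes the old value, then deep-copies the new one;
// "finalize" releases the resource (a no-op on a null handle).
struct T {
    std::string* ptr;  // stands in for PTR_T : access DESIGNATED
};

void initialize(T& obj) { obj.ptr = nullptr; }   // ~ T'INITIALIZE

void finalize(T& obj) {                          // ~ T'FINALIZE
    delete obj.ptr;                              // deleting nullptr is a no-op
    obj.ptr = nullptr;
}

void assign(T& lhs, const T& rhs) {              // ~ T'ASSIGN
    finalize(lhs);                               // old value released first
    lhs.ptr = rhs.ptr ? new std::string(*rhs.ptr) : nullptr;
}

// Expansion of:  declare OBJ : T := V1; begin ... OBJ := V2; end;
std::string demo(const T& v1, const T& v2) {
    T obj;
    initialize(obj);   // phase 1: null value; no finalization owed yet
    assign(obj, v1);   // phase 2: implements the initialization
    assign(obj, v2);   // implements the later assignment
    std::string result = obj.ptr ? *obj.ptr : "";
    finalize(obj);     // normal exit
    return result;
}
```

Note how, exactly as argued in the text, an exception raised between initialize and the first assign can still safely call finalize, because the first phase leaves the object in a finalizable "null" state.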
Of course, one can feel uneasy about calls to user-defined subprograms
that are not easy to predict. As long as T'INITIALIZE simply sets a
null value, this can't be a concern. People who define T'INITIALIZEs
that do a TEXT_IO.GET_LINE are looking for non-deterministic behavior
and will get what they wanted.

If this uneasy feeling is still too strong, one may dare to consider
yet another approach : a compiler would be required to "zero" the
memory allocated for objects that need finalization. This would give
the programmer a low-level, non-portable ability to make T'FINALIZE and
T'ASSIGN work with less visible initialization.

REPLY TO 92-1711.a :
--------------------

The rest of the comment is a detailed reply to Randy (92-1711.a).

> This is true if finalization is only used to clean up memory resources.

Remember that the issue was about user-defined storage allocators
(92-1697.a). Those that use user-controlled features are willing to
take responsibility for the correct behavior. In the first case the
programmer needs to guarantee finalization of all heap objects. In the
second case the programmer needs to guarantee that finalization is not
needed for anything but the memory.

The essential factor is that I don't expect anything from the compiler.
I just assume that user-defined pools are meant for programmers that
are willing to take responsibility for correctness in return for more
efficiency. I see some responsibility they could take, and how they
could get efficiency in return, and note how the current plans refuse
them that trade.

> If finalization is almost never used, we A) wasted our time on it in
> the first place, and B) the performance of the technique used is
> irrelevant.

If finalization is almost never used, and almost always *needed*, Ada
9X has a problem, as the competition can validly claim they don't have
this problem. I'm convinced that the current Ada 83-C++ balance is to a
significant extent due to the attention paid to OO in both languages.
I think history may well repeat itself by paying insufficient attention
to assignment/finalization control.

> That is a very narrow view of the use of finalization. I intend to
> use it on just about any resource that needs to be freed:
> 1) On Text_IO.File_Type
> 2) On JWindows.Window
> 3) On SLock.File_Locks

Yes, that's exactly what I mean. Text_IO.File_Type is typically
implemented using a kind of reference (see 92-1594.a and follow-up). As
the number of windows in a decent system is dynamic, I guess you need
to use an access type there as well. My point is that if you can
finalize an access type, you can finalize anything you want, as there
is no longer a penalty in using access types.

> You also forgot to finalize them in the normal case.

Sorry. I was really focusing on the problem of exceptions. I assume
this oversight is not a real problem.

> Once these are corrected, your rearrangement no longer works:

No, no. As above, I'm trying to illustrate how we could allow a
compiler to do without this nesting by asking a higher price from the
user (i.e. that he should be capable of setting a flag without
requiring finalization; frankly, I think he'll manage).

> Probably true. However, the ACVC writers will think otherwise

ACVC programs will have to obey the rules like anyone else, no ? If the
language says you have to expect the compiler to generate calls to
T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect
that too.

> T'INITIALIZE is a user-defined procedure. It therefore can do anything,
> and the language must be prepared to deal with that.

How is the language prepared to deal with a T'WRITE calling (Unix)
_exit? Or one that defines an uninitialized local and uses it ? Or what
about :

   procedure T'WRITE ...
   is
      SPILL_SOME : constant ACCESS_TYPE := new DESIGNATED_TYPE;
      LOCK_ONE   : TEXT_IO.FILE_TYPE;
   begin
      OPEN ( LOCK_ONE, OUT_FILE, NAME => "" );
      GLOBAL_SEMA.LOCK;
   end;

It is the same case as a T'INITIALIZE that requires finalization : it
doesn't get any and needs some. The language can deal with this in only
one way : declare it erroneous or a bounded error, or simply not
mention it and rely on common sense not to do the above (as it now does
for SPILL_SOME above). That's exactly how I would like to deal with the
case of a T'INITIALIZE that requires finalization too.

There are hard limits to the protection a language can offer. If the
programmer wants to make a program fail, he can, in any event, and he
can make it virtually as "hard" as he likes, using user-controlled
stuff (e.g. UNCHECKED_..., pragma INTERFACE, etc).

> >By the way, I checked this limit on Alsys 5.5.0 on HP9000-400, and
> >there it is around 550. I expect this to be sufficient.
>
> I don't; if it is, we've failed to define finalization usefully.

If the approach above is acceptable, then this shouldn't be a problem
though...

> If the resulting code is too large to run on the target, the fact that
> the compiler can compile it is irrelevant. That would be a common
> problem on 16-bit targets if PC-mapping is not used.

I assume these comments are still expecting that the nested exception
handlers are required. In any event, those targets will always have
inherent limitations on what an application can do. Those that want to
use them will have to live with these limitations (and hence not
declare too many exception handlers either). This really is a separate
issue.

> >As illustrated above, I wouldn't mind getting any finalizable object
> >or component get INITIALIZED before the enclosing object gets
> >(logically) used.
>
> If this worked correctly, I would not have a problem. However, it
> doesn't (if T'INITIALIZE for a component raises an exception, only
> some components must get finalized).
> This cannot be done in a thunk [a
> compiler-generated subroutine], because some exception handlers would
> have to be entered, but not left, during the subroutine.

Sorry, I don't see this point. If a compiler wanted to avoid any
unnecessary call to T'INITIALIZE, it could generate elaboration
"thunks" with the nested exception handlers (assuming thunks can
contain handlers). By the way, thanks all of you for sending me all
those "thunk" clarifications. Could you give an example of something
that can't be done ?

> Since you don't know how many indexes there will be

As I said above, T'INITIALIZE is called after memory has been reserved.
Surely at that time you know how many indexes there are.

> Particularly because the new 11.6 will probably not define the values
> of variables written within handled code (while Ada 83 does), so using
> a temporary variable to hold the index cannot be done

Could you clarify how that would invalidate the example in 92-1630.a :

   procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is
   begin
      UNTIL_EXCEPTION : for I in OBJ'RANGE loop
         begin
            COMP_T'ELABORATE ( OBJ(I) );
         exception
            when others =>
               -- Surely I can count on I being valid here ???
               if I /= OBJ'FIRST then
                  for J in OBJ'FIRST .. INDEX_T'PRED(I) loop
                     COMP_T'FINALIZE ( OBJ(J) );
                  end loop;
               end if;
               raise;
         end;
      end loop UNTIL_EXCEPTION;
   end ARRAY_T'ELABORATE;

> I understand where you're coming from, but I don't agree. First, I
> think most examples will use fixed finalizable objects (like
> File_Type, Window, etc.) Even most 'access types' will do that --
> consider that all of your examples have done so.

I'm not talking about fixed size, I'm talking about not allocating a
fixed amount. E.g. windows are typically dynamically allocated. A
window can have a (linked) list of subwindows. Per-object overhead
counts for each subwindow in the list, while source overhead counts
only once.
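The partial-rollback discipline in the ARRAY_T'ELABORATE example above (finalize only the components elaborated before the one that raised, then re-raise) can be sketched in C++ as follows. All names here are illustrative; a Tracker simply records which indices got cleaned up so the behavior can be observed.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// Records which component indices were finalized during rollback.
struct Tracker {
    std::vector<int> finalized;
};

// Elaborating component `fail_at` raises; all others succeed.
void elaborate_component(int i, int fail_at) {
    if (i == fail_at) throw std::runtime_error("elaboration failed");
}

void finalize_component(int i, Tracker& t) { t.finalized.push_back(i); }

// Analogy of ARRAY_T'ELABORATE: elaborate components in index order; if one
// raises, finalize only the already-elaborated ones, then re-raise.
void elaborate_array(int n, int fail_at, Tracker& t) {
    for (int i = 0; i < n; ++i) {
        try {
            elaborate_component(i, fail_at);
        } catch (...) {
            // Like "for J in OBJ'FIRST .. INDEX_T'PRED(I)": strictly before i.
            for (int j = 0; j < i; ++j)
                finalize_component(j, t);
            throw;  // corresponds to "raise;"
        }
    }
}
```

As in the Ada sketch, the loop index is still valid inside the handler, which is the property the quoted objection about the new 11.6 rules calls into question.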
> Beyond that, your scheme potentially adds a lot of code overhead to a
> system - probably 10 times as much as the MRT scheme. That means you
> don't break even until you have 10 times more finalized objects in the
> heap than you have finalization places. I think that will be rare in
> practice, particularly in embedded systems (where heap use is often
> forbidden).

I agree with this potential problem, but I guess this issue no longer
applies if you can lower the granularity of finalization as proposed
above.

> I understand the worst case here. However, that case will never happen
> in practice. Why? Because the overhead of very small heap allocations
> (and the fragmentation potential) will eat you alive in nearly any real

You argue that you don't *want* it to happen. That's not enough. I know
of lots of applications that use linked lists, such as :

   -->NEXT----->NEXT---->
      OBJ-->     OBJ-->

e.g. this is typical when OBJ designates a limited type (e.g.
File_Type), which can't be copied in the list.

> That is especially true since "all those nice compiler features"
> really boil down to just composition of equality and assignment.
>
> Assuming user-defined assignment is available, there will be no other
> significant difference (initializations and aggregates would be
> available where appropriate, based in some way on the user-defined
> assignment).

If you make that assumption, and that all the "nice features" that go
away with limited-ness re-appear with the feature, then of course all
would be well. But then again, what would be the difference between
private types and de-limited limited types? It would just require you
to add exceptions to the 83LRM restrictions on limited types. I'm just
afraid that this finalization model will stand in the way of this
assumption becoming true.

> I think this is a fine idea, but I'm not convinced it works when a type
> is sufficiently complicated.
For instance, it is necessary to evaluate > the value of a discriminant before you can know what components to > initialize. > You also would have to define what happens then T'Initialize > is aborted. (see above). > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. > However, I vaguely recall some problem with this idea. Tucker? I am keeping my fingers crossed. Stef. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail ------------- !topic LSN on Finalization in Ada 9X !from 1992-10-30 22:00:55 <> !reference MS-7.4.6();4.6 !reference 92-1165.a !reference 92-1334.a !reference 92-1594.a !reference 92-1628.a !reference 92-1630.a !reference 92-1638.a !reference 92-1675.a !reference 92-1680.a !reference 92-1682.a !reference 92-1697.a !reference 92-1711.a !reference LSN-21 !reference LSN-1046 !keywords controlled types, finalization !discussion <> <> !discussion .... > This code is wrong. I fear most of your reply depends on a misunderstanding I generated. The code you call wrong was based on : > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. I tried to convey that idea right after the example by "This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early." I should have been much more explicit about re-using T'INITIALIZE for different semantics. Sorry. To be absolutely sure we don't have that misunderstanding again : SUGGESTION FOR PHASED FINALIZATION : ------------------------------------ I would propose to approach the finalization problem in 2-phases. This to give a lot of slack space for compilers that would have difficulty with the fine granularity of the finalization proposed in 92-1630.a. 
T'INITIALIZE would be a low-level initialization, completely unrelated to any explicit or implicit initialization. The programmer is informed that the compiler can call T'INITIALIZE without calling T'FINALIZE ! Therefor, T'INITIALIZE implementations that require finalization are as "wrong" as forgetting about finalization. The call T'INITIALIZE(OBJ) will guarantee that OBJ will be set to some "null" value, such that it can later be passed to T'ASSIGN and/or T'FINALIZE. A T'COPY operation (supposedly something used to assign to uninitialized objects) is always replaced by a T'INITIALIZE/T'ASSIGN combination. This is of course less efficient, but it gives the compiler freedom to group elaboration operations together, by first calling T'INITIALIZE on all stuff that needs finalization, and subsequently calling T'ASSIGN on all stuff that needs finalization and (normal, high-level) initialization. The first phase happens as if all components that need finalization also had implicit initialization, but rather than calling some "thunk", you call T'INITIALIZE. Therefor, you can't possibly be wondering if you already have your discriminants set, whether access discriminants will be set, how many components there are in arrays, whether implicit heap would be used, etc... all this is known at that time (as a direct consequence of the model). The only thing special about this first phase is that it doesn't need any finalization (by definition), but permits to finalize the complete elaborated objects/subcomponents (by definition). Both assumptions are the programmers responsibility. In practice it only means he must be smart enough to set a flag. Of course the size of these groups are limited by dependencies between declarations that need finalization, but that shouldn't be a serious problem, as a large amount of such dependencies is very rare. EXAMPLE : --------- declare OBJ : T := V1; begin ... 
OBJ := V2; end; becomes : declare OBJ : T; begin T'INITIALIZE ( OBJ ); -- Unrelated to the initialization, should be though of as associating -- with reserving the memory for OBJ. -- It would typically set OBJ to null. -- If this raises an exception nothing is done. begin -- OBJ will get finalized from now on. T'ASSIGN ( OBJ, V1 ); -- This implements the initialization. ... T'ASSIGN ( OBJ, V2 ); -- This implements the assignment. T'FINALIZE ( OBJ ); -- Normal exit exception when others => T'FINALIZE ( OBJ ); raise; -- Abnormal exit. end; end; T could be implemented as : package SIMPLE is type T is private; ... private type DESIGNATED is ... -- self-referential, inherently limited, class-wide and/or -- unconstrained, task or protected type, whatever ... type PTR_T is access DESIGNATED; type T is record PTR : PTR_T; end record; end; procedure T'INITIALIZE ( OBJ : out T ) is -- The only one called with an uninitialized actual. begin OBJ.PTR := null; end; procedure T'FINALIZE ( OBJ : in out T ) is procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, T ); begin FREE ( OBJ.PTR ); end; procedure T'ASSIGN ( LHS, RHS : in out T ) is begin T'FINALIZE ( LHS ); LHS.PTR := new DESIGNATED'( RHS ); end; SOME ALTERNATIVES : ------------------- An almost equivalent approach (suggested by Bob Duff as a second less uniform choice), would be to state that for these "user controlled" types, implicit initialization always happens (first), while any explicit initialization happens later (by calling T'ASSIGN). Bob gave this as an alternative to T'COPY, but now that (IMHO) the two-phased approach seems to bring great relief to code generation, it is perhaps more attractive that the T'INITIALIZE call. In the example above, you could simply discard T'INITIALIZE as access values get null by default. In case some compilers are intelligent about implicit initializations, this alternative may be cheaper. 
Of course, one can feel un-easy about calls to user-defined subprograms that are not easy to predict. As long as T'INITIALIZE simply sets a null value, this can't be a concern. People who define T'INITIALIZEs that do a TEXT_IO.GET_LINE are looking for non-deterministic behavior and will get what they wanted. If this un-easy feeling is yet too strong, one may dare to consider yet another approach : A compiler would be required to "zero" the memory allocated for objects that need finalization. This would give the programmer a low-level, non-portable ability to make T'FINALIZE and T'ASSIGN work with less visible initialization. REPLY TO 92-1711.a : -------------------- The rest of the comment is a detailed reply to Randy (92-1711.a). > This is true if finalization is only used to clean up memory resources. Remember that the issue was about user-defined storage allocators (92-1697.a). Those that use user-controlled features are willing to take responsibility for the correct behavior. In the first case the programmer need guarantee finalization of all heap objects. In the second case the programmer need guarantee finalization is not needed for anything but the memory. The essential factor is that I don't expect anything from the compiler. I just assume that user-defined pools are meant for programmers that are willing to take responsibility for correctness in return for more efficiency. I see some responsibility they could take, and how they could get efficiency in return, and note how the current plans refuse him that trade. > If finalization is almost never used, we A) wasted our time on it in > the first place, and B) the performance of the technique used is > irrelevant. If finalization is almost never used, and almost always *needed*, Ada9X has a problem, as the competition can validly claim they don't have this problem. I'me convinced that the current Ada83-C++ balance is to a significant extent due to the attention paid to OO in both languages. 
I think history may well repeat itself by paying insufficient attention to assignment/finalization control. > That is a very narrow view of the use of finalization. I intend to > use it on just about any resource that needs to be freed: > 1) On Text_IO.File_Type > 2) On JWindows.Window > 3) On SLock.File_Locks Yes, that's exactly what I mean. Text_IO.File_Type is typically implemented using a kind of reference (see 92-1594.a and follow-up). As the number of windows in a decent system is dynamic, I guess you need to use an access type there as well. My point is that if you can finalize an access type, you can finalize anything you want, as there is no longer a penalty in using access types. > You also forgot to finalize them in the normal case. Sorry. I was really focusing on the problem of exceptions. I assume this oversight is not a real problem. > Once these are corrected, your rearrangement no longer works: No, no. As above, I'me trying to illustrate how we could allow a compiler to do without this nesting by asking a higher price from the user (i.e. that he should be capable of setting a flag without requiring finalization, frankly I think he'll manage). > Probably true. However, the ACVC writers will think otherwise ACVC programs will have to obey the rules as anyone else, no ? If the language says you have to expect the compiler to generate calls to T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect that too. > T'INITIALIZE is a user-defined procedure. It therefore can do anything, > and the language must be prepared to deal with that. How is the language prepared to deal with the T'WRITE calling (unix) _exit? Or what if it's defining an uninitialized local and use it ? Or what about : procedure T'WRITE ... 
is SPILL_SOME : constant ACCESS_TYPE := new DESIGATED_TYPE; LOCK_ONE : TEXT_IO.FILE_TYPE; begin OPEN ( LOCK_ONE, OUT_FILE, NAME=>"" ); GLOBAL_SEMA.LOCK; end; It is the same case as a T'INITIALIZE that requires finalization : it doesn't get any and needs some. The language can deal with this in only one way : declare it erroneous or bounded error or simply not mention it and rely on common-sense not to do the above (as it now does for SPILL_SOME above). That's exactly how I would like to deal with the case of T'INITIALIZE that requires finalization too. There are hard limits to the protection you can offer by a language. If the programmer wants to make a program fail, he can, in any event, and he can make it virtually as "hard" as he likes, using user-controlled stuff (e.g. UNCHECKED_..., pragma INTERFACE, etc). > >By the way, I checked this limit on Alsys5.5.0 on HP9000-400, and > >there it is around 550. I expect this to be sufficient. > > I don't; if it is, we've failed to define finalization usefully. If the approach above is acceptable, then this shouldn't be a problem though... > If the resulting code is too large to run on the target, the fact that > the compiler can compile it is irrelevant. That would be a common > problem on 16-bit targets if PC-mapping is not used. I assume these comments are still expecting that the nested exception handlers are required. In any event, those targets will always have inherent limitations to what an application can do. Those that want to use them will have to live with these limitations (and hence not declare too many exception handlers either). This really is a separate issue. > >As illustrated above, I wouldn't mind getting any finalizable object > >or component get INITIALIZED before the enclosing object gets > >(logically) used. > > If this worked correctly, I would not have a problem. However, it > doesn't (if T'INITIALIZE for a component raises an exception, only > some components must get finalized. 
This cannot be done in a thunk [a > compiler-generated subroutine], because some exception handlers would > have to be entered, but not left, during the subroutine. Sorry, I don't see this point. If a compiler would want to avoid any unnecessary call to T'INITIALIZE, he could generate elaboration "thunks" with the nested exception handlers (assuming thunks can contain handlers). By the way, thanks all of you for sending me all those "thunk" clarifications. Could you give an example of something that can't be done ? > Since you don't know how many indexes there will be As I said, above, T'INITIALIZE is called after memory has been reserved. Surely at that time you know how many indexes there are. > Particularly because the new 11.6 will probably not define the values > of variables written within handled code (while Ada 83 does), so using > temporary variable to hold the index cannot be done Could you clarify how that would invalidate the example in 92-1630.a : procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is begin UNTIL_EXCEPTION : for I in OBJ'RANGE loop begin COMP_T'ELABORATE ( OBJ(I) ); exception when others => -- Surely I can count on I begin valid here ??? if I /= OBJ'FIRST then for J in OBJ'FIRST .. INDEX_T'PRED(I) loop COMP_T'FINALIZE( OBJ(I) ); end loop; end if; raise; end loop UNTIL_EXCEPTION; end ARRAY_T'ELABORATE; > I understand where you're coming from, but I don't agree. First, I > think most examples will use fixed finalizable objects (like > File_Type, Window, etc.) Even most 'access types' will do that -- > consider that all of your examples have done so. I'me not talking about fixed size, I'me talking about not allocating a fixed amount. E.g. Windows are typically dynamically allocated. A window can have a (linked) list of Subwindows. Per-object overhead counts for each subwindow in the list, source overhead can count only once. 
> Beyond that, your scheme potentially adds a lot of code overhead to a > system - probably 10 times as much as the MRT scheme. That means you > don't break even until you have 10 times more finalized objects in the > heap than you have finalization places. I think that will be rare in > practice, particularly in embedded systems (where heap use is often > forbidden). I agree with this potential problem, but I guess this issue no longer applies if you can lower the granularity of finalization as proposed above. > I understand the worst case here. However, that case will never happen > in practice. Why? Because the overhead of very small heap allocations > (and the fragmentation potential) will eat you alive in nearly any real You argue that you don't *want* it to happen. That's not enough. I know of lots of applications that use linked lists, such as : -->NEXT----->NEXT----> OBJ--> OBJ--> e.g. this is typical when OBJ is designating a limited type (e.g. File_Type), that can't get copied in the list. > That is especially true since "all those nice compiler features" > really boil down to just composition of equality and assignment. > > Assuming used-defined assignment is available, there will be no other > significant difference (initializations and aggregates would be > available where appropriate, based in some way on the user-defined > assignment). If you make that assumption, and that all "nice features" that go away by limited-ness re-appear by the feature, then of course, all would be well. But then again, what would be the difference between private types and de-limited limited types. It would just require you to add exceptions to the 83LRM restrictions on limited types. I'me just afraid that this finalization model will stand in the way for this assumption to become true. > I think this is a fine idea, but I'm not convinced it works when a type > is sufficiently complicated. 
For instance, it is necessary to evaluate > the value of a discriminant before you can know what components to > initialize. > You also would have to define what happens then T'Initialize > is aborted. (see above). > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. > However, I vaguely recall some problem with this idea. Tucker? I am keeping my fingers crossed. Stef. %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% End of returned mail ------------- !topic LSN on Finalization in Ada 9X !from 1992-10-30 22:00:55 <> !reference MS-7.4.6();4.6 !reference 92-1165.a !reference 92-1334.a !reference 92-1594.a !reference 92-1628.a !reference 92-1630.a !reference 92-1638.a !reference 92-1675.a !reference 92-1680.a !reference 92-1682.a !reference 92-1697.a !reference 92-1711.a !reference LSN-21 !reference LSN-1046 !keywords controlled types, finalization !discussion .... > This code is wrong. I fear most of your reply depends on a misunderstanding I generated. The code you call wrong was based on : > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. I tried to convey that idea right after the example by "This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early." I should have been much more explicit about re-using T'INITIALIZE for different semantics. Sorry. To be absolutely sure we don't have that misunderstanding again : SUGGESTION FOR PHASED FINALIZATION : ------------------------------------ I would propose to approach the finalization problem in 2-phases. This to give a lot of slack space for compilers that would have difficulty with the fine granularity of the finalization proposed in 92-1630.a. T'INITIALIZE would be a low-level initialization, completely unrelated to any explicit or implicit initialization. 
The programmer is informed that the compiler can call T'INITIALIZE without calling T'FINALIZE ! Therefor, T'INITIALIZE implementations that require finalization are as "wrong" as forgetting about finalization. The call T'INITIALIZE(OBJ) will guarantee that OBJ will be set to some "null" value, such that it can later be passed to T'ASSIGN and/or T'FINALIZE. A T'COPY operation (supposedly something used to assign to uninitialized objects) is always replaced by a T'INITIALIZE/T'ASSIGN combination. This is of course less efficient, but it gives the compiler freedom to group elaboration operations together, by first calling T'INITIALIZE on all stuff that needs finalization, and subsequently calling T'ASSIGN on all stuff that needs finalization and (normal, high-level) initialization. The first phase happens as if all components that need finalization also had implicit initialization, but rather than calling some "thunk", you call T'INITIALIZE. Therefor, you can't possibly be wondering if you already have your discriminants set, whether access discriminants will be set, how many components there are in arrays, whether implicit heap would be used, etc... all this is known at that time (as a direct consequence of the model). The only thing special about this first phase is that it doesn't need any finalization (by definition), but permits to finalize the complete elaborated objects/subcomponents (by definition). Both assumptions are the programmers responsibility. In practice it only means he must be smart enough to set a flag. Of course the size of these groups are limited by dependencies between declarations that need finalization, but that shouldn't be a serious problem, as a large amount of such dependencies is very rare. EXAMPLE : --------- declare OBJ : T := V1; begin ... OBJ := V2; end; becomes : declare OBJ : T; begin T'INITIALIZE ( OBJ ); -- Unrelated to the initialization, should be though of as associating -- with reserving the memory for OBJ. 
      -- It would typically set OBJ to null.
      -- If this raises an exception nothing is done.
      begin
         -- OBJ will get finalized from now on.
         T'ASSIGN ( OBJ, V1 );  -- This implements the initialization.
         ...
         T'ASSIGN ( OBJ, V2 );  -- This implements the assignment.
         T'FINALIZE ( OBJ );    -- Normal exit.
      exception
         when others =>
            T'FINALIZE ( OBJ ); -- Abnormal exit.
            raise;
      end;
   end;

T could be implemented as :

   package SIMPLE is
      type T is private;
      ...
   private
      type DESIGNATED is ...
      -- self-referential, inherently limited, class-wide and/or
      -- unconstrained, task or protected type, whatever ...
      type PTR_T is access DESIGNATED;
      type T is record
         PTR : PTR_T;
      end record;
   end;

   procedure T'INITIALIZE ( OBJ : out T ) is
      -- The only one called with an uninitialized actual.
   begin
      OBJ.PTR := null;
   end;

   procedure T'FINALIZE ( OBJ : in out T ) is
      procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, PTR_T );
   begin
      FREE ( OBJ.PTR );
   end;

   procedure T'ASSIGN ( LHS, RHS : in out T ) is
   begin
      T'FINALIZE ( LHS );
      LHS.PTR := new DESIGNATED'( RHS.PTR.all );
   end;

SOME ALTERNATIVES :
-------------------

An almost equivalent approach (suggested by Bob Duff as a second, less uniform
choice) would be to state that for these "user controlled" types, implicit
initialization always happens (first), while any explicit initialization
happens later (by calling T'ASSIGN). Bob gave this as an alternative to
T'COPY, but now that (IMHO) the two-phased approach seems to bring great
relief to code generation, it is perhaps more attractive than the T'INITIALIZE
call. In the example above, you could simply discard T'INITIALIZE, as access
values get null by default. In case some compilers are intelligent about
implicit initializations, this alternative may be cheaper.

Of course, one can feel uneasy about calls to user-defined subprograms that
are not easy to predict. As long as T'INITIALIZE simply sets a null value,
this can't be a concern.
People who define T'INITIALIZEs that do a TEXT_IO.GET_LINE are looking for
non-deterministic behavior and will get what they wanted.

If this uneasy feeling is still too strong, one may dare to consider yet
another approach: a compiler would be required to "zero" the memory allocated
for objects that need finalization. This would give the programmer a
low-level, non-portable ability to make T'FINALIZE and T'ASSIGN work with less
visible initialization.

REPLY TO 92-1711.a :
--------------------

The rest of the comment is a detailed reply to Randy (92-1711.a).

> This is true if finalization is only used to clean up memory resources.

Remember that the issue was about user-defined storage allocators (92-1697.a).
Those that use user-controlled features are willing to take responsibility for
the correct behavior. In the first case the programmer needs to guarantee
finalization of all heap objects. In the second case the programmer needs to
guarantee that finalization is not needed for anything but the memory.

The essential factor is that I don't expect anything from the compiler. I just
assume that user-defined pools are meant for programmers that are willing to
take responsibility for correctness in return for more efficiency. I see some
responsibility they could take, and how they could get efficiency in return,
and note how the current plans refuse them that trade.

> If finalization is almost never used, we A) wasted our time on it in
> the first place, and B) the performance of the technique used is
> irrelevant.

If finalization is almost never used, and almost always *needed*, Ada 9X has a
problem, as the competition can validly claim they don't have this problem.
I'm convinced that the current Ada83-C++ balance is to a significant extent
due to the attention paid to OO in both languages. I think history may well
repeat itself by paying insufficient attention to assignment/finalization
control.

> That is a very narrow view of the use of finalization.
> I intend to
> use it on just about any resource that needs to be freed:
> 1) On Text_IO.File_Type
> 2) On JWindows.Window
> 3) On SLock.File_Locks

Yes, that's exactly what I mean. Text_IO.File_Type is typically implemented
using a kind of reference (see 92-1594.a and follow-up). As the number of
windows in a decent system is dynamic, I guess you need to use an access type
there as well. My point is that if you can finalize an access type, you can
finalize anything you want, as there is no longer a penalty in using access
types.

> You also forgot to finalize them in the normal case.

Sorry. I was really focusing on the problem of exceptions. I assume this
oversight is not a real problem.

> Once these are corrected, your rearrangement no longer works:

No, no. As above, I'm trying to illustrate how we could allow a compiler to do
without this nesting by asking a higher price from the user (i.e. that he
should be capable of setting a flag without requiring finalization; frankly, I
think he'll manage).

> Probably true. However, the ACVC writers will think otherwise

ACVC programs will have to obey the rules like anyone else, no? If the
language says you have to expect the compiler to generate calls to
T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect that too.

> T'INITIALIZE is a user-defined procedure. It therefore can do anything,
> and the language must be prepared to deal with that.

How is the language prepared to deal with a T'WRITE calling (unix) _exit? Or
what if it defines an uninitialized local and uses it? Or what about:

   procedure T'WRITE ... is
      SPILL_SOME : constant ACCESS_TYPE := new DESIGNATED_TYPE;
      LOCK_ONE   : TEXT_IO.FILE_TYPE;
   begin
      OPEN ( LOCK_ONE, OUT_FILE, NAME => "" );
      GLOBAL_SEMA.LOCK;
   end;

It is the same case as a T'INITIALIZE that requires finalization: it doesn't
get any and needs some.
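To make that symmetry concrete, here is a sketch of such a "wrong"
T'INITIALIZE, in the proposal's hypothetical attribute-subprogram notation
(the LOG and SCRATCH components are invented here purely for illustration;
they are not part of the SIMPLE example above):

   procedure T'INITIALIZE ( OBJ : out T ) is
      -- A "wrong" T'INITIALIZE: besides setting the null value, it
      -- acquires resources of its own. Since the compiler may call
      -- T'INITIALIZE without ever calling T'FINALIZE, the file and the
      -- allocation below can leak -- just like SPILL_SOME and LOCK_ONE
      -- in the T'WRITE above.
   begin
      OBJ.PTR := null;                -- the only legitimate job
      TEXT_IO.CREATE ( OBJ.LOG );     -- needs finalization : "wrong"
      OBJ.SCRATCH := new DESIGNATED;  -- likewise "wrong"
   end;

Exactly as with the T'WRITE case, nothing will ever release LOCK_ONE-style
resources acquired here, which is why such implementations are the
programmer's error, not the model's.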
The language can deal with this in only one way: declare it erroneous or a
bounded error, or simply not mention it and rely on common sense not to do the
above (as it now does for SPILL_SOME above). That's exactly how I would like
to deal with the case of a T'INITIALIZE that requires finalization too.

There are hard limits to the protection a language can offer. If the
programmer wants to make a program fail, he can, in any event, and he can make
it virtually as "hard" as he likes, using user-controlled stuff (e.g.
UNCHECKED_..., pragma INTERFACE, etc.).

> >By the way, I checked this limit on Alsys5.5.0 on HP9000-400, and
> >there it is around 550. I expect this to be sufficient.
>
> I don't; if it is, we've failed to define finalization usefully.

If the approach above is acceptable, then this shouldn't be a problem
though...

> If the resulting code is too large to run on the target, the fact that
> the compiler can compile it is irrelevant. That would be a common
> problem on 16-bit targets if PC-mapping is not used.

I assume these comments are still expecting that the nested exception handlers
are required. In any event, those targets will always have inherent
limitations on what an application can do. Those that want to use them will
have to live with these limitations (and hence not declare too many exception
handlers either). This really is a separate issue.

> >As illustrated above, I wouldn't mind getting any finalizable object
> >or component INITIALIZED before the enclosing object gets
> >(logically) used.
>
> If this worked correctly, I would not have a problem. However, it
> doesn't (if T'INITIALIZE for a component raises an exception, only
> some components must get finalized). This cannot be done in a thunk [a
> compiler-generated subroutine], because some exception handlers would
> have to be entered, but not left, during the subroutine.

Sorry, I don't see this point.
If a compiler wanted to avoid any unnecessary call to T'INITIALIZE, it could
generate elaboration "thunks" with the nested exception handlers (assuming
thunks can contain handlers). By the way, thanks to all of you for sending me
all those "thunk" clarifications. Could you give an example of something that
can't be done?

> Since you don't know how many indexes there will be

As I said above, T'INITIALIZE is called after memory has been reserved. Surely
at that time you know how many indexes there are.

> Particularly because the new 11.6 will probably not define the values
> of variables written within handled code (while Ada 83 does), so using
> a temporary variable to hold the index cannot be done

Could you clarify how that would invalidate the example in 92-1630.a:

   procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is
   begin
      UNTIL_EXCEPTION :
      for I in OBJ'RANGE loop
         begin
            COMP_T'ELABORATE ( OBJ(I) );
         exception
            when others =>
               -- Surely I can count on I being valid here ???
               if I /= OBJ'FIRST then
                  for J in OBJ'FIRST .. INDEX_T'PRED(I) loop
                     COMP_T'FINALIZE ( OBJ(J) );
                  end loop;
               end if;
               raise;
         end;
      end loop UNTIL_EXCEPTION;
   end ARRAY_T'ELABORATE;

> I understand where you're coming from, but I don't agree. First, I
> think most examples will use fixed finalizable objects (like
> File_Type, Window, etc.) Even most 'access types' will do that --
> consider that all of your examples have done so.

I'm not talking about fixed size, I'm talking about not allocating a fixed
amount. E.g. windows are typically dynamically allocated. A window can have a
(linked) list of subwindows. Per-object overhead counts for each subwindow in
the list; source overhead counts only once.

> Beyond that, your scheme potentially adds a lot of code overhead to a
> system - probably 10 times as much as the MRT scheme. That means you
> don't break even until you have 10 times more finalized objects in the
> heap than you have finalization places.
> I think that will be rare in
> practice, particularly in embedded systems (where heap use is often
> forbidden).

I agree with this potential problem, but I guess this issue no longer applies
if you can lower the granularity of finalization as proposed above.

> I understand the worst case here. However, that case will never happen
> in practice. Why? Because the overhead of very small heap allocations
> (and the fragmentation potential) will eat you alive in nearly any real

You argue that you don't *want* it to happen. That's not enough. I know of
lots of applications that use linked lists, such as:

   -->NEXT----->NEXT---->
      OBJ-->     OBJ-->

e.g. this is typical when OBJ designates a limited type (e.g. File_Type) that
can't get copied into the list.

> That is especially true since "all those nice compiler features"
> really boil down to just composition of equality and assignment.
>
> Assuming user-defined assignment is available, there will be no other
> significant difference (initializations and aggregates would be
> available where appropriate, based in some way on the user-defined
> assignment).

If you make that assumption, and that all the "nice features" that go away
with limited-ness re-appear with the feature, then of course all would be
well. But then again, what would be the difference between private types and
de-limited limited types? It would just require you to add exceptions to the
83LRM restrictions on limited types. I'm just afraid that this finalization
model will stand in the way of this assumption becoming true.

> I think this is a fine idea, but I'm not convinced it works when a type
> is sufficiently complicated. For instance, it is necessary to evaluate
> the value of a discriminant before you can know what components to
> initialize.
> You also would have to define what happens when T'Initialize
> is aborted. (see above).
>
> >How about doing T'INITIALIZE early and T'ASSIGN later (as above).
>
> I tend to agree here.
> However, I vaguely recall some problem with this idea. Tucker?

I am keeping my fingers crossed.

Stef.
> If the resulting code is too large to run on the target, the fact that > the compiler can compile it is irrelevant. That would be a common > problem on 16-bit targets if PC-mapping is not used. I assume these comments are still expecting that the nested exception handlers are required. In any event, those targets will always have inherent limitations to what an application can do. Those that want to use them will have to live with these limitations (and hence not declare too many exception handlers either). This really is a separate issue. > >As illustrated above, I wouldn't mind getting any finalizable object > >or component get INITIALIZED before the enclosing object gets > >(logically) used. > > If this worked correctly, I would not have a problem. However, it > doesn't (if T'INITIALIZE for a component raises an exception, only > some components must get finalized. This cannot be done in a thunk [a > compiler-generated subroutine], because some exception handlers would > have to be entered, but not left, during the subroutine. Sorry, I don't see this point. If a compiler would want to avoid any unnecessary call to T'INITIALIZE, he could generate elaboration "thunks" with the nested exception handlers (assuming thunks can contain handlers). By the way, thanks all of you for sending me all those "thunk" clarifications. Could you give an example of something that can't be done ? > Since you don't know how many indexes there will be As I said, above, T'INITIALIZE is called after memory has been reserved. Surely at that time you know how many indexes there are. 
> Particularly because the new 11.6 will probably not define the values > of variables written within handled code (while Ada 83 does), so using > temporary variable to hold the index cannot be done Could you clarify how that would invalidate the example in 92-1630.a : procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is begin UNTIL_EXCEPTION : for I in OBJ'RANGE loop begin COMP_T'ELABORATE ( OBJ(I) ); exception when others => -- Surely I can count on I begin valid here ??? if I /= OBJ'FIRST then for J in OBJ'FIRST .. INDEX_T'PRED(I) loop COMP_T'FINALIZE( OBJ(I) ); end loop; end if; raise; end loop UNTIL_EXCEPTION; end ARRAY_T'ELABORATE; > I understand where you're coming from, but I don't agree. First, I > think most examples will use fixed finalizable objects (like > File_Type, Window, etc.) Even most 'access types' will do that -- > consider that all of your examples have done so. I'me not talking about fixed size, I'me talking about not allocating a fixed amount. E.g. Windows are typically dynamically allocated. A window can have a (linked) list of Subwindows. Per-object overhead counts for each subwindow in the list, source overhead can count only once. > Beyond that, your scheme potentially adds a lot of code overhead to a > system - probably 10 times as much as the MRT scheme. That means you > don't break even until you have 10 times more finalized objects in the > heap than you have finalization places. I think that will be rare in > practice, particularly in embedded systems (where heap use is often > forbidden). I agree with this potential problem, but I guess this issue no longer applies if you can lower the granularity of finalization as proposed above. > I understand the worst case here. However, that case will never happen > in practice. Why? Because the overhead of very small heap allocations > (and the fragmentation potential) will eat you alive in nearly any real You argue that you don't *want* it to happen. That's not enough. 
I know of lots of applications that use linked lists, such as : -->NEXT----->NEXT----> OBJ--> OBJ--> e.g. this is typical when OBJ is designating a limited type (e.g. File_Type), that can't get copied in the list. > That is especially true since "all those nice compiler features" > really boil down to just composition of equality and assignment. > > Assuming used-defined assignment is available, there will be no other > significant difference (initializations and aggregates would be > available where appropriate, based in some way on the user-defined > assignment). If you make that assumption, and that all "nice features" that go away by limited-ness re-appear by the feature, then of course, all would be well. But then again, what would be the difference between private types and de-limited limited types. It would just require you to add exceptions to the 83LRM restrictions on limited types. I'me just afraid that this finalization model will stand in the way for this assumption to become true. > I think this is a fine idea, but I'm not convinced it works when a type > is sufficiently complicated. For instance, it is necessary to evaluate > the value of a discriminant before you can know what components to > initialize. > You also would have to define what happens then T'Initialize > is aborted. (see above). > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. > However, I vaguely recall some problem with this idea. Tucker? I am keeping my fingers crossed. Stef. 
-------------
!topic LSN on Finalization in Ada 9X
!from 1992-10-30 22:00:55 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1594.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference 92-1711.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

....

> This code is wrong.

I fear most of your reply depends on a misunderstanding I generated. The
code you call wrong was based on:

> >How about doing T'INITIALIZE early and T'ASSIGN later (as above).
>
> I tend to agree here.

I tried to convey that idea right after the example by "This does of
course rely on nobody being shocked by a T'INITIALIZE call that happens
early." I should have been much more explicit about reusing T'INITIALIZE
with different semantics. Sorry. To be absolutely sure we don't have that
misunderstanding again:

SUGGESTION FOR PHASED FINALIZATION :
------------------------------------

I propose to approach the finalization problem in two phases. This gives
plenty of slack to compilers that would have difficulty with the fine
granularity of the finalization proposed in 92-1630.a.

T'INITIALIZE would be a low-level initialization, completely unrelated to
any explicit or implicit initialization. The programmer is informed that
the compiler can call T'INITIALIZE without calling T'FINALIZE!
Therefore, T'INITIALIZE implementations that require finalization are as
"wrong" as forgetting about finalization. The call T'INITIALIZE(OBJ)
guarantees that OBJ is set to some "null" value, such that it can later
be passed to T'ASSIGN and/or T'FINALIZE. A T'COPY operation (supposedly
something used to assign to uninitialized objects) is always replaced by
a T'INITIALIZE/T'ASSIGN combination.
This is of course less efficient, but it gives the compiler freedom to
group elaboration operations together: first calling T'INITIALIZE on
everything that needs finalization, and subsequently calling T'ASSIGN on
everything that needs finalization and (normal, high-level)
initialization. The first phase happens as if all components that need
finalization also had implicit initialization, but rather than calling
some "thunk", you call T'INITIALIZE.

Therefore, you can't possibly be wondering whether you already have your
discriminants set, whether access discriminants will be set, how many
components there are in arrays, whether implicit heap would be used,
etc. -- all this is known at that time (as a direct consequence of the
model). The only thing special about this first phase is that it needs no
finalization itself (by definition), but permits finalizing the
completely elaborated objects/subcomponents (by definition). Both
assumptions are the programmer's responsibility. In practice it only
means the programmer must be smart enough to set a flag. Of course the
sizes of these groups are limited by dependencies between declarations
that need finalization, but that shouldn't be a serious problem, as a
large number of such dependencies is very rare.

EXAMPLE :
---------

   declare
      OBJ : T := V1;
   begin
      ...
      OBJ := V2;
   end;

becomes:

   declare
      OBJ : T;
   begin
      T'INITIALIZE ( OBJ );
      -- Unrelated to the initialization; should be thought of as
      -- associated with reserving the memory for OBJ.
      -- It would typically set OBJ to null.
      -- If this raises an exception nothing is done.
      begin
         -- OBJ will get finalized from now on.
         T'ASSIGN ( OBJ, V1 );  -- This implements the initialization.
         ...
         T'ASSIGN ( OBJ, V2 );  -- This implements the assignment.
         T'FINALIZE ( OBJ );    -- Normal exit.
      exception
         when others =>
            T'FINALIZE ( OBJ );
            raise;              -- Abnormal exit.
      end;
   end;

T could be implemented as:

   package SIMPLE is
      type T is private;
      ...
   private
      type DESIGNATED is ...
      -- self-referential, inherently limited, class-wide and/or
      -- unconstrained, task or protected type, whatever ...
      type PTR_T is access DESIGNATED;
      type T is record
         PTR : PTR_T;
      end record;
   end;

   procedure T'INITIALIZE ( OBJ : out T ) is
      -- The only one called with an uninitialized actual.
   begin
      OBJ.PTR := null;
   end;

   procedure T'FINALIZE ( OBJ : in out T ) is
      procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, PTR_T );
   begin
      FREE ( OBJ.PTR );
   end;

   procedure T'ASSIGN ( LHS, RHS : in out T ) is
   begin
      T'FINALIZE ( LHS );
      LHS.PTR := new DESIGNATED'( RHS.PTR.all );
      -- (A real implementation would also need to guard against
      -- LHS and RHS being the same object.)
   end;

SOME ALTERNATIVES :
-------------------

An almost equivalent approach (suggested by Bob Duff as a second, less
uniform choice) would be to state that for these "user controlled" types,
implicit initialization always happens (first), while any explicit
initialization happens later (by calling T'ASSIGN). Bob gave this as an
alternative to T'COPY, but now that (IMHO) the two-phased approach seems
to bring great relief to code generation, it is perhaps more attractive
than the T'INITIALIZE call. In the example above, you could simply
discard T'INITIALIZE, as access values get null by default. In case some
compilers are intelligent about implicit initializations, this
alternative may be cheaper.

Of course, one can feel uneasy about calls to user-defined subprograms
that are not easy to predict. As long as T'INITIALIZE simply sets a null
value, this can't be a concern. People who define T'INITIALIZEs that do a
TEXT_IO.GET_LINE are looking for non-deterministic behavior and will get
what they wanted.

If this uneasy feeling is still too strong, one may dare to consider yet
another approach: a compiler would be required to "zero" the memory
allocated for objects that need finalization. This would give the
programmer a low-level, non-portable ability to make T'FINALIZE and
T'ASSIGN work with less visible initialization.
REPLY TO 92-1711.a :
--------------------

The rest of the comment is a detailed reply to Randy (92-1711.a).

> This is true if finalization is only used to clean up memory resources.

Remember that the issue was about user-defined storage allocators
(92-1697.a). Those that use user-controlled features are willing to take
responsibility for the correct behavior. In the first case the programmer
must guarantee finalization of all heap objects. In the second case the
programmer must guarantee that finalization is not needed for anything
but the memory. The essential factor is that I don't expect anything from
the compiler. I just assume that user-defined pools are meant for
programmers that are willing to take responsibility for correctness in
return for more efficiency. I see some responsibility they could take,
and how they could get efficiency in return, and note how the current
plans refuse them that trade.

> If finalization is almost never used, we A) wasted our time on it in
> the first place, and B) the performance of the technique used is
> irrelevant.

If finalization is almost never used, yet almost always *needed*, Ada 9X
has a problem, as the competition can validly claim they don't have this
problem. I'm convinced that the current Ada 83-C++ balance is to a
significant extent due to the attention paid to OO in both languages. I
think history may well repeat itself if insufficient attention is paid to
assignment/finalization control.

> That is a very narrow view of the use of finalization. I intend to
> use it on just about any resource that needs to be freed:
> 1) On Text_IO.File_Type
> 2) On JWindows.Window
> 3) On SLock.File_Locks

Yes, that's exactly what I mean. Text_IO.File_Type is typically
implemented using a kind of reference (see 92-1594.a and follow-up). As
the number of windows in a decent system is dynamic, I guess you need to
use an access type there as well.
My point is that if you can finalize an access type, you can finalize
anything you want, as there is no longer a penalty in using access types.

> You also forgot to finalize them in the normal case.

Sorry. I was really focusing on the problem of exceptions. I assume this
oversight is not a real problem.

> Once these are corrected, your rearrangement no longer works:

No, no. As above, I'm trying to illustrate how we could allow a compiler
to do without this nesting by asking a higher price from the user (i.e.
that he should be capable of setting a flag without requiring
finalization; frankly, I think he'll manage).

> Probably true. However, the ACVC writers will think otherwise

ACVC programs will have to obey the rules like anyone else, no? If the
language says you have to expect the compiler to generate calls to
T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect that
too.

> T'INITIALIZE is a user-defined procedure. It therefore can do anything,
> and the language must be prepared to deal with that.

How is the language prepared to deal with a T'WRITE calling (Unix) _exit?
Or one that declares an uninitialized local and uses it? Or what about:

   procedure T'WRITE ... is
      SPILL_SOME : constant ACCESS_TYPE := new DESIGNATED_TYPE;
      LOCK_ONE   : TEXT_IO.FILE_TYPE;
   begin
      OPEN ( LOCK_ONE, OUT_FILE, NAME => "" );
      GLOBAL_SEMA.LOCK;
   end;

It is the same case as a T'INITIALIZE that requires finalization: it
doesn't get any and needs some. The language can deal with this in only
one way: declare it erroneous or a bounded error, or simply not mention
it and rely on common sense not to do the above (as it now does for
SPILL_SOME above). That's exactly how I would like to deal with the case
of a T'INITIALIZE that requires finalization too.

There are hard limits to the protection a language can offer. If the
programmer wants to make a program fail, he can, in any event, and he can
make it virtually as "hard" as he likes, using user-controlled stuff (e.g.
UNCHECKED_..., pragma INTERFACE, etc.).

> >By the way, I checked this limit on Alsys 5.5.0 on HP9000-400, and
> >there it is around 550. I expect this to be sufficient.
>
> I don't; if it is, we've failed to define finalization usefully.

If the approach above is acceptable, then this shouldn't be a problem
though...

> If the resulting code is too large to run on the target, the fact that
> the compiler can compile it is irrelevant. That would be a common
> problem on 16-bit targets if PC-mapping is not used.

I assume these comments still expect that the nested exception handlers
are required. In any event, those targets will always have inherent
limitations on what an application can do. Those that want to use them
will have to live with these limitations (and hence not declare too many
exception handlers either). This really is a separate issue.

> >As illustrated above, I wouldn't mind getting any finalizable object
> >or component get INITIALIZED before the enclosing object gets
> >(logically) used.
>
> If this worked correctly, I would not have a problem. However, it
> doesn't (if T'INITIALIZE for a component raises an exception, only
> some components must get finalized. This cannot be done in a thunk [a
> compiler-generated subroutine], because some exception handlers would
> have to be entered, but not left, during the subroutine.

Sorry, I don't see this point. If a compiler wanted to avoid any
unnecessary call to T'INITIALIZE, it could generate elaboration "thunks"
with the nested exception handlers (assuming thunks can contain
handlers). By the way, thanks all of you for sending me all those "thunk"
clarifications. Could you give an example of something that can't be
done?

> Since you don't know how many indexes there will be

As I said above, T'INITIALIZE is called after memory has been reserved.
Surely at that time you know how many indexes there are.
> Particularly because the new 11.6 will probably not define the values
> of variables written within handled code (while Ada 83 does), so using
> temporary variable to hold the index cannot be done

Could you clarify how that would invalidate the example in 92-1630.a:

   procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is
   begin
      UNTIL_EXCEPTION :
      for I in OBJ'RANGE loop
         begin
            COMP_T'ELABORATE ( OBJ(I) );
         exception
            when others =>
               -- Surely I can count on I being valid here ???
               if I /= OBJ'FIRST then
                  for J in OBJ'FIRST .. INDEX_T'PRED(I) loop
                     COMP_T'FINALIZE ( OBJ(J) );
                  end loop;
               end if;
               raise;
         end;
      end loop UNTIL_EXCEPTION;
   end ARRAY_T'ELABORATE;

> I understand where you're coming from, but I don't agree. First, I
> think most examples will use fixed finalizable objects (like
> File_Type, Window, etc.) Even most 'access types' will do that --
> consider that all of your examples have done so.

I'm not talking about fixed size, I'm talking about not allocating a
fixed amount. E.g. windows are typically dynamically allocated. A window
can have a (linked) list of subwindows. Per-object overhead counts for
each subwindow in the list; per-source overhead counts only once.

> Beyond that, your scheme potentially adds a lot of code overhead to a
> system - probably 10 times as much as the MRT scheme. That means you
> don't break even until you have 10 times more finalized objects in the
> heap than you have finalization places. I think that will be rare in
> practice, particularly in embedded systems (where heap use is often
> forbidden).

I agree with this potential problem, but I guess this issue no longer
applies if you can lower the granularity of finalization as proposed
above.

> I understand the worst case here. However, that case will never happen
> in practice. Why? Because the overhead of very small heap allocations
> (and the fragmentation potential) will eat you alive in nearly any real

You argue that you don't *want* it to happen. That's not enough.
I know of lots of applications that use linked lists, such as : -->NEXT----->NEXT----> OBJ--> OBJ--> e.g. this is typical when OBJ is designating a limited type (e.g. File_Type), that can't get copied in the list. > That is especially true since "all those nice compiler features" > really boil down to just composition of equality and assignment. > > Assuming used-defined assignment is available, there will be no other > significant difference (initializations and aggregates would be > available where appropriate, based in some way on the user-defined > assignment). If you make that assumption, and that all "nice features" that go away by limited-ness re-appear by the feature, then of course, all would be well. But then again, what would be the difference between private types and de-limited limited types. It would just require you to add exceptions to the 83LRM restrictions on limited types. I'me just afraid that this finalization model will stand in the way for this assumption to become true. > I think this is a fine idea, but I'm not convinced it works when a type > is sufficiently complicated. For instance, it is necessary to evaluate > the value of a discriminant before you can know what components to > initialize. > You also would have to define what happens then T'Initialize > is aborted. (see above). > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. > However, I vaguely recall some problem with this idea. Tucker? I am keeping my fingers crossed. Stef. ------------- !topic LSN on Finalization in Ada 9X !from 1992-10-30 22:00:55 <> !reference MS-7.4.6();4.6 !reference 92-1165.a !reference 92-1334.a !reference 92-1594.a !reference 92-1628.a !reference 92-1630.a !reference 92-1638.a !reference 92-1675.a !reference 92-1680.a !reference 92-1682.a !reference 92-1697.a !reference 92-1711.a !reference LSN-21 !reference LSN-1046 !keywords controlled types, finalization !discussion <> !discussion .... > This code is wrong. 
I fear most of your reply depends on a misunderstanding I generated. The code you call wrong was based on : > >How about doing T'INITIALIZE early and T'ASSIGN later (as above). > > I tend to agree here. I tried to convey that idea right after the example by "This does of course rely on nobody being shocked by a T'INITIALIZE call that happens early." I should have been much more explicit about re-using T'INITIALIZE for different semantics. Sorry. To be absolutely sure we don't have that misunderstanding again : SUGGESTION FOR PHASED FINALIZATION : ------------------------------------ I would propose to approach the finalization problem in 2-phases. This to give a lot of slack space for compilers that would have difficulty with the fine granularity of the finalization proposed in 92-1630.a. T'INITIALIZE would be a low-level initialization, completely unrelated to any explicit or implicit initialization. The programmer is informed that the compiler can call T'INITIALIZE without calling T'FINALIZE ! Therefor, T'INITIALIZE implementations that require finalization are as "wrong" as forgetting about finalization. The call T'INITIALIZE(OBJ) will guarantee that OBJ will be set to some "null" value, such that it can later be passed to T'ASSIGN and/or T'FINALIZE. A T'COPY operation (supposedly something used to assign to uninitialized objects) is always replaced by a T'INITIALIZE/T'ASSIGN combination. This is of course less efficient, but it gives the compiler freedom to group elaboration operations together, by first calling T'INITIALIZE on all stuff that needs finalization, and subsequently calling T'ASSIGN on all stuff that needs finalization and (normal, high-level) initialization. The first phase happens as if all components that need finalization also had implicit initialization, but rather than calling some "thunk", you call T'INITIALIZE. 
Therefor, you can't possibly be wondering if you already have your discriminants set, whether access discriminants will be set, how many components there are in arrays, whether implicit heap would be used, etc... all this is known at that time (as a direct consequence of the model). The only thing special about this first phase is that it doesn't need any finalization (by definition), but permits to finalize the complete elaborated objects/subcomponents (by definition). Both assumptions are the programmers responsibility. In practice it only means he must be smart enough to set a flag. Of course the size of these groups are limited by dependencies between declarations that need finalization, but that shouldn't be a serious problem, as a large amount of such dependencies is very rare. EXAMPLE : --------- declare OBJ : T := V1; begin ... OBJ := V2; end; becomes : declare OBJ : T; begin T'INITIALIZE ( OBJ ); -- Unrelated to the initialization, should be though of as associating -- with reserving the memory for OBJ. -- It would typically set OBJ to null. -- If this raises an exception nothing is done. begin -- OBJ will get finalized from now on. T'ASSIGN ( OBJ, V1 ); -- This implements the initialization. ... T'ASSIGN ( OBJ, V2 ); -- This implements the assignment. T'FINALIZE ( OBJ ); -- Normal exit exception when others => T'FINALIZE ( OBJ ); raise; -- Abnormal exit. end; end; T could be implemented as : package SIMPLE is type T is private; ... private type DESIGNATED is ... -- self-referential, inherently limited, class-wide and/or -- unconstrained, task or protected type, whatever ... type PTR_T is access DESIGNATED; type T is record PTR : PTR_T; end record; end; procedure T'INITIALIZE ( OBJ : out T ) is -- The only one called with an uninitialized actual. 
begin OBJ.PTR := null; end; procedure T'FINALIZE ( OBJ : in out T ) is procedure FREE is new UNCHECKED_DEALLOCATION ( DESIGNATED, T ); begin FREE ( OBJ.PTR ); end; procedure T'ASSIGN ( LHS, RHS : in out T ) is begin T'FINALIZE ( LHS ); LHS.PTR := new DESIGNATED'( RHS ); end; SOME ALTERNATIVES : ------------------- An almost equivalent approach (suggested by Bob Duff as a second less uniform choice), would be to state that for these "user controlled" types, implicit initialization always happens (first), while any explicit initialization happens later (by calling T'ASSIGN). Bob gave this as an alternative to T'COPY, but now that (IMHO) the two-phased approach seems to bring great relief to code generation, it is perhaps more attractive that the T'INITIALIZE call. In the example above, you could simply discard T'INITIALIZE as access values get null by default. In case some compilers are intelligent about implicit initializations, this alternative may be cheaper. Of course, one can feel un-easy about calls to user-defined subprograms that are not easy to predict. As long as T'INITIALIZE simply sets a null value, this can't be a concern. People who define T'INITIALIZEs that do a TEXT_IO.GET_LINE are looking for non-deterministic behavior and will get what they wanted. If this un-easy feeling is yet too strong, one may dare to consider yet another approach : A compiler would be required to "zero" the memory allocated for objects that need finalization. This would give the programmer a low-level, non-portable ability to make T'FINALIZE and T'ASSIGN work with less visible initialization. REPLY TO 92-1711.a : -------------------- The rest of the comment is a detailed reply to Randy (92-1711.a). > This is true if finalization is only used to clean up memory resources. Remember that the issue was about user-defined storage allocators (92-1697.a). Those that use user-controlled features are willing to take responsibility for the correct behavior. 
In the first case the programmer need guarantee finalization of all heap objects. In the second case the programmer need guarantee finalization is not needed for anything but the memory. The essential factor is that I don't expect anything from the compiler. I just assume that user-defined pools are meant for programmers that are willing to take responsibility for correctness in return for more efficiency. I see some responsibility they could take, and how they could get efficiency in return, and note how the current plans refuse him that trade. > If finalization is almost never used, we A) wasted our time on it in > the first place, and B) the performance of the technique used is > irrelevant. If finalization is almost never used, and almost always *needed*, Ada9X has a problem, as the competition can validly claim they don't have this problem. I'me convinced that the current Ada83-C++ balance is to a significant extent due to the attention paid to OO in both languages. I think history may well repeat itself by paying insufficient attention to assignment/finalization control. > That is a very narrow view of the use of finalization. I intend to > use it on just about any resource that needs to be freed: > 1) On Text_IO.File_Type > 2) On JWindows.Window > 3) On SLock.File_Locks Yes, that's exactly what I mean. Text_IO.File_Type is typically implemented using a kind of reference (see 92-1594.a and follow-up). As the number of windows in a decent system is dynamic, I guess you need to use an access type there as well. My point is that if you can finalize an access type, you can finalize anything you want, as there is no longer a penalty in using access types. > You also forgot to finalize them in the normal case. Sorry. I was really focusing on the problem of exceptions. I assume this oversight is not a real problem. > Once these are corrected, your rearrangement no longer works: No, no. 
As above, I'm trying to illustrate how we could allow a compiler to do without this nesting by asking a higher price from the user (i.e. that he should be capable of setting a flag without requiring finalization; frankly, I think he'll manage).

> Probably true. However, the ACVC writers will think otherwise

ACVC programs will have to obey the rules like anyone else, no? If the language says you have to expect the compiler to generate calls to T'INITIALIZE and T'FINALIZE for locals, then ACVC tests must expect that too.

> T'INITIALIZE is a user-defined procedure. It therefore can do anything,
> and the language must be prepared to deal with that.

How is the language prepared to deal with a T'WRITE calling (unix) _exit? Or what if it defines an uninitialized local and uses it? Or what about:

procedure T'WRITE ... is
   SPILL_SOME : constant ACCESS_TYPE := new DESIGNATED_TYPE;
   LOCK_ONE   : TEXT_IO.FILE_TYPE;
begin
   OPEN ( LOCK_ONE, OUT_FILE, NAME => "" );
   GLOBAL_SEMA.LOCK;
end;

It is the same case as a T'INITIALIZE that requires finalization: it doesn't get any and needs some. The language can deal with this in only one way: declare it erroneous or a bounded error, or simply not mention it and rely on common sense not to do the above (as it now does for SPILL_SOME above). That's exactly how I would like to deal with the case of a T'INITIALIZE that requires finalization too.

There are hard limits to the protection a language can offer. If the programmer wants to make a program fail, he can, in any event, and he can make it virtually as "hard" as he likes, using user-controlled stuff (e.g. UNCHECKED_..., pragma INTERFACE, etc).

> >By the way, I checked this limit on Alsys5.5.0 on HP9000-400, and
> >there it is around 550. I expect this to be sufficient.
>
> I don't; if it is, we've failed to define finalization usefully.

If the approach above is acceptable, then this shouldn't be a problem though...
> If the resulting code is too large to run on the target, the fact that
> the compiler can compile it is irrelevant. That would be a common
> problem on 16-bit targets if PC-mapping is not used.

I assume these comments are still expecting that the nested exception handlers are required. In any event, those targets will always have inherent limitations on what an application can do. Those that want to use them will have to live with these limitations (and hence not declare too many exception handlers either). This really is a separate issue.

> >As illustrated above, I wouldn't mind getting any finalizable object
> >or component get INITIALIZED before the enclosing object gets
> >(logically) used.
>
> If this worked correctly, I would not have a problem. However, it
> doesn't (if T'INITIALIZE for a component raises an exception, only
> some components must get finalized. This cannot be done in a thunk [a
> compiler-generated subroutine], because some exception handlers would
> have to be entered, but not left, during the subroutine.

Sorry, I don't see this point. If a compiler wanted to avoid any unnecessary call to T'INITIALIZE, it could generate elaboration "thunks" with the nested exception handlers (assuming thunks can contain handlers). By the way, thanks all of you for sending me all those "thunk" clarifications. Could you give an example of something that can't be done?

> Since you don't know how many indexes there will be

As I said above, T'INITIALIZE is called after memory has been reserved. Surely at that time you know how many indexes there are.
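The component-by-component elaboration with partial rollback that this exchange keeps circling around can be sketched outside Ada. The following C++ sketch (my own illustration, not the MRT scheme; all names are hypothetical) elaborates n components in order and, when one fails, finalizes exactly the components that were already elaborated, in reverse order -- the same obligation the ARRAY_T'ELABORATE example places on the compiler.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical component type: construction plays the role of
// COMP_T'ELABORATE, destruction the role of COMP_T'FINALIZE.
struct Component {
    static int live;  // counts elaborated-but-not-finalized components
    explicit Component(bool fail) {
        if (fail) throw std::runtime_error("elaboration failed");
        ++live;
    }
    ~Component() { --live; }
};
int Component::live = 0;

// Elaborate n components, the fail_at-th of which throws.  On failure,
// roll back only the components already built (indices < fail_at) and
// report false; on success, finalize everything normally and report true.
bool elaborate_array(int n, int fail_at) {
    std::vector<Component*> built;
    try {
        for (int i = 0; i < n; ++i)
            built.push_back(new Component(i == fail_at));
        for (Component* c : built) delete c;  // normal cleanup for the demo
        return true;
    } catch (...) {
        // Finalize J in FIRST .. I-1, newest first.
        for (auto it = built.rbegin(); it != built.rend(); ++it)
            delete *it;
        return false;
    }
}
```

Whether this bookkeeping lives in inline code, a thunk, or a finalization list is exactly the implementation question under debate; the sketch only pins down the required observable behavior.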
> Particularly because the new 11.6 will probably not define the values
> of variables written within handled code (while Ada 83 does), so using
> temporary variable to hold the index cannot be done

Could you clarify how that would invalidate the example in 92-1630.a:

procedure ARRAY_T'ELABORATE ( OBJ : in out ARRAY_T ) is
begin
   UNTIL_EXCEPTION : for I in OBJ'RANGE loop
      begin
         COMP_T'ELABORATE ( OBJ(I) );
      exception
         when others =>
            -- Surely I can count on I being valid here ???
            if I /= OBJ'FIRST then
               for J in OBJ'FIRST .. INDEX_T'PRED(I) loop
                  COMP_T'FINALIZE ( OBJ(J) );
               end loop;
            end if;
            raise;
      end;
   end loop UNTIL_EXCEPTION;
end ARRAY_T'ELABORATE;

> I understand where you're coming from, but I don't agree. First, I
> think most examples will use fixed finalizable objects (like
> File_Type, Window, etc.) Even most 'access types' will do that --
> consider that all of your examples have done so.

I'm not talking about fixed size, I'm talking about not allocating a fixed amount. E.g. windows are typically dynamically allocated. A window can have a (linked) list of subwindows. Per-object overhead counts for each subwindow in the list; source overhead can count only once.

> Beyond that, your scheme potentially adds a lot of code overhead to a
> system - probably 10 times as much as the MRT scheme. That means you
> don't break even until you have 10 times more finalized objects in the
> heap than you have finalization places. I think that will be rare in
> practice, particularly in embedded systems (where heap use is often
> forbidden).

I agree with this potential problem, but I guess this issue no longer applies if you can lower the granularity of finalization as proposed above.

> I understand the worst case here. However, that case will never happen
> in practice. Why? Because the overhead of very small heap allocations
> (and the fragmentation potential) will eat you alive in nearly any real

You argue that you don't *want* it to happen. That's not enough.
I know of lots of applications that use linked lists, such as:

   -->NEXT----->NEXT---->
      OBJ-->    OBJ-->

e.g. this is typical when OBJ is designating a limited type (e.g. File_Type) that can't get copied into the list.

> That is especially true since "all those nice compiler features"
> really boil down to just composition of equality and assignment.
>
> Assuming user-defined assignment is available, there will be no other
> significant difference (initializations and aggregates would be
> available where appropriate, based in some way on the user-defined
> assignment).

If you make that assumption, and all the "nice features" that go away by limited-ness re-appear by the feature, then of course all would be well. But then again, what would be the difference between private types and de-limited limited types? It would just require you to add exceptions to the 83LRM restrictions on limited types. I'm just afraid that this finalization model will stand in the way of this assumption becoming true.

> I think this is a fine idea, but I'm not convinced it works when a type
> is sufficiently complicated. For instance, it is necessary to evaluate
> the value of a discriminant before you can know what components to
> initialize.
> You also would have to define what happens when T'Initialize
> is aborted. (see above).

> >How about doing T'INITIALIZE early and T'ASSIGN later (as above).
>
> I tend to agree here.
> However, I vaguely recall some problem with this idea. Tucker?

I am keeping my fingers crossed.

Stef.
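The T'INITIALIZE / T'ASSIGN / T'FINALIZE trio that runs through this thread has a rough analogue in C++'s special member functions, which may help readers coming from the "Ada 83-C++ balance" angle raised above. The sketch below (my own illustration; `Handle` and `Designated` are hypothetical names, not part of any proposal) wraps an owned pointer the way the T'ASSIGN/T'FINALIZE examples do, including the alias guard that the 92-1723 correction adds.

```cpp
// C++ analogue of a "user controlled" type wrapping an access value:
//   default constructor  ~  T'INITIALIZE (null out the reference)
//   destructor           ~  T'FINALIZE   (reclaim the designated object)
//   copy assignment      ~  T'ASSIGN     (finalize LHS, then deep-copy)
struct Designated { int value; };

class Handle {
public:
    Handle() : ptr_(nullptr) {}        // T'INITIALIZE: safe null state
    ~Handle() { delete ptr_; }         // T'FINALIZE: reclaim storage
    Handle(const Handle&) = delete;    // copying only via assignment here
    Handle& operator=(const Handle& rhs) {  // T'ASSIGN
        if (this != &rhs) {            // alias guard (cf. 92-1723 fix)
            delete ptr_;
            ptr_ = rhs.ptr_ ? new Designated(*rhs.ptr_) : nullptr;
        }
        return *this;
    }
    void set(int v) { delete ptr_; ptr_ = new Designated{v}; }
    int get() const { return ptr_ ? ptr_->value : -1; }
private:
    Designated* ptr_;
};
```

Note that the null state established by the default constructor is what makes the destructor and assignment unconditionally safe to call -- the C++ counterpart of the "implicit initialization always happens first" alternative.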
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
End of returned mail
-------------
!topic LSN on Finalization in Ada 9X
!from 1992-11-02 07:17:12 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1594.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference 92-1711.a
!reference 92-1723.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

Sorry, the example in 92-1723 has a bug; the example T'ASSIGN should be:

procedure T'ASSIGN ( LHS, RHS : in out T ) is
begin
   if LHS.PTR /= RHS.PTR then
      T'FINALIZE ( LHS );
      LHS.PTR := new DESIGNATED'( RHS.PTR.all );
   end if;
end;

As T is passed by reference, T'ASSIGN must protect against aliases.

P.S. Over the weekend I have been browsing through the "Formal Studies", and in the mail regarding use of exceptions, I found Tucker very convincing with:

> the extreme prejudice for executing the normal case efficiently, rather
> than efficiently handling exceptions.

(quoting from memory, but that was about what it said). Applying this to the issue under discussion, I think it speaks for the two-phase mark/{assign}finalize model, as in the absence of exceptions it is expected to be more efficient than the finalization-list approach.
-------------
!topic LSN on Finalization in Ada 9X
!from Robert I. Eachus 1992-11-02 09:15:42 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1594.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference 92-1711.a
!reference 92-1723.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

Stef recommends a two-phase initialization process, with only objects that have been T'ASSIGNed to being finalized.
Right idea, but I find the terminology very confusing. How about:

T'ELABORATE  -- actions taken to create an object, possibly
             -- including initialization of the discriminants and
             -- descriptors. After elaboration, assignment to an
             -- object is legal.

T'INITIALIZE -- May be user defined, and occurs prior to the
             -- first actual assignment to a variable.

T'ASSIGN     -- Assignment. Assignment to an unelaborated variable
             -- causes PROGRAM_ERROR (if the situation can
             -- occur), but more importantly, assignment to an
             -- uninitialized variable also causes PROGRAM_ERROR
             -- to be raised. (Actually the raising of
             -- PROGRAM_ERROR will most likely occur if the body
             -- of a user defined subprogram requires elaboration
             -- and is not yet elaborated.)

T'FINALIZE   -- Must be called before the scope containing a
             -- variable of type T is left if the object has been
             -- T'INITIALIZED, or a pointer designating an
             -- object of type T is overwritten.

Since for an object designated by an access value it will never be the case that anything occurs between the end of elaboration and the beginning of initialization (other than the check that T'INITIALIZE has been elaborated), the distinction between elaboration time and initialization time only matters for objects in subprograms and packages. In particular, it should be the case that initialization of objects in the package that declares a type can be delayed until after the body has been elaborated, so that it is legal to declare such objects.

Incidentally, I am not taking a position on whether or not users should be able to declare both elaboration and initialization routines (but in light of the above it seems unnecessary), or whether it should be possible for a user to override the elaboration checks on T'INITIALIZE. However, it will ALWAYS be the case that a call to a user-written subprogram may cause PROGRAM_ERROR, even if only due to nested calls.

Robert I.
Eachus

with Standard_Disclaimer; use Standard_Disclaimer;
function Message (Text : in Clever_Ideas) return Better_Ideas is...
-------------
!topic LSN on Finalization in Ada 9X
!from 1992-11-02 22:00:53 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1594.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference 92-1711.a
!reference 92-1723.a
!reference 92-1733.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

> Right idea, but I find the terminology very confusing.

Thank you, Robert, I appreciate your support for the idea. While replying to your comment, I slowly realized that T'INITIALIZE is definitely the wrong model; it's more than the wrong term. If you have little time, please jump to CONCLUSION below.

REPLY TO 92-1733.a:
-------------------

> Stef recommends a two-phase initialization process, with only objects
> that have been T'ASSIGNed to being finalized.

I would have replaced T'ASSIGN in the above by T'INITIALIZE, as in the two-phased process you first have a phase where a "group" of objects is given some "null" value; then there can be a sequence of assignments to (perhaps not all of) them. Whenever the scope of the "group" is left, the entire group gets finalized: those that have been T'ASSIGNed to (the "useful" work), and those that have not been used since T'INITIALIZE (the overhead).

> How about:
>
> T'ELABORATE -- actions taken to create an object, possibly
> -- including initialization of the discriminants and
> -- descriptors. After elaboration, assignment to an
> -- object is legal.

I assume this is going to call T'INITIALIZE of the (sub)components of the object proceeding bottom-up, and ends with calling T'INITIALIZE on the complete object, passing all by reference.
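The bottom-up ordering assumed here -- components prepared first, the whole-object hook last -- is exactly how C++ sequences member subobject construction, which may make a useful point of comparison. A small sketch (my own; the `trace` vector and type names are illustrative only):

```cpp
#include <string>
#include <vector>

// Records the order in which "initialization hooks" fire.
std::vector<std::string> trace;

struct Inner {
    Inner() { trace.push_back("inner"); }  // component-level hook,
};                                         // runs first

struct Outer {
    Inner a, b;                            // prepared bottom-up, in
                                           // declaration order
    Outer() { trace.push_back("outer"); }  // whole-object hook runs last,
};                                         // sees fully prepared members
```

Constructing one `Outer` therefore traces "inner", "inner", "outer": the whole-object operation can always rely on its components already being in a defined state, which is the property Stef is assuming of T'ELABORATE.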
> T'INITIALIZE -- May be user defined, and occurs prior to the
> -- first actual assignment to a variable.

I'm afraid I should never have re-used this term. Initialization is an existing Ada 83 term, which is *not* related to the first-phase actions, but rather to the subsequent T'ASSIGN calls. I think we would be better off with the alternative Bob Duff mentioned, simply stating that explicit initialization of objects or (sub)components of a "user controlled" type does not stop initialization of the lower-level (sub)components from taking place before.

Another advantage of this alternative is that it would (still) be impossible to get an uninitialized access value. A user-defined T'INITIALIZE procedure would otherwise be passed such values, and even if I expect the least skillful programmer to know how to set an access value to null, it would feel better to keep that door closed.

Otherwise the least confusion would come from something like T'NULLIFY, T'PREPARE, or something that clearly indicates some low-level, context-independent operation. In any case, I would prefer to stop using T'INITIALIZE.

> T'ASSIGN -- Assignment. Assignment to an unelaborated variable
> -- causes PROGRAM_ERROR (if the situation can
> -- occur), but more importantly, assignment to an
> -- uninitialized variable also causes PROGRAM_ERROR
> -- to be raised.

I guess you want to define what happens to the "stupid" implementation of calling T'ASSIGN (or :=) directly from the T'INITIALIZE (NULLIFY)? Did you think of other ways of getting into such a situation? In that case, this is yet another reason for taking Bob Duff's alternative and making this kind of (very silly) mistake impossible. Otherwise, calling such stuff erroneous (or a bounded error) is sufficient, as it is just another case of uninitialized variable access.

> -- (Actually the raising of
> -- PROGRAM_ERROR will most likely occur if the body
> -- of a user defined subprogram requires elaboration
> -- and is not yet elaborated.)
The rest of your post proposes that nothing in particular be done for this situation beyond raising PROGRAM_ERROR. That seems appropriate.

> T'FINALIZE -- Must be called before the scope containing a
> -- variable of type T is left if the object has been
> -- T'INITIALIZED

O.K.

> -- or a pointer designating an
> -- object of type T is overwritten.

No, I don't think we want this. That would be asking for a kind of garbage collection (as you definitely don't want this to happen for all but the last reference to the designated object). If some application wants to start dynamically allocating controlled objects, then this application is responsible for calling an UNCHECKED_DEALLOCATION instance when the last reference to an allocated object goes away. This should be done by creating a (higher-level) user-controlled type that encapsulates the access type to the lower-level user-controlled type.

> Since for an object designated by an access value it will never be the
> case that anything occurs between the end of elaboration and the
> beginning of initialization (other than the check that T'INITIALIZE
> has been elaborated) the distinction between elaboration time and
> initialization time only matters for objects in subprograms and
> packages.

And also for subcomponents of user-controlled types that are contained in these dynamically allocated objects.

> In particular, it should be the case that initialization of objects in
> the package that declares a type can be delayed until after the body
> has been elaborated, so that it is legal to declare such objects.

I would not consider permitting something like STATIC_OBJECT below:

package USER_CONTROLLED is
   type T is private;
   ...
   for T'ASSIGN, T'FINALIZE, T'INITIALIZE, ... use ...;
private
   type REFERENCE_T is ...;
   type T is record
      ENCAPSULATED_REFERENCE : REFERENCE_T;
   end record;

   STATIC_OBJECT : T := SOME_VALUE;
   -- Surely this must give you a PROGRAM_ERROR --
   -- T'INITIALIZE and T'ASSIGN cannot possibly be elaborated yet.
   STATIC_UNDEFINED : T;
   -- In case we go for the alternative, this is (again) no problem,
   -- as the initial values would get defined without using
   -- T'INITIALIZE.

   UNCONTROLLED_STATIC : REFERENCE_T := INITIAL_REF;
   -- This is always an alternative.
end USER_CONTROLLED;

> I don't think we would want to break the sequential elaboration order
> of Ada83. If you want a STATIC_OBJECT as above, you can now put it
> after the T'ASSIGN body in the package body, as the ordering
> restriction between basic and later is gone (as far as I know).

Once more, if we take the alternative approach of changing the implicit initialization, then at least we could have an object of a controlled type that doesn't have explicit initialization:

   for T'ASSIGN, T'FINALIZE ... use ...;
private
   type REFERENCE_T is ...;
   type T is record
      ENCAPSULATED_REFERENCE : REFERENCE_T;
   end record;

   STATIC_OBJECT : T;
end USER_CONTROLLED;

Finally, the previous "programming rule" that T'INITIALIZE shouldn't do anything requiring finalization is also much more straightforward with the alternative. If we would implement T above as:

   type T is record
      ENCAPSULATED_PTR : PTR_T := new DESIGNATED;
   end record;

then we obviously risk spilling memory, as we don't have finalization for PTR_T, only for the encapsulating T. The programmer implementing the package body needs to get used to this distinction no matter what. So, in this case we need less "guideline" to the whole approach of user control.

CONCLUSION:
-----------

In view of the above, I would propose we forget about the user-defined T'INITIALIZE, and adapt the rules for implicit initial values:

3.2.1(6) If the object declaration includes an explicit initialization ADD>>> and assignment is representational (express it as you like), the initial value is obtained .... Otherwise any implicit initial values for the object or for its subcomponents are evaluated.
So, this means that when assignment is no longer representational, the default initial values are still evaluated, as it is no longer true that they will be subsequently erased by the assignment of the explicit initial value.

Considering that some (sub)components may be of a user-controlled type, evaluation of the implicit initial values must be done bottom-up, starting from a frontier of non-user-controlled types. The rest of 3.2.1 (10-13) seems to confirm this decision. E.g. as the user-defined assignment is not going to define the values of nested tasks, these must definitely get their value implicitly, as they would have in the absence of user-defined assignment.

The only drawback that Bob Duff mentioned about this alternative was that it was non-uniform. I think in view of all the advantages, we can live with that. Does anyone see other drawbacks? Until they get reported, I propose we think only in terms of T'ASSIGN and T'FINALIZE, knowing that any implicit initialization of T will always take place, and that the programmer should exploit this to implement T'ASSIGN and T'FINALIZE operations that both interpret this implicit initialization as "first time use".

> Robert I. Eachus

Stef.
-------------
!topic LSN on Finalization in Ada 9X
!from R.R. Software (Randy Brukardt) 1992-11-06 22:45:09 <>
!reference MS-7.4.6();4.6
!reference 92-1165.a
!reference 92-1334.a
!reference 92-1628.a
!reference 92-1630.a
!reference 92-1638.a
!reference 92-1675.a
!reference 92-1680.a
!reference 92-1682.a
!reference 92-1697.a
!reference 92-1711.a
!reference 92-1723.a
!reference 92-1733.a
!reference 92-1738.a
!reference LSN-21
!reference LSN-1046
!keywords controlled types, finalization
!discussion

I've let this discussion get away from me a bit. Let me point out some issues that have been touched on, but not properly addressed.

First of all, I think the semantic model we are working on is a good one.
The implementation could be as the MRT proposes (with a special tagged type), or as Stef has proposed (with special attributes). That does not matter as much (to me) as getting a useful user-defined assignment. The side benefit of getting a coarser-grained finalization is a bonus.

I am pretty certain that the need exists for both the 'Elaborate' and 'Initialize' routines. Stef makes the need clear when he says that 'Elaborate' must not do any actions that require finalization. That's necessary for the model to work. However, it is equally necessary that the default initialization be able to do such actions (such as seizing a device or lock). While these actions can be done with an explicit initialization, requiring that sometimes breaks the abstraction. Whether the need for a user-defined 'Initialize' routine outweighs the added complexity I cannot say, but having one should not change the semantics at all.

In this, I'm assuming that the 'Elaborate' routine is user-defined. That seems necessary in order to assure that the created object can be finalized (although that finalization ought to have nothing to do). For instance, if the object includes a reference count, that will need to be initialized appropriately. The important point here is that the actions in this routine must be done so the finalize routine can be called on the object. This requires more than default initialization for some abstractions (setting a pointer to null is not enough).

The key to making Stef's idea work is to ensure (by fiat) that those routines never do anything that must be finalized -- the language rule being that they must not expect their results to be finalized until all objects in the scope have had their Elaborate routines called. (For non-finalized objects, Elaborate does nothing but the basic allocation of memory.)
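Randy's reference-count point -- that 'Elaborate' must leave the object in a state where 'Finalize' is always callable, even if it then has nothing to do -- can be made concrete with a C++ analogy. In this sketch (my own illustration; `Shared` and `Control` are hypothetical names), the default-constructed null state is exactly that safe-to-finalize state, while the assignment and destructor maintain the count:

```cpp
// A minimal reference-counted handle.  The default constructor is the
// 'Elaborate' step: it establishes a state (no control block) in which
// the destructor -- the 'Finalize' step -- is safe but has nothing to do.
struct Control { int refs; int payload; };

class Shared {
public:
    Shared() : ctl_(nullptr) {}                      // Elaborate: safe null
    explicit Shared(int v) : ctl_(new Control{1, v}) {}
    ~Shared() { release(); }                         // Finalize: always OK
    Shared(const Shared& rhs) : ctl_(rhs.ctl_) { if (ctl_) ++ctl_->refs; }
    Shared& operator=(const Shared& rhs) {           // user-defined assign
        if (this != &rhs) {
            release();                               // finalize old value
            ctl_ = rhs.ctl_;
            if (ctl_) ++ctl_->refs;
        }
        return *this;
    }
    int refs() const { return ctl_ ? ctl_->refs : 0; }
private:
    void release() {
        if (ctl_ && --ctl_->refs == 0) delete ctl_;
        ctl_ = nullptr;
    }
    Control* ctl_;
};
```

If 'Elaborate' did not set the count (here: the pointer), 'Finalize' would read garbage -- which is why default initialization alone "is not enough" for abstractions like this one.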
However, there still is a problem with this idea, and it occurs because Ada requires the initializing expression to be evaluated before an object's discriminants are known, meaning that the order we want to use is impossible without causing compatibility problems. We ran into this problem in our Ada 83 compiler, which attempted to use a very similar scheme for allocating memory for dynamically constrained objects. The problem example is:

Type Rec (A : Integer := Func) Is Record
   Str : String (1 .. A);
End Record;

A_Rec : Rec := Complex_Expression_Stuff;

In this case, we have to get the discriminants from Complex_Expression_Stuff. But the possibility of Complex_Expression_Stuff raising an exception (or needing finalization in Ada 9X) exists. In addition, we want to 'elaborate' all objects together, then do the initialization, but the evaluation is out of order.

There is a solution to this problem, which is to 'elaborate' using the default discriminants, and then 'assign' the default expression. That would work, except that Ada 83 explicitly makes it illegal, and there are in fact Ada 83 ACVC tests which test that a compiler does not do precisely this. (Those tests work by ensuring that 'Func' is not called in the example above.) Such a rule change for finalizable (inherently limited?) types would be in keeping with the other initialization rules. Otherwise, we could just disallow discriminants on such types (but attempts to do such things in past DR meetings have been very unsuccessful).

Stef has repeatedly said that he wants types with user-defined assignment to be non-limited. However, this causes various semantic problems, the worst of which is a generic contract model violation of the worst kind.
The easiest example of that occurs because user-defined assignment does not compose. (Assume in this example that limited-with-user-defined-assignment is non-limited for generic matching purposes.)

Generic
   Type Priv Is Private;
Package Pack Is
   Procedure Operation (P : Priv);
End Pack;

Package Body Pack Is
   Type A_Structure Is Record
      Obj : Priv;
      -- Other components
   End Record;
   -- Type A_Structure has assignment only if Priv has predefined
   -- assignment.

   Struct : A_Structure;

   Procedure Operation (P : Priv) Is
      A : A_Structure;
   Begin
      A.Obj := P;  -- OK.
      ...
      Struct := A; -- Arrgh! Only legal if Priv has predefined
                   -- assignment.
   End Operation;
End Pack;

This means that limited-with-assignment cannot match non-limited generic parameters. It would be possible to patch up the model (with appropriate composition and other operations) to make this work, but doing so would guarantee that we have to wait another 10 years or more to get user-defined assignment.

However, it does look very useful to have a limited-with-assignment generic formal, which would match both limited-with-assignment and non-limited types. (The rules would be similar to what happens when a numeric type is used as a generic private parameter vis-a-vis extra operations in the instantiation.) Such a type is all that is really needed in many generic private parameter cases (assignment of the type itself is needed, not of types composed from it), and it is likely that most existing generics would be changed to use it. Having such a parameter would seem to eliminate most of Stef's real concerns about making all finalizable types limited, in that about the only thing lost is composability.
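For contrast with the Pack example, it may help to see what "full composition" of user-defined assignment buys in a language that has it. In C++, the compiler-generated assignment of an enclosing record automatically invokes the user-defined assignment of each member, so whole-record assignment stays legal no matter what the member does. A hedged sketch (names chosen to mirror Randy's example, not taken from it):

```cpp
// Priv has a user-defined assignment (the analogue of T'ASSIGN); the
// 'copies' field counts how often it runs.
struct Priv {
    int value = 0;
    int copies = 0;  // incremented on each user-defined assignment
    Priv& operator=(const Priv& rhs) {
        value = rhs.value;
        ++copies;
        return *this;
    }
};

// A_Structure writes no assignment of its own: the implicitly generated
// operator= composes member-wise, calling Priv::operator= for 'obj'.
struct A_Structure {
    Priv obj;
    int other = 0;
};
```

With this property, the `Struct := A;` line in the Pack example would be legal regardless of how Priv is instantiated -- which is exactly the (expensive) patch-up Randy says the Ada 9X proposal would need.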
In summary:

Adding user-defined assignment to the finalization model allows a coarser granularity of finalization;

Requires allowing evaluation of default expressions (including default discriminants) always (and such a model makes more sense anyway, given that there is no requirement that the assignment copy all of the components);

Requires either full composition of operations, or leaving all finalizable types limited (and that possibility probably needs a new kind of generic formal).

The key to keeping the semantic and user complexity load reasonable is to ensure that a user-defined assignment never assigns into an uninitialized object. It's too bad this sensible rule cannot be retrofitted onto Ada 83, because it would make correct implementation much easier.

Randy.
-------------
!topic PROGRAM_ERROR raised when FINALIZE raises exception
!from Bevin Brett 1992-06-04 10:27:19 <>
!reference MS-7.4.6();4.6
!keywords finalize, initialize, controlled
!discussion

I agree that PROGRAM_ERROR is the right exception to raise. I think it should be raised at the end of the scope being finalized, rather than at the target of the exit/goto. This would (a) be easier to implement, and (b) make it possible for code like...

begin
   declare
      X : EXTENSION_OF_SYSTEM_FINALIZED;
   begin
      ...
      goto LABEL;
      ...
   end;
exception
   when PROGRAM_ERROR => ...
end;
<
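Bevin's premise -- that finalization belongs to the scope being left, even when that scope is left by a goto -- has a direct C++ counterpart: jumping out of a block runs the destructors of that block's locals before control reaches the label. A small sketch of that behavior (my own illustration; the names are hypothetical):

```cpp
// Demonstrates that leaving a block via goto still finalizes the
// block's locals: ~Tracked runs before control reaches 'label'.
int finalized = 0;

struct Tracked {
    ~Tracked() { ++finalized; }  // plays the role of X's finalization
};

int leave_via_goto() {
    {
        Tracked x;   // the controlled object of the inner block
        goto label;  // exits the block; the destructor runs first
    }
label:
    return finalized;  // already 1 by the time we arrive here
}
```

This is also why raising the error at the end of the scope being finalized is the natural choice: that is the point where the finalization actions actually execute.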