!topic LSN on the With-private Proposal in Ada 9X
!key LSN-1036 on the With-private Proposal in Ada 9X
!reference MS-10;4.6
!reference MS-8.4;4.6
!reference RM83-7.4.2(7)
!reference LSN001
!from Bob Duff $Date: 92/10/08 16:46:29 $ $Revision: 1.3 $
!discussion

This Language Study Note discusses the with-private proposal, and related proposals, whose purpose is to allow more programmer control over access to private information.

At the Frankfurt WG9 meeting, the consensus was that child library units should be included in Ada 9X, with their hierarchical namespace, as proposed by the MRT. ISO also requested that the MRT study the "with-private" proposal, which would require a library unit to state that it wants access to another library unit's private part. Based on this analysis, the MRT proposed, in MS-8.4;4.6, a similar feature called "use-private". However, we have since discovered some serious problems with our use-private proposal.

This LSN discusses several alternate proposals:

    Proposal 1: Structural Control over Visibility. (This is the pre-version-4.6 proposal.)

    Proposal 2: With-private can refer to any library unit.

    Proposal 3: With-private can only refer to ancestor library units.

    Proposal 4: With-private can only refer to immediate parent library unit, and can only appear on the declaration of a library unit.

    Proposal 5: Four kinds of library units.

Before we discuss the details of these proposals, we need some background on the child unit proposal in general. The following discussion uses our current terminology, from the ILS, but most of it should be familiar to those who have read MS;4.6.

THE CHILD LIBRARY UNIT PROPOSAL:

Each library unit has a parent package. The parent of a root library unit is STANDARD. Thus, normal Ada 83 library units are children of package STANDARD. A library unit occurs immediately within its parent's declarative region, and it occurs AFTER its parent's declaration.
That's important -- keep it in mind as we discuss the alternatives.

(Note: Ada 83 implied that library units were nested within the declaration of STANDARD (or maybe the body -- it wasn't clear). In Ada 9X, they are in the declarative region of STANDARD, but appear AFTER the declaration of STANDARD. This is a subtle point, not related to the current issue. I mention it only to allow people to picture where the pieces of a program appear.)

The other rules of the language follow from this nesting concept.

Note that it doesn't make sense to talk about "a child library unit" as opposed to some other kind of library unit. All library units are child library units -- they all are children of some parent. The only thing that doesn't have a parent is STANDARD itself.

The tree of library units rooted at a given library unit is often called a "subsystem".

There are two kinds of library units: visible and private. A visible library unit is visible outside of its parent, whereas a private library unit is not. (Remember what "outside its parent" means, in relation to the above-mentioned nesting rule.) If X is "visible outside its parent", then library units outside the subsystem can say "with X;".

All of the above has been agreed upon. What is at issue here is whether there should be some feature that allows additional control over visibility, and what that feature should look like.

THE EXTRA OPERATIONS ISSUE:

This issue comes up in the discussion of several of the proposals, so we discuss it in general here.

Consider the following Ada 83 example:

    package P is
        type T is limited private;
        package Q is
            type A is array(Natural range <>) of T;
        end Q;
    private
        type T is new Character;
    end P;

In Ada 83, there are two views of type T. We call them the partial view and the full view. The partial view is limited private. The full view is scalar, and thus has additional operations, such as ":=", "=", and "<". Type A also has ":=", "=", and "<" operations.
RM83-7.4.2(7) says that these extra operations are implicitly declared "at the earliest place within the immediate scope of the composite type and after the full type declaration." In this case, that earliest place is in the body of package Q.

The RM83-7.4.2(7) rule needs to be changed in Ada 9X, however. It needs to handle child units. The difference between the above example and a child unit is that a nested package appears BEFORE the private part of its containing package, whereas a child unit appears AFTER the private part of the parent package. Thus, "after the full type declaration" is not appropriate for Ada 9X. The exact wording of the new rule depends upon which of the proposals is chosen.

THE WHO-CAN-WITH-WHO ISSUE:

It is essential that any proposal prevent the "re-export" of private information. For example, the following should NOT be allowed:

    package P is
    end P;

    private package P.Q is
        type T is range 1..10;
    end P.Q;

    with P.Q; -- Illegal?
    package P.R is
        subtype S is P.Q.T; -- Better be illegal!
    end P.R;

If the above were legal, it would be impossible to tell what parts of the code depend on the implementation details of the subsystem, because anybody could "with P.R" and start using subtype S.

The rules needed to prevent such things depend on the proposal.

Now, we turn to the details of the individual proposals.

PROPOSAL 1: STRUCTURAL CONTROL OVER VISIBILITY.

In this proposal, there is no additional feature for controlling visibility. Instead, the user controls visibility by controlling the structure of the program, as explained below.

Library units are often defined in terms of their parent's private information (i.e. information defined in the parent's private part). Thus, the visibility rules allow a library unit to see its parent's private information. There is one exception: we cannot allow a visible child to re-export its parent's private information to clients of the subsystem.
Therefore, the visible part of a visible library unit can NOT see its parent's private information. This issue does not arise for private library units, since they are not exported from their subsystem; hence, the ENTIRE private library unit can see its parent's private information.

Here's an example:

    package P is
        type T is private;
    private
        type T is new Integer;
    end P;

    package P.Visible_Child is
        -- Here, we can NOT see the full declaration of T. <---------
    private
        -- Here, we can see the full declaration of T.
    end P.Visible_Child;

    package body P.Visible_Child is
        -- Here, we can see the full declaration of T.
    end P.Visible_Child;

    private package P.Private_Child is
        -- Here, we can see the full declaration of T.
    private
        -- Here, we can see the full declaration of T.
    end P.Private_Child;

    private package body P.Private_Child is
        -- Here, we can see the full declaration of T.
    end P.Private_Child;

Now, how do we control visibility of private information? In Ada 83, the programmer controls visibility by controlling the nesting of program units and by using with_clauses. A with_clause provides visibility of the name of a library unit. Beyond that, all visibility rules follow from the implied nesting within STANDARD. In particular, once you're inside a program unit, any nested program units inherit their visibility from the outside. No feature is provided to PREVENT the visibility from being inherited.

This proposal behaves the same way: visibility rules follow from the implied nesting of children within the declarative region of STANDARD or other parent packages.

Suppose, in Ada 83, you have a package that exports a private Dynamic_String type, and you wish to add a new function that manipulates a Dynamic_String. You have a choice: If the function needs access to private information, put it in the package. If not, do not put it in that package.

The choice is the same here. However, child packages can make the structure of the subsystem clearer:

    package Some_Subsystem is
        ...
    end Some_Subsystem;

    package Some_Subsystem.Dynamic_Strings is
        type Dynamic_String is private;
        ... -- operations that need access to private information
    private
        type Dynamic_String is ... -- some implementation details
    end Some_Subsystem.Dynamic_Strings;

    with Some_Subsystem.Dynamic_Strings;
    package Some_Subsystem.Dynamic_String_Utilities is
        ... -- operations that do not need access to private information
    end Some_Subsystem.Dynamic_String_Utilities;

Of course, if Dynamic_String_Utilities doesn't exist, you might want to create it. The whole point of the child library unit proposal is that you CAN create it without recompiling the world. If the above structure already exists, you may or may not wish to reduce recompilation costs by adding further children to the above instead of physically nesting the new function. But the reduction of recompilation costs is not the issue we're discussing in this LSN.

Note that no part of Some_Subsystem.Dynamic_String_Utilities can see the implementation details of Dynamic_String. Thus, there is a way in Ada 9X to achieve the necessary control over visibility.

Note that if Dynamic_String_Utilities is needed only within the subsystem, it can be made private.

Note also that Some_Subsystem has no private part; thus the question of Dynamic_String_Utilities seeing that private part does not come up. Again, this was the programmer's choice.

Note that in Ada 83, if the new function is needed only within the Dynamic_Strings package, but the new function does not need access to the visible part (or body) of Dynamic_Strings, then the programmer is faced with a dilemma -- if the new function is placed inside Dynamic_Strings, it has too much visibility on other things, but if it is placed outside Dynamic_Strings, then other things have too much visibility on the function. The only way to achieve the desired visibility in Ada 83 is to restructure everything with extra nested packages.
The situation in Ada 9X is the same, except that child packages can be used instead of physically nested packages.

One potential counter-argument to the above sort of example goes as follows: What if I'm handed a Dynamic_Strings package as a root library unit, written by somebody else, and I can't (or don't want to) change it? (Perhaps I don't even have access to the source.) But I want to add something to its subsystem. In this case, the argument goes, it is not feasible to restructure the package as above, with a parent, and one or more siblings like Dynamic_String_Utilities.

This argument is refuted as follows: Suppose the new function I wish to add needs access to the implementation details. Then, make it a (visible) child of Dynamic_Strings. On the other hand, suppose it does not need such access. Then, do not make it a child. Again, the programmer has complete control.

But what if the child is only needed WITHIN Dynamic_Strings? Well, in that case, I am obviously CHANGING some portion of Dynamic_Strings, so I can't complain that restructuring is impossible. I can choose to restructure or not, depending on how important the particular visibility issues are in the particular case.

Of course, it is always possible to create a poor design. (We have carefully preserved that property in Ada 9X :-)

Another mechanism for controlling visibility is the nested package. Suppose I wish to ensure that nobody can ever see a certain private part. I do it like this:

    package Container is
        package P is
            type T is private;
        private
            type T is new Integer;
        end P;
    end Container;

No matter which proposal we choose, there is no way somebody can add a new library unit that will have access to the full declaration of type T. Only the body of Container.P has that visibility. (Note that T is still at library level.)

What about the who-can-with-who issue?
The escape of private information is prevented by this rule: "If the library unit mentioned in a with_clause is private, then its parent must be an ancestor of the compilation_unit being defined. Furthermore, the declaration of a visible child library unit must not have a with_clause that mentions one of its private siblings." (A private child declaration can depend on one of its sibling declarations, even if that sibling is private. The body of a child (visible or private) can also depend on a private sibling declaration.)

What is the effect of this proposal on the extra operations rule? For this proposal, the rule is, "...at the earliest place that is within the immediate scope of the composite type and where the full type declaration is visible (not necessarily directly visible)."

Consider the following Ada 9X example:

    package P is
        type T is limited private;
    private
        type T is new Character;
    end P;

    package P.Q is
        type A is array(Natural range <>) of T;
    private
        -- The extra operations (e.g. ":=", "=", and "<")
        -- are declared here, because this is the earliest place
        -- where we are within the scope of type A, and
        -- the full_type_declaration for T is visible.
    end P.Q;

    private package P.R is
        type A is array(Natural range <>) of T;
        -- The extra operations (e.g. ":=", "=", and "<")
        -- are declared here, because this is the earliest place
        -- where we are within the scope of type A, and
        -- the full_type_declaration for T is visible.
    private
    end P.R;

For the rule to work correctly, we presume an empty implicit private part in every package that doesn't have an explicit private part. Otherwise, the extra operations would be in the body of the child sometimes. Adding a private part would make them suddenly move up to the package declaration -- a confusing behavior indeed. With the implicit private part rule, the extra operations are always in the package declaration.

Note that this proposed rule has the same effect as the RM83 rule on all Ada 83 programs.
In addition, it has the expected effect on new Ada 9X programs: the extra operations of A in the above children are never exported to a place where only the partial view of T is visible.

PROPOSAL 2: WITH-PRIVATE CAN REFER TO ANY LIBRARY UNIT.

In this proposal, the with_private_clause feature is added. The user may write "with private X;" in the context_clause of Y. This means that Y has visibility on the private part of X. There are no restrictions on what X can be, other than the normal restrictions on with_clauses.

This proposal allows any library unit to reach into another library unit and look at implementation details.

What about the who-can-with-who issue? Perhaps a rule such as "If X withs-private Y, then anything that with's X must also with-private Y" would help. I'm not exactly sure what the rule should be. (See LSN001 for more on this point.)

The extra operations rule becomes more complicated in this proposal -- but the issues are similar to those raised by Proposal 3, and are discussed there.

PROPOSAL 3: WITH-PRIVATE CAN ONLY REFER TO ANCESTOR LIBRARY UNITS.

In this proposal, a library unit is only allowed to with-private an ancestor. If the user says with_private, the visibility is similar to that of Proposal 1. Otherwise, visibility is more restricted.

This proposal is pretty much the same as the use-private proposal of MS;4.6, except that with_clauses appear at the top of the library unit, whereas use_clauses can appear deeply nested. Since nobody seems to like the deeply-nested functionality, and since the use-private proposal is at least as complicated as this Proposal 3, we do not discuss the use-private proposal further.

This proposal has a major software-engineering benefit over Proposal 2: if a piece of code needs access to a private part, then it is properly programmed as a part of the subsystem owning that private part. With Proposal 2, the programmer is tempted to just reach into another subsystem and access private information.
With this proposal, the programmer must add a new operation to the other subsystem.

What about the who-can-with-who issue? The rule of Proposal 1 is still needed. I believe several additional rules are needed, but I'm not sure what they should be.

What is the effect on the extra operations rule? We are not sure. Consider the following example:

    package P is
        type T is limited private;
    private
        type T is new Character;
    end P;

    package P.Q is
        type A is tagged
            record
                X: T; -- limited component makes A limited
            end record;
    private
    end P.Q;

    with private P;
    package body P.Q is
        -- Are the extra operations of A declared here?
    end P.Q;

The type A becomes non-limited at some point, since at some point we find out that T is non-limited. But where? It cannot happen in the declaration of P.Q, because P.Q is not supposed to know that T is non-limited. But if it happens in the body of P.Q, then extra dispatching operations ("=" and ":=") are added. But this violates a fundamental principle of the OOP design -- that the type descriptor, which contains pointers to the dispatching operations, can be laid out at compile time (of the unit containing the tagged type declaration).

Now, suppose we add another child of P:

    with private P;
    with P.Q;
    package P.R is
        -- Are the extra operations of A declared here?
    end P.R;

This package knows that T is non-limited. Therefore, it ought to know that type A is non-limited -- that would be the behavior analogous to Ada 83. Does this mean that extra operations can get declared in totally unrelated packages? Does it mean that there can be many declarations of the "same" extra operations? In the case (as here) of tagged types, what happens to the type descriptor?

Consider also the following example:

    package P is
        type T is private;
    private
        type T is new Character;
    end P;

    private package P.Q is
        type A is array(Natural range <>) of T;
    end P.Q;

    with private P.Q;
    private package P.Q.R is
        -- What's visible here?
    end P.Q.R;

There is a "<" operator for type A declared (somewhere). It would be very strange if P.Q.R can see the "<", but cannot see the full declaration of T, which is what caused the "<" to appear.

Similar issues arise if there is a with_private_clause on the body, but not the declaration, of a library unit. Or if there is a with_private_clause on a subunit, but not its parent unit.

In any case, whatever the extra operations rule is, for upward compatibility, it must be written so as to have identical semantics to RM83-7.4.2(7) in the case of existing programs.

PROPOSAL 4: WITH-PRIVATE CAN ONLY REFER TO IMMEDIATE PARENT LIBRARY UNIT, AND CAN ONLY APPEAR ON THE DECLARATION OF A LIBRARY UNIT.

This proposal is an attempt to solve the extra-operations anomalies of Proposals 2 and 3.

The rules are something like this:

    - The name in a with_private_clause must denote the (immediate) parent of the current library unit.

    - A with_private_clause is transitive: it gives visibility on the private parts of all ancestors (not just the (immediate) parent). Absence of a with_private_clause turns off such visibility. (Or alternatively, there's no such implication, but you have to mention ALL of your ancestors in the with_private_clause, which amounts to the same thing.)

    - A with_private_clause is not allowed to appear on the body of a library unit. That is, it must only appear on the declaration of a library unit. (Except that subprogram bodies acting as declarations complicate this rule.)

    - A with_private_clause is not allowed to appear on a subunit.

The extra-operations rule can be the same as in Proposal 1.

What about the who-can-with-who issue? We need the rule of Proposal 1, plus some additional ones: If a given library unit has a with_private_clause, and a sibling library unit withs the given library unit, then the sibling library unit must also have a with_private_clause. If a library unit has a with_private_clause, then its children must have a with_private_clause.
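
As a sketch of how the two additional rules just stated might apply, consider the following units. This is illustrative only: it uses the proposed with_private_clause syntax, which is not Ada 83, and the unit names (P, P.A, P.B, P.C, P.A.D, P.A.E) are hypothetical. The "legal"/"illegal" comments reflect one reading of the rules as worded above, not a settled definition.

    package P is
        type T is private;
    private
        type T is new Integer;
    end P;

    with private P;
    package P.A is
        -- Has a with_private_clause naming its immediate parent.
    end P.A;

    with P.A; -- Presumably illegal: P.A has a with_private_clause,
              -- but this sibling, which withs P.A, does not.
    package P.B is
    end P.B;

    with private P;
    with P.A; -- Presumably legal: this sibling also has a
              -- with_private_clause.
    package P.C is
    end P.C;

    package P.A.D is -- Presumably illegal: the parent P.A has a
                     -- with_private_clause, so its children must too.
    end P.A.D;

    with private P.A; -- Names the immediate parent, as required.
    package P.A.E is
    end P.A.E;
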
I'm not sure I've covered all the necessary rules here. The wording of such rules needs to handle the case of a subprogram_body that acts as a declaration.

PROPOSAL 5: FOUR KINDS OF LIBRARY UNITS.

In Proposal 4, what looks like a general feature (a particular kind of with_clause) is actually only allowed in one particular place, and is only allowed to refer to one particular thing. That is, the with_private_clause only imparts a single bit of information about a given library unit -- can it, or can it not, see its ancestors' private parts? This would enhance Ada's reputation as a language with all kinds of silly restrictions.

This proposal solves that problem by giving the single bit of information as a single piece of syntax.

Instead of two kinds of library units (visible and private), Proposal 4 really has four kinds of library units:

    - A child that is visible outside its parent, and whose implementation can see its parent's private information.

    - A child that is visible outside its parent, and cannot see its parent's private information.

    - A child that is not visible outside its parent, and whose visible part and implementation can see its parent's private information.

    - A child that is not visible outside its parent, but whose visible part and implementation cannot see its parent's private information.

Since we do not like to add new reserved words without good reason, we consider calling the above four kinds of children:

    - visible
    - limited visible
    - private
    - limited private

respectively.

A possible syntax would be to allow the reserved word 'limited' at the beginning of a library unit declaration. For example, to declare a limited private child, then instead of this:

    with Text_IO;
    with private A.B;
    private package A.B.C is
        ...

one would write this:

    with Text_IO;
    limited private package A.B.C is
        ...

Thus, we have limited the with-private functionality to the bare minimum necessary.
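
For illustration, the four kinds of children listed above might be declared as follows. This is only a sketch in the syntax suggested here, with 'limited' placed at the beginning of the library unit declaration; the parent P and the child names are hypothetical, and the comments simply restate the four descriptions given in the list.

    package P.V is
        -- visible: visible outside P; its implementation
        -- can see P's private information.
    end P.V;

    limited package P.LV is
        -- limited visible: visible outside P; cannot see
        -- P's private information.
    end P.LV;

    private package P.Pr is
        -- private: not visible outside P; its visible part and
        -- implementation can see P's private information.
    end P.Pr;

    limited private package P.LPr is
        -- limited private: not visible outside P; its visible part
        -- and implementation cannot see P's private information.
    end P.LPr;
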

Note that this proposal is isomorphic to Proposal 4; the semantic issues are the same. However, some of the rules become easier to state: For example, it should be illegal for a limited private library unit to 'with' one of its (non-limited) private siblings. The wording of such rules needs to handle the case of a subprogram_body that acts as a declaration.

Presumably, we need to make it illegal for a limited private library unit to have a non-limited child.

COMPARISON OF PROPOSALS:

Proposal 1 is clearly the simplest -- both semantically, and for implementers. In Proposal 1, one controls visibility in the Ada 83 way -- by structuring one's program to reflect the desired visibility.

Proposals 2, 3, 4, and 5 add another feature for controlling visibility. Clearly, Proposal 5 is the simplest of these.

Therefore, the issue boils down to this question: Is the added complexity of Proposal 5 worth the added control, or is it sufficient to be able to control visibility by the structure of the program? Are the visibility rules of Ada complicated enough, or should we add some more?

Note that none of these proposals provides security. No matter what the language rules are, there is always a way for the programmer to violate abstractions. There is nothing IN THE LANGUAGE to prevent the programmer from changing and recompiling any piece of source text whatsoever. The only thing the language rules can hope to achieve is to make it clear from the source text what parts of the program have visibility on what other parts of the program.

Security remains the job of environment tools. In particular, environment tools can restrict who is allowed to recompile what. In Proposal 1, an environment tool might restrict who is allowed to add children to what. In Proposal 5, an environment tool might restrict who is allowed to add non-limited children to what. In Proposals 2, 3, and 4, an environment tool might restrict who is allowed to refer to what in a with_private_clause.
But in all of these proposals, if security is desired, some environment tool must restrict who is allowed to do what.

Although Proposals 2, 3, and 4 seem like a huge bag of worms, Proposal 5 is probably simple enough that, given enough time, the MRT can do the necessary analysis, and define the necessary set of rules accurately. Is it worth it?