======= LSN001.HierLib =======

LSN on Hierarchical Libraries (HL)

The purpose of this LSN is to explore ways of reducing recompilation by eliminating monolithic packages. The MRT has proposed a solution involving hierarchical program libraries, which has generally met with approval. There are, however, other approaches which can be explored. We will also take this opportunity to develop examples that can be used to illustrate the feature. This LSN will also discuss the value of a hierarchical name space independent of the recompilation issue.

The ultimate problem that we are trying to solve is that of reducing the need for recompilation. One could argue that this is an environment and not a language issue, or that a fast compiler is the best solution to the problem. Obviously an environment which can perform minimal recompilation based on a fine analysis of the source is ideal, and fast compilers (if they generate acceptable code) are better than slow compilers. The Ada 9X process can't really do much in the way of providing either of these solutions to users. What it can do is allow programs to be designed and built in a way that avoids frequent massive recompilation.

Ada already has most of the necessary features. Separate compilation of specification and implementation is the critical mechanism for reducing compilation. Subunits are a second-order mechanism which allows the implementation of an abstraction to be broken up into separately compilable units.

There are two features missing in Ada '83, which would allow even more flexibility in implementing large abstractions.

1) The ability to break up a specification into separately compiled units. One barrier to accomplishing this is the visibility of private declarations. A mechanism is required to make private declarations visible to multiple packages. A second barrier is the inability to add derivable operations to a type outside of the immediate scope of the type declaration.
This is mostly an Ada 9X issue (caused by the increased emphasis on derived types). The LSN titled "Types, Classes and Operations" discusses this aspect of the problem.

2) The ability for the bodies of multiple (presumably related) packages to share declarations, without making these declarations universally available.

First we will discuss possible solutions to providing wider visibility to private declarations. We can classify the possible solutions into three categories.

1) Allow packages to request visibility on another package's private declarations via some kind of super "with" (for example "with all X;" or "with private X;").

2) Allow a package to nest within another package without an explicit stub in the parent.

3) Allow a package to nest within several other packages, without an explicit stub in the parent.

Solution 1 is troubling. It appears that there is no point in providing private types if their characteristic feature can be trivially ignored. There is an argument, however, that the protection afforded by private types has no place in the language, and should be enforced by configuration management (CM) policies. A special "with" clause is the most flexible solution given that attitude. On the other hand one can argue that unplanned extensibility has no place in a language designed with readability as a fundamental concern, and therefore the correct "fix" to solution 1 is that a package must name all the other packages which may view its private declarations. This is the approach C++ has taken with "friend" classes, but this approach inverts Ada's usual client/server philosophy. A type doesn't know its users, a task doesn't know its callers, a library unit doesn't know who depends on it, and generics don't know their instantiators. So why should a package know its "extenders"?

[A Side Note: With almost any language issue, one can claim that a sufficiently powerful programming environment can solve deficiencies in the language.
As we pointed out earlier, the Ada 9X process cannot solve a problem by assuming a sufficiently powerful environment. This is not to say that every problem must be solved by the language. There are certainly situations where intervention by the language is a bad thing, and the solution really is best left outside the language. Private types are clearly not a bad thing (although some have argued otherwise), and the language should continue to support them. Solution 1, by virtue of the fact that it provides some hooks for an environment to work with, and allows for a simple formulation of a policy, recognizes the validity of the private type concept, but does not enforce it in the language.]

Solution 2 is not perfect either. For one, it is constraining; it forces users to implement systems hierarchically, where a hierarchy may not be necessary. Secondly, it cannot be used in a natural way to create a package which extends two existing independent packages simultaneously. On the other hand, it is a natural extension to Ada. The concept of nesting is already present in the language, and the connection between visibility and nesting is already clear in the language. Subunits already exist, and the hierarchical library solution is merely a variation of the subunit mechanism. Like the "with private" solution, the hierarchical solution also violates the privacy of private types. This could be changed a la the "friend" approach, by requiring some type of minimal stub in a package identifying the possible children, or by environment-enforced CM policies. In the hierarchical approach, package names immediately identify the packages with "extra" visibility, and there is a strong case that maintainability of a system is retained (without a fancy environment) by this proposal, even if readability is somewhat damaged.

Solution 3 combines the best features of both, but is also more complex than either. No one has proposed this so we will outline the solution.
Imagine that a new symbol, say '&', was allowed in a library unit name. Then the name of a package which extends both P1 and P2 could be P1&P2.Extensions. Obviously there are a lot of details that would have to be worked out, so that P1&P2.Foo.Bar isn't ambiguous, and ways of differentiating the two possibilities are provided. The characteristics of such a proposal are that extensions are possible without the imposition of hierarchy. Nesting remains the principal controller of visibility in the language, and unit names document which units have "extra" visibility. Given the additional complexity of this solution it is not really a serious contender, but we will include it in the discussion to document that it has been considered.

There is also, as always, the possibility of doing nothing.

COMPARISON OF ALTERNATIVES
--------------------------

First I will compare alternatives, and then I will present a few of the motivating examples, and show how these can be solved with each proposal.

The usual criteria apply. We need to evaluate the proposals on the basis of how they solve problems, how they cause problems, their impact on implementations, and their impact on the complexity of the language.

We label the solutions as follows:

Voyeur: A new kind of with clause that provides a view of private declarations. (Thanks to Dave Emery for the name.) This is Robert Dewar's "with private" solution.

Friend packages: Voyeurism allowed only between "consenting" packages. This corresponds to the C++ mechanism.

Hierarchical Libraries (HL): The mapping team solution minus the private package concept (outlined as solution 2 above).

Multiple Nesting (MN): This is similar to HL, except that packages may be nested in more than one package (outlined as solution 3 above).

Problems Solved:

Voyeurism and multiple nesting are able to solve more problems than hierarchical libraries.
Robert Dewar has provided an example of two packages which implement similar abstractions, with the need to provide a new package that implements conversions between the two types. Only Voyeurism and MN can solve the problem without requiring recompilation. If recompilation is allowed, then both HL and friend packages can solve the problem. The friend approach cannot solve any extension problem without recompilation, whereas the HL approach can be used to extend a single package.

SCORE: Voyeurism: 1; MN: 1; Friends: -1; HL: 0.

-----------------------------------------------------------------------

Problems Caused:

Voyeurism damages the language's intrinsic maintainability (i.e. the ability to maintain a program without a fancy environment), since it is impossible to easily detect which packages have "extra" visibility. We believe that this sort of extra visibility is an important characteristic of a system, and cannot be ignored by a maintainer. The other approaches do not damage maintainability to the same extent, since fancy environment support is not required to detect the "extra" visibility. (We don't consider the ability to list the packages in a library by name as fancy environment support.)

SCORE: Voyeurism: -1; MN: 0; Friends: 0; HL: 0.

Hierarchical Libraries may require a program to be structured in ways that cause unwanted "extra" visibility, or extra levels of indirection (see Matrix/Vector examples below). This 'false' structuring of packages can obscure the intent of the program, making it less readable, and less maintainable.

SCORE: Voyeurism: 0; MN: 0; Friends: 0; HL: -1.

-----------------------------------------------------------------------

Implementability:

Voyeurism and HL seem about the same to implement.
Friends requires an extra check, which is insignificant, but in order for it to be of equal value to the other proposals it also requires that compilers be smart about recompiling when the only change is the addition of a new friend, and this would have a major impact on many implementations. HL requires library support for a hierarchical name space, but with the relaxation of the subunit naming restriction, this should not be a significant difficulty. It's possible that by keeping nesting as the sole aspect of the language which controls private visibility, there will be fewer difficulties retrofitting HL into existing compilers than there would be for voyeurism, but then again maybe not. It certainly can't be any more difficult. MN is clearly the most difficult to implement, although it seems like a fairly small increment on the implementation of voyeurism, since all it requires is a more complex package name, instead of a new type of "with" clause.

SCORE: Voyeurism: 0; MN: -2; Friends: -1; HL: 0.

-----------------------------------------------------------------------

Simplicity:

Certainly on the surface the voyeurism and friend concepts appear simplest. On the other hand, much less scrutiny has been applied to these features than to HL, so it's a little early to make this claim. For instance, what is the meaning of "use X" after a "with private X"? Are the private declarations of X directly visible, or must you do a "use private X" to achieve that effect, or is it impossible to achieve that effect? This is not a big deal, but it does need to be specified, and it does complicate the proposal somewhat. Another much more confusing issue is the meaning of the following situation:

    with private X;
    package Y is
       ... ;
    end Y;

    with private Y;
    with X;
    package Q is
       ... ;

If Q can see Y's private declarations, but not X's private declarations, and Y's private declarations refer to X's private declarations, then what exactly can Q see? Neither HL nor MN has these difficulties.
Finally, there is the issue of types whose definition is deferred until the package body. Such types are very useful in Ada 83, but they interfere with the ability to split up packages where they are declared. There is no possible way to accommodate such types with Voyeurism, whereas the HL proposal accommodates a straightforward generalization of the Ada 83 feature.

There have been a number of comments to the effect that HL is complex. While it is true that solving certain problems with HL is more difficult than with the other proposals, this is not caused by the complexity of the proposal, but rather by the fact that hierarchy is not always the most natural way to compose two abstractions. The mapping document requires very few sentences (maybe five) to completely specify the HL proposal (excluding private packages). If there is a perceived complexity it is most likely a presentation problem, since linguistically HL seems the simplest of all the solutions presented.

SCORE: Voyeurism: -1; MN: -1; Friends: -1; HL: 0.

-----------------------------------------------------------------------

TOTAL SCORE: Voyeurism: -1; MN: -2; Friends: -3; HL: -1

If we take all these factors at equal value, then HL and voyeurism are about equal. All factors are not equal, however, and we think the damage to "privacy" caused by voyeurism is a much more significant problem than the lack of complete generality suffered by HL. Furthermore, as we shall see below, HL has other properties which are beneficial aside from the private visibility issue.

EXAMPLES
========

We will start with the MATRIX/VECTOR problem. We would like to program a VECTOR abstraction, a MATRIX abstraction, and a third abstraction which provides operations that use both MATRIX and VECTOR types. We'll make the types private (otherwise the solution can be accomplished easily in Ada '83).
------------------------------------------------

Solution 1: (Voyeur/Friend packages)

    package VECTOR is
       type VECTOR is private;
       ... -- operations go here
    private (MATRIX_VECTOR)  -- omit this for voyeurs
       type VECTOR is ... ;
    end VECTOR;

    package MATRIX is
       type MATRIX is private;
       ... -- operations go here
    private (MATRIX_VECTOR)  -- omit this for voyeurs
       type MATRIX is ... ;
    end MATRIX;

    with private MATRIX;  -- private and body see MATRIX private.
    with private VECTOR;  -- private and body see VECTOR private.
    package MATRIX_VECTOR is
       -- operations go here
    end MATRIX_VECTOR;

-------------------------------------------------

Solution 2: (HL/Combination is planned for/Extra visibility is OK)

    package VECTOR is
       type VECTOR is private;
       -- operations
    private
       type VECTOR is ... ;
    end VECTOR;

    package VECTOR.MATRIX is
       -- private and body see VECTOR private
       type MATRIX is private;
       -- operations
    private
       type MATRIX is ...;
    end VECTOR.MATRIX;

    package VECTOR.MATRIX.COMBO is
       -- private and body see VECTOR and MATRIX.
       -- operations
    end VECTOR.MATRIX.COMBO;

    -- If you like then you can rename the above packages so that clients
    -- don't need to know the relationship.

--------------------------------------------------

Solution 3: (HL/Combination is unplanned for)

    package VECTOR is
       type VECTOR is private;
       -- operations
    private
       type VECTOR is ... ;
    end VECTOR;

    package MATRIX is
       type MATRIX is private;
       -- operations
    private
       type MATRIX is ...;
    end MATRIX;

    package VECTOR.HELPER is
       -- operations needed to implement VECTOR_MATRIX
    end VECTOR.HELPER;

    package MATRIX.HELPER is
       -- operations needed to implement VECTOR_MATRIX
    end MATRIX.HELPER;

    with VECTOR;
    with MATRIX;
    package VECTOR_MATRIX is
       -- combination operations;
    end VECTOR_MATRIX;

    with VECTOR.HELPER;
    with MATRIX.HELPER;
    package body VECTOR_MATRIX is
       -- uses HELPER packages to implement the combination operations.
    end VECTOR_MATRIX;

--------------------------------------------------

Solution 4: (MN)

    package VECTOR is
       type VECTOR is private;
       -- operations
    private
       type VECTOR is ... ;
    end VECTOR;

    package MATRIX is
       type MATRIX is private;
       -- operations
    private
       type MATRIX is ...;
    end MATRIX;

    package VECTOR&MATRIX.COMBO is
       -- private and body see VECTOR and MATRIX.
       -- operations
    end VECTOR&MATRIX.COMBO;

------------------------------------------------------

The differences between these solutions illustrate the trade-offs involved in choosing one or the other approach. In solutions 1, 3 and 4 MATRIX and VECTOR are independent packages. (In this case, it's not clear that this is necessarily an advantage. One can easily imagine that MATRIX would want to know about VECTOR even if it didn't export the combination operations. In the abstract, however, it is an advantage.) In solution 2, MATRIX has visibility on the private declarations of VECTOR, which is not necessarily required. The combination packages have exactly the same visibility with either approach.

Solution 1 uses an optional naming convention to make the connection between MATRIX, VECTOR, and MATRIX_VECTOR clearly visible. The other solutions use a similar convention, enforced by the language.

A client's use of the packages is slightly different. Solutions 1 and 3 require the client to "with" all three packages. In solutions 2 and 4 the enclosing packages (VECTOR and VECTOR.MATRIX in the HL case) are "with"ed implicitly when the combination package is "with"ed, so only one "with" is required.

A significant difference is the way that the set of packages can be extended. If a new MATRIX_1 abstraction is needed, and it also must be combined with VECTOR, then the friend solution requires the specification of VECTOR to be modified to list the new VECTOR_MATRIX_1 package as a friend. The first HL solution would have MATRIX_1 be a child of VECTOR, just as MATRIX is.
The second would add a MATRIX_1.HELPER package, to be "with"ed by the new combined package. No change to the specification of VECTOR is required, so there is no reason to recompile the clients of VECTOR. The MN and voyeur solutions allow MATRIX_1 to be coded as an independent package, and the new combination can be constructed easily.

Friend, Voyeur, and MN can deal with multiple VECTOR packages in the same way they deal with multiple MATRIX packages. The hierarchy imposed by solution 2 makes it impossible to add a new VECTOR_1 package, and have it work with the earlier MATRIX package. But the helper package solution has no such problems.

-------------------------------------------------------------

The MATRIX/VECTOR problem is an example of unrelated abstractions being combined, which is one instance of how monolithic packages arise in Ada 83. We will now present three other examples of systems where avoiding monolithic packages is considered desirable: POSIX-Ada, CAIS-A, and X-Windows. These examples will illustrate three points:

1) that programming in the large requires the ability to break up package specs

2) that package extension cannot always be planned for

3) that hierarchical libraries are sufficiently flexible to solve several real-world problems.

Point 1 implies that these solutions are legitimate candidates for incorporation in Ada 9X. Point 2 implies that the Friend approach is not sufficient to solve some important real-world problems. Point 3 demonstrates that the theoretical limitations inherent in the HL solution are not a problem in practice.

CAIS-A:

CAIS-A (MIL-STD-1838A) is an interface which provides operating system services in a portable way. The CAIS has a very large specification, and has hundreds of packages. The central type which is manipulated by much of the system is called NODE_TYPE. It is a limited private type defined in a NODE_DEFINITIONS package.
The implementation uses unchecked conversions inside packages that manipulate the type. There are a number of subsystems (Nodes, Attributes, and IO, to name the three largest) which also have shared common types. These are currently implemented using visible types declared in support packages, which are only supposed to be "with"ed by the packages in the subsystem. Since there is more than one type involved, it is possible that a hierarchical structure cannot provide the exact visibility needed; for instance, the entire IO sub-hierarchy must be nested within the NODE_TYPE hierarchy, since some of the IO packages require both. Nonetheless, a hierarchical system would go a long way towards making private types usable within the CAIS. The other proposals are just as useful, although having a naming convention which reflects the subsystem structure is beneficial, especially when there are hundreds of packages involved.

POSIX-Ada:

The Ada binding to POSIX is another system with many packages, and many types, some of which must be shared between packages. I will summarize the comments sent to us by Dave Emery about this issue. (He expressed a preference for Voyeur packages in that note.)

The POSIX-Ada committee had trouble deciding on the nature of the File_Descriptor type (FD). In the end it became a visible integer type (they wanted it to be ordered, and usable as an index), declared in the central POSIX package, where the common exceptions and other common types are declared. This centralization of the type was able to significantly reduce the "with" structure of the packages. Given the location, it is apparent that the HL proposal could allow FD to be a private type outside of POSIX, but a visible integer type within POSIX.

A second problem they encountered was preserving the principle of "with list portability". This is the property that allows a user to determine the portability of a program simply by examining the names of the "with"ed packages.
POSIX-Ada allows implementation-specific extensions. The use of these makes a program unportable. Ideally these would be packages nested within the standard POSIX packages, but that approach violates the "with list portability" principle. The HL (or MN) approach is precisely what is needed to solve this problem. Voyeur and friend packages can also solve the problem provided some naming convention is used to distinguish the implementation-dependent packages. The friend approach can solve the problem only if vendors may edit the standard packages to add the specification of the friend implementation extensions.

X-Windows:

The X-Windows binding is another example of a very large system, with many types. It is also interesting in that in an Ada 9X X-Windows binding, type extension of private types is likely to be used quite heavily, both within the system, and by users implementing extensions to the window system. These users may not be able to recompile the source of the window system! Any solution to this problem must allow extension without recompilation.

The Ada binding to X-Windows uses public packages with private types, and "private" packages (only meant to be used by the implementation) with identical non-private types. Unchecked conversion is used to pass data between the public and private worlds. A helper package approach could be used here, possibly with private helper packages.

Inheritance in X-Windows is single inheritance, so the simple HL proposal should generally provide all the required visibility when a user wants to extend an X-Windows private type to create his own widgets.

-------------------------------------------------------------

Hierarchical Name Space
=======================

No one would seriously dispute that hierarchy is a useful mechanism for structuring large systems, so we will take that as a given.
What then is the difference between a language-supported hierarchical name space, and a naming convention which uses "_" to create hierarchical names? Our contention is that a language-supported hierarchical name space is superior to an informal naming convention, even when the visibility of private data is not an issue.

The differences between the two are subtle, but taken all together they are probably significant enough to matter.

1) "_" is often used instead of a space to create multi-word names. Use of "_" becomes confusing when there is also a convention to use "_" as the hierarchical level indication. Using "." leaves no doubt as to how the hierarchy is to be structured.

2) Package names will become extremely long with either approach. Compilers are allowed to restrict the length of identifier names, so very large systems that use hierarchies are more likely to be portable if the long package names are composed of multiple identifiers, instead of just one.

3) With HL, the names of sibling packages are directly visible. For example:

    with OS.FILE_IO;
    package OS.NETWORK_MNGR is
       -- Since we are nested within OS, the name FILE_IO is
       -- directly visible so the following is legal.
       F : FILE_IO.FILE_TYPE;
       ... ;

Note that this automatic visibility of the children's names occurs in clients that "use" the parent package as well. Using existing mechanisms, the above must be done as follows:

    with OS; use OS;
    with OS_FILE_IO;  -- Is this one level of hierarchy or two?
    package OS_NETWORK_MNGR is
       package FILE_IO renames OS_FILE_IO;
       F : FILE_IO.FILE_TYPE;
       ... ;

The latter is reasonable, but it does require a "use" clause, which is often frowned upon. The advantages of the HL approach become even clearer as the nesting gets deeper, or a lot of sibling packages become involved.

4) Hierarchical libraries may encourage vendors to support program library tools which can manage lists of units as sub-hierarchies.

Private Packages
================

As mentioned above, a complete solution to the monolithic package problem would allow package bodies to be segmented as well. Towards this end, the mapping team has proposed private child packages. Since private child packages cannot be "with"ed outside of their parent's hierarchy, a set of them can implement the functionality of a complex package body, without allowing the interfaces to escape the implementation of that package.

Private packages are a very simple extension to the basic HL proposal; furthermore, they help simplify the semantics of the generalized deferred type extension proposal discussed below.

Several vendors, including Rational and Intermetrics, already support similar functionality through their program library tools. More importantly, large systems are often coded as if private packages were available, using names like OS_PRIVATE_UTILITIES, and conventions that such packages are used to implement the system, but are not part of the exported interface of that system. The implementations of CAIS-A and X-Windows both employ such packages.

This argument cuts both ways. On the one hand, private child units are clearly useful, but on the other, the effect is currently being achieved without the language enforcing the matter. We are inclined to leave in the feature, since it only adds a small amount of complexity, and can make a significant difference towards building large systems with "small" compilers, since it allows for package bodies to be broken into small compilation units.

Deferred Incomplete Types
=========================

Ada 83 allows an incomplete type declared in the private part of a package to have its full type definition deferred until the body of that package. Such a type is allowed as the designated type of an access type.
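
For readers less familiar with the Ada 83 feature just described, the following is a minimal sketch of a deferred incomplete type. The names HANDLES, OBJECT, HANDLE, CREATE and VALUE are hypothetical and chosen only for illustration; they do not come from any of the systems discussed in this LSN.

```ada
package HANDLES is
   type HANDLE is private;
   function CREATE (INITIAL : INTEGER) return HANDLE;
   function VALUE (H : HANDLE) return INTEGER;
private
   type OBJECT;                    -- incomplete; no full declaration in this spec
   type HANDLE is access OBJECT;   -- legal: OBJECT is only the designated type
end HANDLES;

package body HANDLES is
   -- The full type definition is deferred to the body, so changing it
   -- does not require recompiling clients of the specification.
   type OBJECT is
      record
         CONTENTS : INTEGER;
      end record;

   function CREATE (INITIAL : INTEGER) return HANDLE is
   begin
      return new OBJECT'(CONTENTS => INITIAL);
   end CREATE;

   function VALUE (H : HANDLE) return INTEGER is
   begin
      return H.CONTENTS;
   end VALUE;
end HANDLES;
```

Because the representation of OBJECT appears only in the body, nothing outside that body can depend on it, and that is the property at stake once the body's role is spread over several compilation units.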
Deferring the type definition is useful for abstract types which only export a "handle" on the real object, and for which no possible client (or extender) of a type can depend on the details of the implementation.

Our semantic model is that in Ada 9X, the set of private child packages plus the package body make up what in Ada 83 would be the package body. For that model to hold while deferred incomplete types remain a useful feature, the definition of the type must be allowed in the spec of a private child package, where all the private children can see the definition.

An alternative, which might appear at first to be a more useful feature, would allow the definition to be given in any child package (not just a private child). Potentially, this provides a path for abstractions built from these types to be extended, but unfortunately it doesn't really achieve that goal, since it is incompatible with the way dispatching works with access types in Ada 9X.

Concern has been raised over possible difficulties in implementing types whose definition is deferred to a private child package. An implementation must deal with these types at two points: when they are declared as incomplete types in the parent, and when they are (or are not) declared in the private child package. In the parent they are nearly identical with Ada 83 incomplete types. The primary difficulty will be getting the compiler to accept the child package name, which has not been declared previously. It will also be necessary to add information to the program library which will allow the linker to detect an error when the private child package is never provided. This is similar to the information added when a subunit stub is encountered.

In Ada 83 a compiler must check that a deferred incomplete type is declared in the package body, and in Ada 9X an analogous check must be performed. Since the private child implicitly "withs" the parent, all the incomplete types which must be declared in the child are visible.
Whatever mechanism the compiler uses to detect missing deferred incomplete types in package bodies may be used in the private child package spec.