ADA 9X LANGUAGE STUDY NOTE LSN-041-UI 21 Mar 92 "Runtime Implementation Strategies for Protected Records, Multi-way Entry Call, and Asynchronous Transfer of Control" Table Of Contents I. Considerations II. Differences from the Implementation Module Model III. Assessment of Difficulty IV. Prediction of User's Response V. Alternative Approaches VI. Comments On Simplification VII. Conclusions Appendix A. Passing Entries as Parameters I). Considerations The objective of this research was to evaluate the difficulty of implementing three run-time intensive language constructs proposed for Ada 9X. This evaluation was created by special request of the Ada 9x project office, in order to get a better understanding of the real impacts on implementors and users of these features. The RR Software Ada83 implementation is targeted to a widely used Unix platform and implements Ada tasking within the confines of a single Unix process. This imposes several constraints on the implementation and usefulness of these new constructs. Another characteristic of the Ada83 implementation is that it currently has a null range for PRIORITY and therefore task preemption does not occur. The 9X features of Protected Records (PRs), Multi-Way Entry Call (MWE), and Asynchronous Transfer of Control (ATC) were initially reviewed in light of these characteristics. For example, because preemption does not occur, ALL subprograms currently operate much like protected subprograms (assuming of course they follow the PR restriction of non-suspension). During the implementation study, it occurred to us that there is something ironic about implementing priorities (which result in preemption) in order to support protected records whose primary purpose is to prevent preemption. Also, because delay alternatives do not expire until a tasking supervisor call is made, an ATC with a delay alternative is not meaningful unless the abortable part suspends itself. Another case of a limitation is evident in the example given in MS4.0-9.9.3(6) where an ATC occurs when a keyboard interrupt occurs. The current implementation would suspend the entire program while waiting for a keyboard interrupt and therefore not produce the desired results of the example. In fact, if any of the tasks (including the main program) perform a blocking operation with the Unix operating system, the entire process (and thus all the tasks in the program) will suspend. For these reasons, the study was changed to assume the existence of a 9X release that would support preemption and asynchronous I/O (or at least asynchronous signals/interrupts). In addition, hard-real- time considerations were evaluated. The objective was to produce a design that would be acceptable for both candidate target environments. Although some thought was given to how the runtime would be changed to support preemption and asynchronous I/O on Unix, it was not investigated in detail. It is assumed that using POSIX threads for Ada tasks or allocating blocking I/O operations to separate "I/O" processes, along with using Signals and shared memory could be used to solve these problems. The main concentration was implementing the new 9X features, and not studying how to support additional Ada83 features to accommodate the new 9X features more fully. II). Differences from the Implementation Module Model We attempted to follow the implementation model described in the Implementation Modules, however there were a few areas where we diverged: A). The method suggested in the Ada 9X Mapping Specification for single processor implementations is the use of the priority ceiling protocol in implementing an efficient non-suspending mutual exclusion mechanism. Our implementation rejected this approach for three reasons: 1) If tasks are mapped to POSIX threads in the future, manipulation of priorities could be very costly. 2) For real-time embedded applications, the dependence on priorities is likely to interfere with dynamic scheduling algorithms that may wish to alter priorities radically. In other cases, applications may not want to use priorities at all, but have a fixed dispatch sequence. 3) It is not sufficient to extend to multiprocessor and distributed systems. Instead a processor-wide PROTECTED_LEVEL count is used to maintain the nest level within protected operations. While this value is nonzero, the tasking supervisor will defer any context switches (including most interrupts or remote PR actions). This eliminates the need for the Exclusion_Info records in the Implementation Modules. Still, we recognize the elegance of the priority based approach, and it may be appropriate for other runtimes. B). DATA STRUCTURES: Beyond what is specified in the Implementation Module model, our design added to or replaced some of the data structures defined there. Per-processor: (not per PR as in the IM) PROTECTED_LEVEL -- initially zero, contains PR nest levels DEFERRED_EVENT_FLAG -- initially false, set true -- when DEFERRED_EVENT_QUEUE contains events DEFERRED_EVENT_QUEUE -- initially empty, contains those events -- that arrive when PROTECTED_LEVEL > 0 ENTRY_SEQUENCE -- SEQUENCE # used for non-ATC calls Per Task: ATC_SEQUENCES -- list of sequence numbers for nested ATCs ATC_NEST_LEVEL -- current nest ATC nest level FREE_LIST -- holds preallocated entry queue elements (ADDITIONAL) QUEUE ELEMENT COMPONENTS: SEQUENCE -- active sequence # for this call ATC_LEVEL -- ATC_LEVEL at the time of the call ATC_FLAG -- indicates this call is select -- alternative for abort C). Dynamic Storage Allocation Within each task, a "free_list" is used to maintain a list of available queue elements. This list is grown (at a rate specified by a configuration parameter) at task creation time, and any time it becomes empty. It serves as a repository for deallocated queue elements so that the common system heap problems do not arise. For hard real-time systems, the configuration parameter can be set large enough so that all necessary queue elements are pre-allocated. For data processing systems where the number of outstanding entry calls is potentially very large, they are allocated as needed. Therefore when ever a queue element is needed, the free_list is checked first. If it is empty, a group of new queue elements are allocated from the system heap, and one is used (the rest going on the free_list). When entry calls are dequeued, they are always returned to the caller's free_list. D). CANCELLATION OF ENTRY CALLS A modification to the "CLAIM-BIT" in the select header mentioned in the Implementation Module is used to simultaneously cancel all outstanding entry calls at a particular ATC_LEVEL. To facilitate efficient cancellation of entry calls a simple approach is used. Essentially, each entry call queued is given a sequence number. When an entry call of any kind is cancelled or accepted, the task's relevant current sequence number is incremented. Prior to accepting an entry call (or COUNTing pending entries) each entry queue element is validated against the current sequence number of the caller. Stale queue elements are always dequeued by the server task and returned to the free list of the caller. To prevent "build-up" when calls are never accepted from an entry, the entry queue is searched to see if a stale call remains from the same caller. If so, the queue element is reused (no allocation necessary) and reset to current sequence numbers and conditions. To accommodate (potentially nested) ATCs, a linked list of ATC sequence numbers is allocated for each task. Typically ATCs are not expected to be nested deeply, but this allows for arbitrary nesting levels. To reduce overhead for non-ATC related entries, a simple sequence number is used whenever the ATC_LEVEL is zero(0). III) Assessment of Difficulty The early optimism in implementing these features faded more and more each day of the study. The interactions and large number of details combine to make the analysis of correctness extremely difficult. It is clear that significant changes to the runtime as well as the compiler would be required to support these features. The major concern is that the complexity of the runtime would increase, and therefore make it less reliable and more difficult to maintain. Using an approach of agent tasks in an attempt to simplify (by commonality) would most likely produce a result that is unacceptable to users. IV) Prediction of User's Response Since these requirements were submitted by the real-time community, (many of which require reliability as well as high performance,) the end reaction to these features must be evaluated. Will the anticipated users feel confident enough in the implementations to accept their usage? Users like Boeing still do not allow the use of rendezvous or even exception handlers! This is not for efficiency reasons, but because of reliability concerns. If the complexity of the implementation is considered high, the very users these features were meant to serve will reject it. If there is any doubt about their complexity, the number of objects (especially pointers) in the implementation model provides strong evidence of high complexity. V). Alternative Approaches Unfortunately, time was not available to perform a detailed study of alternatives to PR, MWE, and ATC as extra-lingual features. It seemed clear that a strict translation of the MWE and ATC features into a package interface is not practical because of the inability to pass entries as parameters of a subprogram. Likewise, one of the main benefits of the PR is to provide a "safe" encapsulation of a high-speed data synchronization mechanism. Placing this in a package destroys the protection of sequencing the locks properly. Rather than retaining the PR model, a simpler and well understood approach (as called out by STEELMAN) such as a counting semaphore might be more appropriate for an extra-lingual interface. Most real-time implementations support this now, and it could be standardized by putting it in an annex. Implementing MWE efficiently as a library package would be quite a trick. It could probably be replaced by some other broadcast/synchronization mechanism, but I see no clean way to use rendezvous operations from within a library package interface. ATC could be done as an implementation defined asynchronous exception. Although the implementation effort would still not be a piece of cake, the interaction with entry calls would be eliminated which would reduce the analysis effort. Something like 'Failure could be provided by an annex and the current exception mechanism provides a reasonable framework from a user's point of view. That is, users should be able to get the same functionality that they wanted from SELECT/THEN ABORT. Implementations could defer the exceptions during "critical" operations, and a separate facility for allowing the user to specify other regions (by calls to Defer_Failure_Notification and Enable_Failure_Notification) should be possible. VI). Comments On Simplification Three aspects of PRs 1) "requeue with abort": it seems odd that you can requeue with abort but can't queue with abort. In fact, it seems odd that you need the "with abort" at all - it should always be the case that prior to the entry call being accepted, it is eligible for cancellation. 2) The requirement to reevaluate barriers (that depend on 'COUNT) when a protected entry is cancelled seems a waste. Chances are, only increases in 'COUNT are going to open a barrier. For the very few situations where this isn't the case, force the user to call a dummy protected subprogram when such entries are cancelled. 3) The requirement tying read-only behavior to function syntax seems to be a mistake. Many useful functions which are candidates for redesigning as PRs (such as a random number generator) require changes to the PR data in order to work. Also, many run-times will not take advantage of the read-only behavior in any case. Therefore, enforcing that behavior will cost many compilers for no benefit. If this is considered important, a standard pragma can be defined in the real-time annex to specify read-only behavior. That way, compilers not supporting the real-time annex (the mostly likely annex not to be supported by an Ada compiler) will not have to pay those front-end costs. The MRT's main objection to this change (that functions only take IN parameters) is a complete red-herring. The implicit PR parameter to protected routines is essentially a access parameter (since all compilers would likely pass it that way. By defining it as one, we can eliminate the problem of modification. (This is the same way that C functions that modify parameter generally get mapped to Ada). This change has no semantic effects, other than the desired one. There doesn't appear to be any other obvious reductions to these features that will save a substantial amount of implementation effort (and complexity) and yet not severely impact potential uses of the feature. Some other simplifications were considered, particularly those Norman Cohen had suggested. Most of these other simplifications impact usage patterns in significant ways. For example, removal of the non-suspension rule could cause users of non-real time compilers to write non-portable code that takes advantage of the availability of suspension. However, simplifications like this ought to be considered if the political demand for simplification is more important than perfect portability of algorithms. One obvious change that should applied to protected records is to give them a better name. Given that they are a new kind of program unit (not a record, really), they should be called protected units or some such name, and all references to 'records' struck from their name and documentation. (This has the obvious side-effect of allowing the decision of whether or not to allow discriminants on them to be made on its own merits, and not by some supposed parallel to the mostly unrelated concept of records). VII). Conclusions All three of these features are complex. However, their utility to a wide range of users varies significantly. In the case of protected records, they are useful to a wide range of users. The need for data-oriented synchronization tends to occur in just about any multi-tasking program. In addition, the need to synchronize asynchronous events occurs in many application programs built on top of existing operating systems and user interfaces. In view of the wide range of application for protected records, a version with the changes recommended in the previous section should be retained in the language. For the other two features (MWE and ATC), the width of applications is narrower. In addition, concerns over reliability and the performance of these features is likely to keep a significant number of the potential users of these features from using them. Therefore, it unlikely that these features will have much actual utility in practice. Without that utility, they should not be included in the language. The combination of ATC and MWE with PRs makes all three of them slower and more complex. We do not believe it is prudent to include them all, as it is likely that ATC and MWE will make PRs significantly less likely to be implemented as efficiently as users would like. The key point about both of these features is that the problems they are intended to solve are complex. Attempting to sweep this complexity under the rug of the language may help a few users. but is going to make the language much more complex without a benefit to the majority of users. Ada system designers must realize that complex problems need complex solutions, and that getting the language to provide those solutions for them is not going to make them any less complex. Worse, the general case solution required in the language may be general enough to make the performance much worse in many particular problems, forcing users to roll-your-own solutions anyway. Whether these features should be standardized in the real-time annex is a difficult question, and one that the real-time community must answer. What should be considered for the core language is whether a SIMPLE feature or two could be defined to make this task easier. (See Appendix A for a sample of what we mean by this). Appendix A - Passing entries as parameters The analysis above notes that ATC and MWE cannot be implemented as package interface because of the inability to pass entries as parameters. It is conceivable that fixing this deficiency of Ada 9x would allow appropriate solutions to these and other problems during the lifetime of Ada 9x. Therefore, a consideration of this area is warrented. If solutions to passing entries as parameters turn out to be expensive and complex, then obviously this area is not likely to help. However, if these solutions are inexpensive and simple, then these solutions should be given serious consideration. It is interesting to note that such a solution would fit in well with the 'building block' approach that the MRT is advocating. With that introduction out of the way, let's look at possible ways to accomplish passing of entries as parameters. Two solutions stand out immediately: both are similar to existing Ada 9x mechanisms for handing procedures. One solution is to allow access to entry types. Such a mechanism would work similarly to access to subprogram types. An access to entry value could be used anywhere that a regular entry is. This solution would be very flexible. For our implementation of tasking, using this solution appears to be relatively painless. An access to entry type would be represented by the block of data necessary to make the appropriate supervisor calls. However, it is not hard to imagine that the implementation of this feature may be much harder for implementations that inline part or all of the entry call/rendezvous mechanism. The most likely problems would come from implementations that use different calling mechanisms for the different kinds of entry calls (normal, timed, and conditional). Even so, this feature looks less expensive than MWE. The other solution is allowing generic formal entries. This solution limits the user to static binding of entries. However, it is less likely to have an impact on implementations. Implementations that can use template expansion should have little impact, as the actual parameter could be known when the call was made. Obviously, an implementation could do better. Thus the major implementation problems could only happen to the universal generic sharers; one of them (us) does not have a problem. The benefits of entries as parameters can be applied to a wide variety of application requirements. For example, one of the examples used to justify multi-way entry calls was the ability to rendezvous with a group of server tasks in a nonspecific order to collect the results of a previously initiated operation. The Ada code for this was: NUM_RENDEZVOUS := 0; while NUM_RENDEZVOUS < NUM_SERVERS loop select T1.E1(IMAGE_SECTIONS(1)); or T2.E2(IMAGE_SECTIONS(2)); or T3.E3(IMAGE_SECTIONS(3)); ... ... ... end select; NUM_RENDEZVOUS := NUM_RENDEZVOUS + 1; end loop; The main reason for this is that the relationship between clients and servers is NOT symmetrical. That is, the server may have no knowledge of who the client is (but obviously the client must know the server.) Therefore, just by providing the "entry to call back" would solve this particular problem addressed by 9X multi-way calls. When the client initially provides work to the servers it also provides the entry to call when the work is completed. The client's "accept the results" loop now looks like: NUM_RENDEZVOUS := 0; while NUM_RENDEZVOUS < NUM_SERVERS loop accept Results(DATA : in IMAGE_BLOCK) do IMAGE_SECTIONS(DATA.SECTION_ID) := DATA; end Results; NUM_RENDEZVOUS := NUM_RENDEZVOUS + 1; end loop; Of course this second approach is much more efficient than the multi-way select used above, especially for large groups of server tasks. Another advantage of entries as parameters is that a wide variety of possibilities exist for packages that support rendezvous-like mechanisms. For example, a broadcast procedure could have a parameter that is an unconstrained array of access to entry types. This would be an explicit way to indicate that the order of acceptance is not important. Even asynchronous entry calls might be possible given a suitable generic package interface. One could hope that the availablity of a feature like this would eliminate the need for MWE and ATC completely, as well as providing a building block for other needed capabilities. An additional advantage is the elimination of a lot of complexity from the language and from runtime systems. A feature like access to entries may not be a perfect solution to making MWE and ATC available in a package form, because the profiles of the entries involved must be known at compile time. Perhaps there this is not a significant problem, or a workaround can be found. These are just quick thoughts, and detailed analysis would be needed if this approach is considered. But an oppurtunity to eliminate the complexity exists, rather than just moving it to a real-time annex.