Ada 9X LSN-044-DR Safety and Security Annex---Version April 23rd 1992 B A Wichmann, National Physical Laboratory, Teddington, Middlesex, TW11 OLW, UK E-mail: baw@seg.npl.co.uk and J McHugh Consultant E-mail: mchugh@cs.unc.edu April 23, 1992 1 Preamble This is the second draft of the annex. Significant changes are likely as the main core of the Ada 9X language stabilises. If you are in doubt as to the currency of the version of the annex you have, please consult the authors or the Ada 9X Mapping Team. Please bear in mind that this annex is designed to address the concerns of a small, specialist community (of 1% of the use of Ada?). The implicit assumption of this annex is that the significant burden that it places upon vendors is something that the specialist community is prepared to pay for. Achieving a balance here is not easy so vendors should be prepared for some costs and users need to temper their demands. Currently, this community's needs are satisfied by special contracts between large user organizations and a specific vendor. This approach can only satisfy the requirements of organizations with deep pockets. By making the needs part of the Ada 9X standard `off-the-shelf' products should be capable of satisfying the requirements at substantially reduced costs. This will allow Ada 9X to be used with assurance by different application areas, such as medical electronics or electronics funds transfer. 1 2 Safety and Security Annex 2.1 Scope This Annex addresses some of the specific requirements of those developing systems which are either safety critical or have security constraints. 2.2 Compliance This Annex specifies two levels of compliance; excluding certification or including certification. 2.3 Documentation of erroneous execution and bounded error situations Programs containing such situations will execute in a deterministic manner specified in documentation supplied by the vendor, except non-determinacy arising from (inappropriate) use of the following: ? concurrency; ? pragma INTERFACE; ? overlays via address clauses; ? hardware with access to storage. Note: This issue requires further study for two reasons: firstly, the situations will be listed in the core language and may be restricted there; secondly, users of this annex will probably require that such situations do not result in uncontrollable behaviour. A study note is available on Undefined Scalars [7]. 2.4 Reviewable object code An implementation will provide a mode of operation in which the object code generated by the compiler is available in a manner which allows independent validation and verification. Specifically, this mode of operation will provide: 1. the order of elaboration of library units; 2. the mechanism used for passing of each parameter for which the language does not require a particular mechanism; 2 3. the method used for the allocation and deallocation of storage; 4. the method used to evaluate numeric expressions if this involves an extended range or extra precision; 5. a method of obtaining the object code generated in machine-readable format (for subsequent analysis by other tools); 6. a method of determining which objects are assigned to which registers, and the lifetimes of those assignments; 7. a list of the actual machine instructions produced by the compilation of any given program. This list will include addressing modes or any other information necessary to test the correct functioning of the instructions; 8. code sequences derived entirely from one Ada statement must be indicated as such; 9. an indication of an Ada statement which results in no object code; 10. a method of determining which exception can be raised by any statement, and the location of the corresponding execution handler. Notes: 1. It is not intended that the use of this option will exclude most optimizations, although some will increase the cost of validation and verification of the object code. A vendor could decide to switch off some optimizations in this mode rather than provide the information to allow review of the object code. 2. These requirements could be met by suitable annotations on an assembler-like object code listing. However, a more useful presentation of the information would proably be via a vendor-supplied tool, perhaps based upon a debugger. 3. Users sometimes need to verify that a loop can be executed in a specific time interval. This can be verified by means of a (user-provided) tool which uses maximal instruction timings and the machine readable object code listing. 3 4. An industrially accepted architecture neutral instruction format could be used instead of a directly executable object code. 5. Since the RM allows for an indeterminate order for the evaluation of expressions, it is necessary to require additional documentation on the raising of exceptions. With this requirement, if a statement could raise either PROGRAMERROR or CONSTRAINT-ERROR, then the documentation will provide a method of finding out which one is raised. This could be by examination of the object code. Also, given that an exception may be raised, a method of determining the handler should be specified. By this means, it should be possible for a user to determine that every pre-defined exception that could be raised has a handler. 2.5 The certification option The vendor undertakes the following: 1. to use all reported bugs as the basis of a regression test; 2. to ensure that all compiler releases pass the regression tests (as well as the ACVC); 3. to notify the user if a new release contains (significant?) changes to the compiler (even if the change is not externally apparent); 4. to obtain an independent audit of its internal Quality Management System to ISO-9001 [4] or equivalent standard; 5. provide evidence of configuration management on the compiler if not already part of the Quality Management System. Independent validation and verification will be undertaken of a version of the run-time system. If this independent validation and verification is for a cut down run-time system that does not support the entire language, the subset will be defined by the vendor and enforced by the compiling system, in so far as the restrictions can be statically enforced. The vendor will define those dynamic restrictions which are not enforced. Notes: 1. A list is needed of independent validation and verification bodies. This includes FAA, ESA, NASA, ... (This list is not exhaustive, but illustrative.) 4 2. The wording above allows vendors to support only a well-defined subset of the language for the purpose of obtaining additional assurance of its quality. The subset would be vendor-specific and is not defined in this annex. References [1] Ada 9X Project Report --- Formal Studies of Ada 9X, Interim Report. Department of Defense. January 1992. [2] B A Carre and T J Jennings. SPARK --- The SPADE Ada Kernel. University of Southampton. March 1988. [3] J M Foster. TDF Specification. A purpose-built Architecture Neutral Distribution Format. RSRE. October 1990. [4] ISO 9001:1987, Quality systems --- Model for quality assurance in production and installation. [5] B Littlewood and L Strigini. Validation of Ultra-High Dependability for Software-based Systems. (to be published). [6] B A Wichmann. ``Low-Ada: An Ada validation tool''. NPL Report DITC 144/89. August 1989. [7] B A Wichmann. Undefined scalars in Ada 9X. 20th March 1992. LSN-038-DR. A Rationale Although everybody likes safety and security, only a few are prepared to pay for the costs that can be entailed in obtaining the assurance that is needed for some applications. Over the period that the 9X standard can be expected to be operative, the number of applications in this area could rise quite steeply. A.1 Scope Although only the safety and security areas are explicitly given in the scope, many other applications may require additional assurance that is provided by compliance to this annex. 5 A.2 Compliance Two levels of compliance are specified in this annex. From discussions with vendors and interested users, it seems that relatively few systems are likely to comply to these two levels. For every 100 validated Ada systems, perhaps 10 might comply with the non-certification option and five with the certification option. The cost to the vendor will be a limiting factor to the number of complying systems. For certification, an interested user will be required to shoulder part of the cost. The reason for this annex is to allow for the development of a market for high integrity Ada compilation systems. Obviously, such systems could be of benefit to users outside the safety and security areas. A.3 Documentation of erroneous execution and bounded error situations The requirement that such situations must lead to deterministic execution may require that the compiler generates different code. A study note on undefined scalars shows that existing Ada 83 compilers can sometimes fail to meet this requirement. It is not clear if the potentially damaging effects of an undefined scalar can be reduced by requirements placed upon all Ada 9X compilers (and hence for some requirements to be placed in the core language). Apart from undefined scalars, another form of bounded error occurs if parameters are aliased (see MR 6.4.1(7); 4.0). Since the reviewable object code facility requires that the actual parameter mechanism is defined, the execution is necessarily deterministic. Also, optimizations carried out by the compiler under the assumption that aliasing does not occur should also have a deterministic effect. The exclusions are required due to the technical difficulties. Fortunately, many actual applications can avoid the concurrent features of Ada or perform a detailed analysis to ensure the excluded features are used correctly. For instance, although unsynchronised update could make a scalar undefined, it should be straightforward to avoid this, and to avoid applying pragma VOLATILE to scalars without appropriate checks. A.4 Reviewable object code Due to the well-known fact that Ada (and other language) compilers have bugs, it is the conventional wisdom of the safety 6 critical community to avoid dependence on the compiler. For instance, the approach taken in the avionics standard DO-178B is one of Design, Review and Test. As far as practical, the review and test activities are undertaken at the object code level. Indeed, if the reliability requirements of class 1 flight critical software is to be attained, every attempt must be made to detect errors induced by a compiler. This is expensive, but unavoidable given current technology. In order to make review of the object code from an Ada compiler practical, some support is needed from the compiler vendor. The December 1991 mapping document proposed a pragma to invoke this assistance, but this has been revised into a `mode of operation' or compiler switch because this is thought to be more appropriate. Although the general concept of reviewable object code is clear enough, some specific requirements must be made in order to facilitate objective testing of Ada compilers and allow vendors to claim compliance to the annex. It must also be clear to the vendor how the requirements can be satisfied. Hence a number of specific requirements are given: 1. The order of elaboration of library units. The Ada 83 order has substantial flexibility leading to different behaviours for incorrect programs which will nevertheless be compiled. The situation in Ada 9X is not yet settled and must be assumed to be similar. Many existing Ada systems already provide the execution order in a convenient form. Merely providing the information via an assembler listing would not be sufficient since it would be hard to extract. The information is typically needed by static analysis tools to determine the absence (say) of accesses to undefined variables. 2. The mechanism used for passing of each parameter for which the language does not require a particular mechanism. Non-scalar parameters can be passed by reference or copy. A different mechanism could even be chosen for two calls of the same subprogram. Incorrect programs can be affected by the choice and hence safety and security applications need to check that either this error has not been made or that the effect will be acceptable. The simplest solution is for the compiler to indicate the choice made. If the choice is a simple static one (say, always by reference, except for entry parameters which are always by copy), then this could 7 be stated once in the compiler documentation, otherwise the object code listing (or appropriate tool) should indicate the choice. In fact, some compilers have a complex algorithm which varies from call to call, especially for slices. The complexity of this algorithm is not relevant to the issue, merely that the choices made should be clear. 3. The method used for the allocation and deallocation of storage. Many safety critical applications are in the form of an indefinite loop. It is important that this loop should not consume storage. Therefore it must be possible, by reviewing the object code, to ensure that this does not happen. In the security context, it is important that storage used to contain classified information does not leak via the storage allocation system. It is possible that this requirement can be met by proposals for a user-defined allocation and de-allocation of storage---in which case the run-time system may be less critical. Of course, even if there is no net consumption of storage within a program, and if the storage is not allocated and de-allocated via a stack, it will be necessary to show that storage fragmentation does not undermine the system. For this reason, the algorithm used by the run-time system are important and must be fully documented. For time-critical applications, additional constraints could be required on the run-time routines, such as having a tight upper bound in execution time. This requirement is not specified here, since it will be application specific and many systems avoid the use of the heap and hence are unlikely to have a problem. 4. The method used to evaluate numeric expressions if this involves an extended range or extra precision. Problems can arise due to the flexibility that Ada allows to compute with greater range or precision than that of the type involved. Counter-intuitive results can be produced with floating point expressions when combined with equality tests, which is quite common on IEEE systems. Hence the vendor should document the approach taken here, independent of any annotation of the object code, so that any potential confusion can be avoided. 8 5. A method of obtaining the object code generated in machine-readable format (for subsequent analysis by other tools). It cannot be assumed that the vendor will provide every tool needed for validation and verification. In any case, complete reliance upon such tools may not be acceptable. Hence it must be possible to extract the object code in a form that can be easily machine processed. For some architectures, there is an industrial standard for such object code which should be followed (assuming it is a useful form for machine processing and distinguishes instructions and data, etc). An example of a tool is one which provides maximal times for instruction sequences so that time-critical software can be shown to meet deadlines. 6. A method of determining which objects are assigned to which registers, and the lifetimes of those assignments. An important aspect of code generation is the assignment of registers. The most general register assignment algorithm is known to be NP complete and hence it is quite unreasonable for compiler documentation to detail such an algorithm (especially since it may be proprietary). However, the result of the algorithm for the specific safety/security application is to be provided. The instance given here is that of an Ada name. When the value of such a name or its address is assigned to a register, this assignment must be indicated. Obviously, the life-time of this assignment must be indicated also. Only registers are specified here since it is assumed that ordinary memory references to objects can be followed directly. Also, only objects are specified since they have names which can be used to annotate the information. Compilers typically produce code to address internal information as well as the information directly related to objects declared by the program. This issue is not specified here, since it is unclear how this can be specified in a testable form. 7. A list of machine instructions used for any possible program (not counting machine code insertions). For the most critical applications, it is necessary to check that the machine instructions required by an application 9 are correctly handled by the processor hardware. Microcode faults in processor chips are not uncommon and therefore such checking is needed. A list of the used instructions aids the checking since unused instructions need not be checked. It would be helpful to identify instructions only used in the run-time system, but this is not essential. 8. Code sequences derived entirely from one Ada statement must be indicated as such. In those cases in which a code sequence is derived from a single Ada statement, this statement must be identified. Due to some optimizations, it could be that this identification causes problems for the compiler vendor. In such cases, some optimizations could be removed for this mode of operation, rather than enhancing the compiler to meet the requirements with full optimization. In this area, a tool could be much more useful for independent validation and verification rather than an annotated listing. Some compilers provide information based upon line numbers rather than Ada statements. For the purposes of this annex, it can be assumed that there is only one statement per line. For a single statement, several code sequences could be involved, especially if instruction scheduling merges the code from more than one statement. Addressing instructions derived from more than one statement would not have to be identified as such. 9. An indication of an Ada statement which results in no object code. A positive indication of removal is required, rather than a mere absence of object code from a statement. It seems that such a removal would be a high-level, machine-independent optimization which should be easy to report. 10. A method of determining which exception can be raised by any statement, and the location of the corresponding execution handler. The use of the exception feature is problematic. Exception sites need not be in the same compilation as the handlers that service them. Some method is needed to indicate explicitly in the object code, the actual locations at which exceptions are raised and handled. Some mechanism should be available, either through source or object code analysis that permits the analysis of the program to 10 determine which the handler for each site that can raise an exception and to identify sites for which no handler is supplied. Since exceptions raised by predefined operations are not explicitly indicated in the source, and since the implementation is allowed some freedom in choosing actual execution order, this facility is best supported at the object code level. Even if a vendor does not choose to perform such an analysis, the information necessary to perform it should be made available to the user. For a detailed study of the issues involved, see Chapter 2 of [1]. Note that this is a case where the analysis of a standard intermediate form may not suffice since recognition of exceptions is closely tied to the instruction set. One would hope that improvement in the technology would allow the process of manual examination of object code to be minimised. However, it cannot be expected that the Ada 9X language will ever be precisely defined enough to allow a full Ada compiler to be `certified'. In consequence, the output from the compiler must be validated in some way. One method out of this dilemma is to adopt an intermediate language whose semantics can be precisely specified and yet be much less complex that the full Ada language. This is technically possible since much of the complexity of the Ada language is in the `front end' or static semantics and therefore need not be present in the intermediate language. For instance, overload resolution, generics and the visibility rules are examples of complex Ada features which would be completely absent in an intermediate language (nor are they present in conventional object code, of course). If the compilation of an intermediate language into machine language could be a relatively small step, then this compilation process may be certifiable, perhaps by use of mechanical formal verification. Review of the intermediate language object code would then be much simpler since the intermediate code would not be architecture specific and would not involve low level issues such as instruction scheduling. Two proposals have been made for an intermediate language to meet the objectives above. In the case of Low-Ada [6], the intermediate language is Ada-based, allowing the validation and verification of the object code to be rather straightforward. In the case of ANDF [3], the intermediate language has been proposed by OSF as a means of providing `shrink-wrapped' software on a variety of Posix platforms. The commercial backing 11 for ANDF could make the provision of Ada compilers via this route quite attractive, independent from any safety or security considerations. However, since ANDF is not Ada specific, the mapping from Ada to ANDF could be quite complex, similar to many object code mappings. Also, ANDF must be translated into machine code which is a non-trivial operation---requiring validation and verification itself. Nevertheless, if a `trusted' compiler for ANDF existed, then this route would be a significant advance. A.5 The certification option The conventional approach to the use of compilers in this area is to use a `mature' product. However, since keeping a compiler in an oak casket for seven years is not relevant, we must specify conditions which provide some objectivity to maturity. Unfortunately, wide use is in itself no guarantee of correctness or of the quality of the product. Also `wide use' as a criterion would not allow for effective competition, a hallmark of the Ada marketplace. The requirements here are rather unlike those of the main language and it could be argued that it is not of concern to this standard. However, since many bodies are restricted to using only standard specifications in procurement, there are advantages in placing these requirements in this Annex. The requirements in this section aims to provide confidence in the process used by the vendor in supporting the Ada system. Specifically: 1. To use all reported bugs as the basis of a regression test. Reported bugs must be properly handled. Of course, a report from a user may result in an analysis which shows that the compiler is not in error. Otherwise, a regression test is required so that it can be demonstrated that subsequent releases of the compiler do not have this fault. It is thought that most vendors already satisfy this requirement. 2. To ensure that all compiler releases pass the regression tests (as well as the ACVC). This requirement supplements those of the validation process and provides important additional assurance. Of course, this requirement would have little effect if it were not for the requirement to add tests to the test bucket. Few vendors would appear to have problems with this requirement. 12 3. To notify the user if a new release contains (significant?) changes to the compiler (even if the change is not externally apparent). The ideal for safety and security is a widely used compiler which has needed little maintenance over several years. Users can judge this provided they are informed of significant changes. For instance, a recoding of the optimizer must be regarded as a risk, even if the recoding is not supposed to change the users' view of the compiler. Given a significant change, a user could well decide not to upgrade to the latest release. It is not clear if all changes should be notified since otherwise it may not be clear what is `significant'. 4. To obtain an independent audit of its internal Quality Management System to ISO-9001 [4] or equivalent standard. It is widely recognized that quality cannot be obtained by testing alone but needs to be an objective of the design process. The ISO-9000 series of Quality Management Standards embodies this principle. In Europe, it is widely accepted that organizations can establish the quality of their work by independent auditing to the ISO Quality Management Standards. This process is not so widely accepted in North America and therefore some compiler vendors have had to have customers review their procedures in order that confidence in their quality assurance can be established. The appropriate ISO-9000 series standard for software products is ISO-9001 [4] which corresponds roughly to level 1 of the SEI Software Maturity Model. 5. Provide evidence of configuration management on the compiler if not already part of the Quality Management System. ISO-9001 requires that the quality procedures are documented to ensure that specific objectives are met. For companies producing products, the interpretation of the standard would usually imply a configuration management system. The item is added specifically to ensure that this vital requirement is met. It is clear that the Ada run-time system must be reviewed in the same way as the object code if a high level of assurance is to be attained. In the same way that no specific design requirements are made of the object code itself, no demands are made of the run-time system. However, very many critical 13 applications will be written in a subset of Ada which allows a much simplified, tailored run-time system. Such a system may well be certified by FAA (in the USA) or other organizations for use is specific application areas. Such a certification is expensive which currently precludes the development of such systems in areas where financial budgets are tighter (say medical electronics). The hope there is that by establishing certification of such Ada systems as Ada products, the certification route could lead to a wider use of Ada in other application areas. Of course, for such certified run-time systems that do not support the entire Ada language, the vendor must ensure the restrictions are documented and enforced by static checks wherever possible. In principle, any agency independent of the vendor could certify an Ada run-time system. However, the market would only readily accept as certification bodies those agencies which have an established reputation and need for the highest standards. These agencies will be contacted as part of the Ada 9X process to see to what extent mutual recognition would be possible to reduce certification costs and increase the market to the vendor. It does not seem necessary to propose any list of such bodies in the 9X standard, merely that the concept is acceptable to the specialist community. A.6 Items deleted from the December 1991 suggestion list The list of suggestions in version 4.0 of the Mapping Document has been revised significantly in this proposal as follows: ? G.5.7. Ensure use of a Validatable Compiler It appears that almost all vendors currently ensure that maintenance releases of their compilers will pass the ACVC tests (current at the time of the last validated release). Hence this issue is being raised separately with Ms Christine Anderson as a possible addition to the current procedures for validation. ? G.5.11. Storage Reuse This issue is awaiting clarification of the proposal to have a user-defined storage allocation facility in Ada 9X. Currently, this would be in the System Programming Annex, but in order to avoid cross-links between annexes, it would be better if this were in the core. 14 ? G.5.13. Allow User To Write Assertions There is no easy way for such a facility to be added to this annex, since pure functions is not part of the language. Moreover, it is clear that Ada itself is not an ideal language for assertions and hence this proposal has been dropped. However, there is concern that some of the requirements for high integrity need such a facility, in spite of the technical difficulties. One way round this dilemma would be to introduce a pragma whose effect is implementation-defined. Some advantages of this approach are: -- the pragma could be ignored; -- the pragma could be ignored by the compiler but processed by tools; -- the problem of side-effects and raising of exceptions in the pragma does not have to be addressed. ? G.5.14. Define Provable Subset The combination of a provable subset, proof tools, and a certified run-time system would be very useful for high integrity applications. However, there is no likelihood of agreement on a provable subset, since its usefulness depends upon tools with varying capabilities. In the UK, the SPARK Ada subset [2] has significant support, but it has been stated by the developers that the subset could be enhanced in various ways, given appropriate effort on the tools. Hence freezing a subset for five or more years would not be beneficial to the community. Also, if full Ada compilers can be shown to be reliable enough and can support this annex, then there will be no barrier to the use of the proof tools. Of course, proof of the correctness of the source code is of little use with an unreliable compiler---hence the need for compliance to this annex. It would be advantageous if reliability metrics could be established for Ada compilers to enhance the competition and provide the users with compilers of proven reliability characteristics. However, this seems to be a research topic [5] rather than one which can be the subject of ISO standardization. 15