ADA 9X MAPPING
VOLUME II (NUMERIC ANNEX ONLY)
SPECIFICATION AND RATIONALE

Version 4.8
11 September 1992
IR-MA-1250-3.L

Ada 9X Mapping/Revision Team
Intermetrics, Inc.
733 Concord Avenue
Cambridge, Massachusetts 02138
(617) 661-1840

Published by Intermetrics, Inc., 733 Concord Avenue, Cambridge, Massachusetts 02138.

This is a draft document. Reprinting permitted if accompanied by this statement. Copyright (C) 1992 Intermetrics, Inc.

This report has been produced under the sponsorship of the Ada 9X Project Office under contract F08635-90-C-0066.

Instructions for Comment Submission

Comments should be sent by 1 Nov 1992 via one of the following methods:

U.S. Mail: Ada 9X Mapping/Revision Team, Intermetrics, Inc., 733 Concord Avenue, Cambridge, MA 02138
Phone:     (617) 661-1840
FAX:       (617) 868-2843, Attn: Ada 9X Mapping/Revision Team
E-Mail:    ada9x-mrt@inmet.com

Please note that the e-mail address for all correspondence to the Ada 9X Mapping/Revision Team has changed.

Comments should use the following format:

   !topic Title summarizing comment on Mapping Specification
   !reference MS-ss.ss(pp); 4.8
   !from Author Name yy-mm-dd
   !keywords keywords related to topic
   !discussion

where ss.ss is the section number of the document, pp is the paragraph number where applicable, and yy-mm-dd is the date the comment was sent. The date is optional if sent via e-mail. References to multiple sections of the document can be made by including additional !reference lines in the comment. As noted above, the version of this document is 4.8.

If possible, please send comments by e-mail. Please send one message per section or comment, to facilitate cross-referencing. If your comments are lengthy and you cannot send them by e-mail, we would appreciate it if you could include a copy in machine-readable form, using either a Macintosh or IBM-compatible disk format. All comments sent via e-mail should receive a confirmation message from the mapping team.
If a confirmation is not received, we ask that the sender try sending the message one more time. In the event that the second e-mail attempt is unsuccessful, the sender is then asked to call Intermetrics at the phone number noted above. Thank you for your help.

Acknowledgements

This document was prepared by the Ada 9X Mapping/Revision Team based at Intermetrics, Inc. The members of the team are: W. Carlson, Program Manager; T. Taft, Technical Director; R. Duff (Oak Tree Software); M. Edwards; C. Garrity; R. Hilliard; O. Pazy (consultant); D. Rosenfeld; L. Shafer; W. White.

The following consultants to the Ada 9X Project have contributed to the Specialized Needs Annexes: T. Baker (Real-Time/Systems-Programming -- SEI, FSU); B. Brosgol (Information-Systems -- Consultant); K. Dritz (Numerics -- Argonne National Laboratory); A. Gargaro (Distribution -- Computer Sciences); J. Goodenough (Real-Time/Systems-Programming -- SEI); J. McHugh (Safety-Critical -- Consultant); B. Wichmann (Safety-Critical -- NPL: UK).

This work is continuously being reviewed by the Ada 9X Distinguished Reviewers: E. Ploedereder, Chairman (Tartan); B. Bardin (Hughes); B. Brett (DEC); B. Brosgol (Consultant); N. Cohen (IBM); R. Dewar (NYU); G. Dismukes (Telesoft); A. Evans (Consultant); A. Gargaro (Computer Sciences); M. Gerhardt (ESL); J. Goodenough (SEI); T. Harrison (ISSI); S. Heilbrunner (University of Salzburg: Austria); P. Hilfinger (UC/Berkeley); J. Ichbiah (Consultant: DR emeritus); M. Kamrad II (Paramax Systems); P. Kruchten (Rational); R. Landwehr (CCI: Germany); C. Lester (Portsmouth Polytechnic: UK); D. Luckham (Stanford); L. Mansson (TELIA Research: Sweden); R. Mathis (Consultant); S. Michell (Multiprocessor Toolsmiths: Canada); M. Mills (US Air Force); D. Pogge (US Navy); K. Power (Boeing); O. Roubine (Verdix: France); W. Taylor (Consultant: UK); E. Vasilescu (Grumman).
Other valuable feedback influencing the revision process is also continuously being received from the Ada 9X Language Precision Team (Odyssey Research Associates), the Ada 9X User/Implementor Teams (AETECH, Tartan, Telesoft), the Ada 9X Implementation Analysis Team (New York University), and the Ada community-at-large.

The Ada 9X Project is sponsored by the Ada Joint Program Office. Christine M. Anderson at the Air Force Wright Laboratory, Armament Directorate, is the project manager.

Table of Contents

L. Numerics Annex (Specification)
   L.1. Floating-Point Arithmetic
      L.1.1. Floating-Point Machine Numbers
      L.1.2. Machine Number Attributes
      L.1.3. Floating-Point Model Numbers
      L.1.4. Model Number Attributes
      L.1.5. Accuracy of Floating-Point Operations
      L.1.6. Floating-Point Type Declarations
   L.2. Fixed-Point Arithmetic
      L.2.1. Fixed-Point Values and Attributes
      L.2.2. Fixed-Point Type Declarations
      L.2.3. Accuracy of Fixed-Point Operations
      L.2.4. Other Fixed-Point Attributes
   L.3. Elementary Functions
   L.4. Primitive Functions
   L.5. Complex Arithmetic
   L.6. Interface to Fortran
   L.7. Random Number Generators
S. Rationale
   S.L. Numerics Annex (Rationale)
      S.L.1. Semantics of Floating-Point Arithmetic
         S.L.1.1. Floating-Point Machine Numbers
         S.L.1.2. Attributes of Floating-Point Machine Numbers
         S.L.1.3. Floating-Point Model Numbers
         S.L.1.4. Attributes of Floating-Point Model Numbers
         S.L.1.5. Accuracy of Floating-Point Operations
         S.L.1.6. Floating-Point Type Declarations
         S.L.1.7. Compatibility Considerations
      S.L.2. Semantics of Fixed-Point Arithmetic
      S.L.3. Elementary Functions
      S.L.4. Primitive Functions
      S.L.5. Complex Arithmetic
      S.L.6. Interface to Fortran
References
Index
L. Numerics Annex (Specification)

The semantic models of floating-point and fixed-point arithmetic, a generic package of elementary functions, and a collection of attributes comprising ``primitive'' floating-point manipulation functions are presented in this annex. Ultimately, the generic package of elementary functions will be included with other required packages in a chapter of the core devoted to an ``Ada Standard Library,'' and the primitive function attributes will likewise be in the core and required of all implementations. It is intended that the models of floating-point and fixed-point arithmetic also be supported by all implementations; their placement in an annex is purely for presentation purposes.

A placeholder for an anticipated facility for the generation of pseudo-random numbers, the details of which will be provided in the near future, has been included in this annex. It is intended that the to-be-proposed package or generic package be included in the Ada Standard Library chapter, and therefore be required of all implementations.

Notwithstanding the above, the Numerics Annex does serve as a repository for optional features that need be supported only by those implementations of Ada intended to serve the special needs of numeric applications. These features include a generic package defining a complex type and associated operations, a generic package of complex elementary functions, and facilities for interfacing with Fortran.

L.1. Floating-Point Arithmetic

L.1.1. Floating-Point Machine Numbers

Associated with each floating-point type is a finite set of machine numbers. The machine numbers of a type are those capable of being represented, to full accuracy, in the storage representation of the type. The machine numbers of a derived type are those of the parent type; the machine numbers of a subtype are those of its type.
L.1.2. Machine Number Attributes

Attributes related to the machine numbers of a floating-point type T (i.e., to its storage representation) are defined in this section.

T'MACHINE_RADIX yields the radix of the hardware representation of T. T'MACHINE_MANTISSA yields the largest integer value of p, and T'MACHINE_EMIN and T'MACHINE_EMAX respectively the most negative and most positive integer values of exponent, such that every number expressible in the ``canonical form''

   sign * mantissa * (radix ** exponent)

where

- sign = +1 or -1;
- radix = T'MACHINE_RADIX; and
- mantissa is a p-digit fraction in the number base radix, the first digit of which is nonzero

is a machine number of T, i.e., representable to full accuracy in the storage representation of T.

If, in addition, every number expressible in the canonical form, but where

- sign = +1 or -1;
- radix = T'MACHINE_RADIX;
- exponent = T'MACHINE_EMIN; and
- mantissa is a T'MACHINE_MANTISSA-digit nonzero fraction in the number base radix, the first digit of which is zero

is also a machine number of T (called in IEEE Std. 754 a denormalized number), then the attribute T'DENORM yields the value TRUE; otherwise, it yields FALSE. T'DENORM is a new representation attribute of the type T.

An implementation employing ``radix-complement'' representation may furthermore include -T'MACHINE_RADIX ** T'MACHINE_EMAX and possibly T'MACHINE_RADIX ** (T'MACHINE_EMIN - 2) in the set of machine numbers of T; if so, they must be documented in Appendix F. Of course, zero is also a machine number of T.

An implementation may have two distinct representations for floating-point zeros, with positive and negative sign respectively, having the properties given in IEEE Std. 754 or 854. The attribute T'SIGNED_ZEROS yields TRUE in this case, and FALSE otherwise. T'SIGNED_ZEROS is a new representation attribute of the type T.
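The canonical form above can be checked mechanically. The following sketch (in Python, not Ada, and purely illustrative) tests whether an exact rational value is a normalized machine number; the attribute values are an assumed mapping onto IEEE binary64, one common representation, whose parameters Python exposes through sys.float_info.

```python
import sys
from fractions import Fraction

# Assumed mapping onto IEEE binary64; sys.float_info uses the same
# normalization as the canonical form (mantissa in [1/radix, 1)).
MACHINE_RADIX    = sys.float_info.radix     # 2
MACHINE_MANTISSA = sys.float_info.mant_dig  # 53
MACHINE_EMIN     = sys.float_info.min_exp   # -1021
MACHINE_EMAX     = sys.float_info.max_exp   # 1024

def is_machine_number(v):
    """True if v (an exact rational) is zero or expressible in the
    canonical form sign * mantissa * (radix ** exponent) with a
    MACHINE_MANTISSA-digit normalized mantissa (denormals ignored)."""
    v = Fraction(v)
    if v == 0:
        return True
    x, exponent = abs(v), 0
    while x >= 1:                            # normalize into [1/radix, 1)
        x /= MACHINE_RADIX
        exponent += 1
    while x < Fraction(1, MACHINE_RADIX):
        x *= MACHINE_RADIX
        exponent -= 1
    scaled = x * MACHINE_RADIX ** MACHINE_MANTISSA
    return scaled.denominator == 1 and MACHINE_EMIN <= exponent <= MACHINE_EMAX

print(is_machine_number(Fraction(1, 2)))   # True
print(is_machine_number(Fraction(1, 10)))  # False: 1/10 has no finite binary fraction
```

The exact-rational arithmetic avoids the circularity of testing representability using the very floating-point type under discussion.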
Note: Even if T'SIGNED_ZEROS is TRUE, the predefined equality operator yields TRUE given two operands of zero; this is a consequence of the IEEE standards cited above. Some of the elementary and primitive functions (see L.3 and L.4, respectively) yield results, given operands of zero, that depend on the value of T'SIGNED_ZEROS.

The representation attributes T'MACHINE_ROUNDS and T'MACHINE_OVERFLOWS are retained. The meaning of T'MACHINE_OVERFLOWS is clarified (see L.1.5).

The attributes T'MACHINE_RADIX, T'MACHINE_MANTISSA, T'MACHINE_EMIN, and T'MACHINE_EMAX return results of the type universal_integer. The attributes T'SIGNED_ZEROS and T'DENORM return results of type BOOLEAN.

L.1.3. Floating-Point Model Numbers

Associated with each floating-point type is an infinite set of model numbers. The model numbers of a type are used to define the accuracy requirements that must be satisfied by certain predefined operations of the type (see L.1.5); through certain attributes of the model numbers, they are also used to explain the meaning of a user-declared floating-point type declaration (see L.1.6). The model numbers of a derived type are those of the parent type; the model numbers of a subtype are those of its type.

The model numbers of a floating-point type T are zero and all the numbers expressible in the canonical form, where

- sign = +1 or -1;
- radix = T'MACHINE_RADIX;
- exponent is an integer >= T'MODEL_EMIN; and
- mantissa is a T'MODEL_MANTISSA-digit fraction in the number base radix, the first digit of which is nonzero.

L.1.4. Model Number Attributes

Attributes related to the model numbers of a floating-point type T are defined as follows. The attributes T'MODEL_MANTISSA and T'MODEL_EMIN used to define the model numbers, and the attribute T'MODEL_EMAX, are determined by the accuracy delivered by certain predefined operations of the type T and by their ability to avoid overflow.
More precisely, T'MODEL_MANTISSA, T'MODEL_EMIN, and T'MODEL_EMAX yield, respectively, the largest integer <= T'MACHINE_MANTISSA, the most negative integer >= T'MACHINE_EMIN, and the most positive integer <= T'MACHINE_EMAX such that certain predefined operations of the type T satisfy the accuracy requirements given in L.1.5, expressed in terms of the model numbers of the type T and in terms of the attribute T'MODEL_LARGE, which is defined as follows:

   T'MODEL_LARGE = T'MACHINE_RADIX ** T'MODEL_EMAX
                     * (1.0 - T'MACHINE_RADIX ** (-T'MODEL_MANTISSA))

Two additional attributes of the model numbers are defined for convenience, as follows:

- T'MODEL_EPSILON = T'MACHINE_RADIX ** (1 - T'MODEL_MANTISSA). This attribute gives the absolute value of the difference between the model number 1.0 and the next higher model number of the type T.

- T'MODEL_SMALL = T'MACHINE_RADIX ** (T'MODEL_EMIN - 1). This attribute gives the value of the smallest positive (nonzero) model number of the type T.

The attributes T'MODEL_LARGE, T'MODEL_SMALL, and T'MODEL_EPSILON return results of the type universal_real. The attributes T'MODEL_MANTISSA, T'MODEL_EMAX, and T'MODEL_EMIN return results of the type universal_integer.

For a user-declared floating-point type T, T'DIGITS returns the precision specified in the floating_accuracy_definition of T; the same value is returned for any type derived from T or any subtype of T. (In Ada 9X, a floating_accuracy_definition is not allowed in a subtype declaration.) For a predefined type P, the value of P'DIGITS is the largest value of D for which

   ceiling(D * log(10)/log(P'MACHINE_RADIX) + 1) <= P'MODEL_MANTISSA.

The Ada 83 attributes T'MANTISSA, T'EMAX, T'LARGE, T'SMALL, T'EPSILON, T'SAFE_EMAX, T'SAFE_LARGE, and T'SAFE_SMALL are removed from the language, but for purposes of upward compatibility implementations are encouraged to retain them as implementation-defined attributes with the same values they had in Ada 83.
L.1.5. Accuracy of Floating-Point Operations

The accuracy requirements for the evaluation of certain predefined operations of floating-point types are stated as follows. Note: We present here a tentative version of the entire rewrite of RM 4.5.7 anticipated for Ada 9X.

This section does not cover the accuracy of an operation of a static expression that involves only the operators of the root numeric types; such operations must be evaluated exactly (see 4.9). (Operators of the root_real type behave in other contexts like operators of a floating-point type whose model numbers have a precision and maximum exponent at least as great as, and a minimum exponent at least as small as, those of any other floating-point type declared in STANDARD; see 3.5.6.) It also does not cover the accuracy of the predefined attributes of a floating-point subtype that yield a value of the type; such operations also yield exact results (see L.4 and elsewhere).

Finally, it should be noted that values outside the range T'FIRST .. T'LAST can be assigned to variables, passed to parameters, and returned from functions whose subtype T is an unconstrained numeric subtype (because range checking is no longer performed in those contexts when the subtype of the variable, formal parameter, or function is an unconstrained numeric subtype), and that fetching, in any context, the value denoted by a name or function_call whose subtype T is an unconstrained numeric subtype can, but need not, raise CONSTRAINT_ERROR when the value is outside the range T'FIRST .. T'LAST; thus, no special provision is made in this section for the possible raising of CONSTRAINT_ERROR when the value denoted by a name or a function_call is used as the operand of a predefined operation.

A model interval of a floating-point type is any interval whose bounds are model numbers of the type. The model interval of a type T associated with a value V is the smallest model interval of T that includes V.
(The model interval associated with a model number of a type consists of that number only.) An operand interval is the model interval, of the type specified for the operand of an operation, associated with the value of the operand. If the absolute value of either bound of a model interval of T exceeds T'MODEL_LARGE, the model interval is said to be out of bounds; otherwise, it is said to be in bounds.

For any predefined arithmetic operation that yields a result of a floating-point type T, the required bounds on the result are given by a model interval of T (called the ``result interval'') defined in terms of the operand values as follows: The result interval is the smallest model interval of T that includes the minimum and the maximum of all the values obtained by applying the (exact) mathematical operation to values arbitrarily selected from the respective operand intervals.

The result interval of an exponentiation is obtained by applying the above rule to the sequence of multiplications defined by the exponent, assuming arbitrary association of the factors, and to the final division in the case of a negative exponent.

The result interval of a conversion of a numeric value to a floating-point type T is the model interval of T associated with the operand value, except when the source expression has a fixed-point type with a small that is not a power of T'MACHINE_RADIX or is a fixed-point multiplication or division either of whose operands has a small that is not a power of T'MACHINE_RADIX; in these cases, the result interval is implementation defined.

Note: A conversion to a constrained subtype of a numeric type is a conversion to the type followed by a check that the result of the conversion belongs to the subtype, as in Ada 83.
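The model interval associated with a value can be computed directly from the canonical form. The sketch below (Python, not Ada; illustrative only) finds the two adjacent model numbers bracketing a value, using exact rationals; the default mantissa length of 53 is an assumption, and the MODEL_EMIN and out-of-bounds cases are deliberately omitted.

```python
from fractions import Fraction

def model_interval(v, mantissa=53, radix=2):
    """Smallest interval whose bounds are model numbers (normalized
    mantissa-digit fractions times radix**e) and which contains v.
    Exact rational arithmetic; exponent-range limits are omitted."""
    v = Fraction(v)
    if v == 0:
        return (v, v)
    x, e = abs(v), 0
    while x >= 1:                       # normalize abs(v) into [1/radix, 1)
        x /= radix
        e += 1
    while x < Fraction(1, radix):
        x *= radix
        e -= 1
    step = Fraction(radix) ** (e - mantissa)    # model-number spacing near v
    q = abs(v) / step
    lo = (q.numerator // q.denominator) * step  # model number just below abs(v)
    hi = lo if q.denominator == 1 else lo + step
    return (lo, hi) if v > 0 else (-hi, -lo)

print(model_interval(Fraction(1, 2)))               # degenerate: 1/2 is a model number
print(model_interval(Fraction(1, 10), mantissa=3))  # brackets 1/10 between 3/32 and 7/64
```

A degenerate interval for a model number, and a width-one-step interval otherwise, is exactly the behavior the definition above requires.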
For any of the foregoing operations, the implementation must deliver a value that belongs to the result interval when the result interval is in bounds; otherwise (i.e., when the result interval is out of bounds),

- if T'MACHINE_OVERFLOWS is TRUE, the implementation must either deliver a value that belongs to the result interval or raise CONSTRAINT_ERROR;
- if T'MACHINE_OVERFLOWS is FALSE, the result is implementation defined.

For any predefined relation on operands of a floating-point type T, the implementation may deliver any value (i.e., either TRUE or FALSE) obtained by applying the (exact) mathematical comparison to values arbitrarily chosen from the respective operand intervals. The result of a membership test is defined in terms of comparisons of the operand value with the lower and upper bounds of the given range or type mark (the usual rules apply to these comparisons).

L.1.6. Floating-Point Type Declarations

A floating-point type declaration of one of the two forms (that is, with or without the optional range specification indicated by the square brackets):

   type T is digits D [range L .. R];

is, by definition, equivalent to the following declarations:

   type floating_point_type is new P;
   subtype T is floating_point_type
      [range floating_point_type(L) .. floating_point_type(R)];

where floating_point_type is an anonymous type, and where P is a predefined floating-point type implicitly selected by the implementation so that it satisfies the following requirements:

- P'DIGITS >= D.
- If a range L .. R is specified, then P'MODEL_LARGE >= max(abs(L), abs(R)); otherwise, P'MODEL_LARGE >= 10.0 ** (4*D).

The floating-point type declaration is illegal if none of the predefined floating-point types available for implicit selection as a parent type in a floating-point type definition satisfies these requirements. Note: Implementations may provide other predefined numeric types that are not available for implicit selection in a numeric type definition.
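The parent-selection requirements just stated can be sketched as a simple predicate. The following is an illustrative Python check, not Ada; the "LONG_FLOAT" parameters used in the demonstration are an assumed mapping onto IEEE binary64, not something this specification mandates.

```python
def eligible_parent(p_digits, p_model_large, d, bound=None):
    """Check the requirements for a predefined type P to serve as the
    parent of 'type T is digits D [range L .. R]'; bound, when given,
    is max(abs(L), abs(R))."""
    if p_digits < d:
        return False                       # P'DIGITS >= D fails
    if bound is not None:                  # a range L .. R was specified
        return p_model_large >= bound      # P'MODEL_LARGE >= max(|L|, |R|)
    return p_model_large >= 10.0 ** (4 * d)

# Illustrative parameters for a notional LONG_FLOAT on IEEE binary64 (an
# assumption): DIGITS = 15 and, by the MODEL_LARGE formula of L.1.4,
# 2**1024 * (1 - 2**-53), rewritten below so the computation cannot overflow.
LONG_FLOAT_MODEL_LARGE = 2.0 ** 1023 * (2.0 - 2.0 ** -52)

print(eligible_parent(15, LONG_FLOAT_MODEL_LARGE, 15))              # True: 10**60 fits
print(eligible_parent(15, LONG_FLOAT_MODEL_LARGE, 6, bound=1.0e6))  # True
print(eligible_parent(6, 1.0e38, 10))                               # False: too few digits
```

The 10.0 ** (4*D) default in the no-range case is what makes SYSTEM.MAX_DIGITS (discussed next) a meaningful bound only in the absence of a range specification.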
The definition of the named number SYSTEM.MAX_DIGITS is changed slightly in Ada 9X. It now gives the maximum precision that can be requested in a floating-point type declaration in the absence of a range specification. Implementations may allow types with higher precisions to be declared, provided that their declarations include range specifications.

L.2. Fixed-Point Arithmetic

The language features for, and especially the model of, fixed-point arithmetic are simplified to facilitate their use and to foster wider implementation of the features. The concept of model numbers no longer applies to fixed-point types.

A special kind of fixed-point type, called a decimal fixed-point type, or simply a decimal type, is introduced by the Information Systems Annex (see IS:DECIMAL_FIXED_POINT). Throughout this section, unqualified references to fixed-point types apply to all fixed-point types, whether decimal or not. Fixed-point types that are not decimal types are referred to, when necessary, as ``ordinary fixed-point types.''

L.2.1. Fixed-Point Values and Attributes

The values of a fixed-point type are an infinite set of numbers, which are the integer multiples of the type's small. The values of a type derived from a fixed-point type are those of the parent type; the values of a subtype of a fixed-point type are those of its type that satisfy the subtype's range constraint. A fixed_accuracy_definition is no longer allowed in a subtype declaration.

For a fixed-point type T, T'MACHINE_RADIX (which was allowed only for floating-point types in Ada 83) yields the radix of the hardware representation of T. For ordinary fixed-point types, this attribute always yields 2. For decimal types, it yields the value (which may be either 2 or 10) specified for the type in an attribute definition clause for MACHINE_RADIX; it is implementation defined, but restricted to the same choices, in the absence of such a clause (see IS:INTERNAL_DECIMAL_REP).
(An attribute definition clause for MACHINE_RADIX is not allowed for ordinary fixed-point types.)

The Ada 83 attributes T'MACHINE_ROUNDS and T'MACHINE_OVERFLOWS are retained; the meaning of the latter is clarified (see L.2.3). T'FORE and T'AFT are also retained.

T'SMALL yields the absolute value of the difference between consecutive values of the type T; that is, it yields the value of the small of the type. If not specified in an attribute definition clause for SMALL, an ordinary fixed-point type's small is, by default, an implementation-defined power of two less than or equal to its delta. The small of a user-declared ordinary fixed-point type may be specified explicitly in an attribute definition clause; the value given must be less than or equal to the type's delta. The small of a decimal type (see IS:DECIMAL_FIXED_POINT) is always the same as its delta and is not explicitly specifiable. Implementations are required to support binary smalls (smalls that are powers of two); implementations claiming conformance to the Information Systems Annex (see Section IS:ALL) are, in addition, required to support decimal smalls (smalls that are powers of ten). Implementations are allowed, but not required, to support other smalls. For an arbitrary fixed-point subtype T, T'SMALL = T'BASE'SMALL.

For a user-declared fixed-point type T, T'DELTA returns the delta specified in the fixed_accuracy_definition of T; the same value is returned for any type derived from T and for any subtype of T. For a predefined fixed-point type P, the value of P'DELTA is the same as the value of P'SMALL.

The Ada 83 attributes T'MANTISSA, T'LARGE, T'SAFE_LARGE, and T'SAFE_SMALL are removed from the language, but for purposes of upward compatibility implementations are encouraged to retain them as implementation-defined attributes with the same values they had in Ada 83.

L.2.2. Fixed-Point Type Declarations

An ordinary fixed-point type declaration

   type T is delta D range L .. R;
   [for T'SMALL use S;]

where S (if specified) is less than or equal to D is, by definition, equivalent to the following declarations:

   type fixed_point_type is new P;
   subtype T is fixed_point_type
      range fixed_point_type(L) .. fixed_point_type(R);

where fixed_point_type is an anonymous type, and where P is a predefined fixed-point type implicitly selected by the implementation so that it satisfies the following requirements:

- if S is specified, then P'SMALL = S; otherwise, P'SMALL is an implementation-defined power of two less than or equal to D;
- if abs(R) is a power of two times P'SMALL, P'LAST >= R - P'SMALL; otherwise, P'LAST >= R;
- if abs(L) is a power of two times P'SMALL, P'FIRST <= L + P'SMALL; otherwise, P'FIRST <= L.

The fixed-point type declaration is illegal if none of the predefined fixed-point types available for implicit selection as a parent type in a fixed-point type definition satisfies these requirements. Note: Implementations may provide other predefined numeric types that are not available for implicit selection in a numeric type definition.

The range of the subtype T declared by the preceding fixed-point type declaration is determined as follows:

- T'LAST = min(P'LAST, fixed_point_type(R));
- T'FIRST = max(P'FIRST, fixed_point_type(L)).

The rules for the selection of the underlying predefined type used to represent a user-declared decimal type T1 are deducible from those applying to a particular ordinary fixed-point type T2 related to T1 (see IS:DECIMAL_FIXED_POINT).

With the elimination of the model numbers for fixed-point types, the definition of the named number SYSTEM.MAX_MANTISSA must be revised slightly. Informally, this measure is related to the maximum ``normalized'' magnitude of any value of a fixed-point type or subtype (more precisely, to the number of bits required to hold the maximum normalized magnitude).
An appropriate definition is the maximum value of

   ceiling(log2(max(abs(T'LAST), abs(T'FIRST)) / T'SMALL))

for any ordinary fixed-point type T. Also, the definition of the named number SYSTEM.FINE_DELTA is amended slightly to clarify that it applies only to ordinary fixed-point types.

L.2.3. Accuracy of Fixed-Point Operations

The accuracy requirements for the predefined fixed-point arithmetic operations and conversions, and the results of relations on fixed-point operands, are given below. This section does not cover the accuracy of an operation of a static expression that involves only the operators of the root numeric types; such operations must be evaluated exactly (see 4.9).

As in Ada 83, the operands of the fixed-point adding operators, absolute value, and comparisons must have identical types. These operations are required to yield exact results, since no implementation difficulties are posed by this requirement. Overflow considerations are discussed later.

Multiplications and divisions are allowed between operands of any two fixed-point types. Although this can be viewed as an operation that yields an infinitely precise result of a special type, followed by its conversion to the result type (see 4.5.5), for purposes of defining the accuracy rules we treat this instead as a single operation whose accuracy depends on three types (those of the operands and the result). In contrast to Ada 83, the result need not always be converted explicitly to some numeric type. Explicit conversion is not required when the surrounding context implies a unique type; implicit conversion takes place in that case. Explicit conversion is required when the context does not provide a unique result type. For decimal types, the attribute T'ROUND may be used to imply explicit conversion with rounding (see IS:ROUNDING_CONTROL).

When the result type is a floating-point type, the accuracy is as given in L.1.5.
For some combinations of the operand and result types in the remaining cases, the result is required to belong to a small set of values called the ``perfect result set''; for other combinations, it is required merely to belong to a generally larger and implementation-defined set of values called the ``close result set.'' When the result type is a decimal type, the perfect result set contains a single value; thus, operations on decimal types are always deterministic.

When one operand of a fixed-point multiplication or division is of type universal_real, a case allowed in Ada 9X but not allowed in Ada 83 (see 4.5.5), that operand is not implicitly converted in the usual sense, since the context does not determine a unique target type; rather, the accuracy of the result of the multiplication or division (i.e., whether the result must belong to the perfect result set or merely the close result set) depends on the value of the operand of type universal_real and on the types of the other operand and of the result. We need not consider here the multiplication or division of two such operands, since in that case either the operation is evaluated exactly (i.e., it is an operation of a static expression all of whose operators are of a root numeric type) or it is considered to be an operation of a floating-point type (see 3.5.6).
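The perfect result set for an ordinary fixed-point result type can be sketched concretely. The following is an illustrative Python rendering (not Ada) of the ordinary fixed-point case, using exact rationals to stand in for the infinitely precise intermediate result; the treatment of the MACHINE_ROUNDS cases follows the rules spelled out in this section.

```python
from fractions import Fraction

def perfect_result_set(v, small, machine_rounds):
    """Perfect result set for an ordinary fixed-point result type with
    the given small, when the exact mathematical result is v."""
    v, s = Fraction(v), Fraction(small)
    q = v / s
    if q.denominator == 1:                       # v is a multiple of small
        return {v}
    below = (q.numerator // q.denominator) * s   # multiple just below v
    above = below + s                            # multiple just above v
    if machine_rounds:
        if v - below < above - v:
            return {below}
        if above - v < v - below:
            return {above}
        return {below, above}                    # midway: both are nearest
    if v > 0:
        return {below}                           # truncation toward zero
    return {below, above}                        # otherwise: both neighbors

print(perfect_result_set(Fraction(1, 3), Fraction(1, 4), machine_rounds=True))
# the nearest multiple of 1/4 to 1/3, namely 1/4
print(perfect_result_set(Fraction(3, 8), Fraction(1, 4), machine_rounds=True))
# 3/8 is midway between 1/4 and 1/2, so both are permitted
```

Note how the set has more than one element only in the midway and non-rounding cases, which is the source of the (bounded) nondeterminism for ordinary fixed-point types; for decimal types the rules always pin down a single value.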
For a fixed-point multiplication or division whose (exact) mathematical result is V, and for the conversion of a value V to a fixed-point type, the ``perfect result set'' and ``close result set'' are defined as follows, where T is the result type:

- If T is an ordinary fixed-point type with a small of S,
  * if V is an integer multiple of S, then the perfect result set contains only the value V;
  * if T'MACHINE_ROUNDS is TRUE, the perfect result set contains only the nearest integer multiple of S (or the two nearest multiples, if V lies midway between two consecutive multiples);
  * if T'MACHINE_ROUNDS is FALSE and V is positive, the perfect result set contains the integer multiple of S in the direction toward zero;
  * otherwise, it contains the integer multiple of S just below V and the integer multiple of S just above V.
  The close result set is an implementation-defined set of consecutive integer multiples of S containing the perfect result set as a subset.

- If T is a decimal type with a small of S,
  * if V is an integer multiple of S, then the perfect result set contains only the value V;
  * otherwise, if truncation applies then it contains only the integer multiple of S in the direction toward zero, whereas if rounding applies then it contains only the nearest integer multiple of S (with ties broken by rounding away from zero).
  The close result set is an implementation-defined set of consecutive integer multiples of S containing the perfect result set as a subset. Note: As a consequence of subsequent rules, this case does not arise when the operand types are also decimal types.

- If T is an integer type,
  * if V is an integer, then the perfect result set contains only the value V;
  * otherwise, it contains the integer nearest to the value V (if V lies equally distant from two consecutive integers, the perfect result set contains both).
  The close result set is an implementation-defined set of consecutive integers containing the perfect result set as a subset.

The result of a fixed-point multiplication or division must belong either to the perfect result set or to the close result set, as described below, if overflow does not occur. (Overflow is discussed later.) In the following cases, if the result type is a fixed-point type, let S be its small; otherwise, i.e., when the result type is an integer type, let S be 1.0.

- For a multiplication or division neither of whose operands is of type universal_real, let L and R be the smalls of the left and right operands. For a multiplication, if (L * R) / S is an integer or the reciprocal of an integer (the smalls are said to be ``compatible'' in this case), the result must belong to the perfect result set; otherwise, it belongs to the close result set. For a division, if L / (R * S) is an integer or the reciprocal of an integer (i.e., the smalls are compatible), the result must belong to the perfect result set; otherwise, it belongs to the close result set. Note: When the operand and result types are all decimal types, their smalls are necessarily compatible; the same is true when they are all ordinary fixed-point types with binary smalls.

- For a multiplication or division having one universal_real operand with a value of V, note that it is always possible to factor V as an integer multiple of a ``compatible'' small, but the integer multiple may be ``too big.'' If the factorization allows an integer multiple less than some implementation-defined limit, the result must belong to the perfect result set; otherwise, it belongs to the close result set.

A multiplication P * Q of an operand of a fixed-point type F by an operand of an integer type I, or vice versa, and a division P / Q of an operand of a fixed-point type F by an operand of an integer type I, are also allowed, as in Ada 83.
In these cases, the result is of type F; explicit conversion of the result is never required. The accuracy required in these cases is the same as that required for a multiplication F(P * Q) or a division F(P / Q) obtained by interpreting the operand of the integer type to have a fixed-point type with a small of 1.0.

The accuracy of the result of a conversion from an integer or fixed-point type to a fixed-point type, or from a fixed-point type to an integer type, is the same as that of a fixed-point multiplication of the source value by a fixed-point operand having a small of 1.0 and a value of 1.0, as given by the foregoing rules. The result of a conversion from a floating-point type to a fixed-point type must belong to the close result set. The result of a conversion of a universal_real operand to a fixed-point type must belong to the perfect result set.

The possibility of overflow in the result of a predefined arithmetic operation or conversion yielding a result of a fixed-point type T is analogous to that for floating-point types. If all of the permitted results belong to the range T'BASE'FIRST .. T'BASE'LAST, then the implementation must deliver one of the permitted results; otherwise,

- if T'MACHINE_OVERFLOWS is TRUE, the implementation must either deliver one of the permitted results or raise CONSTRAINT_ERROR;
- if T'MACHINE_OVERFLOWS is FALSE, the result is implementation defined.

L.2.4. Other Fixed-Point Attributes

Because the model of fixed-point arithmetic is no longer expressed in terms of model numbers and model intervals, no attributes related to the Ada 83 model are required (except T'DELTA, T'SMALL, T'FIRST, and T'LAST). In particular, the attributes T'MANTISSA, T'LARGE, T'SAFE_SMALL, and T'SAFE_LARGE (of a fixed-point type T) are eliminated and not replaced by other attributes. T'FORE and T'AFT are retained because of their connection with I/O.
Elementary Functions

Implementations shall provide in the core a predefined generic package called GENERIC_ELEMENTARY_FUNCTIONS and an accompanying predefined package called ELEMENTARY_FUNCTIONS_EXCEPTIONS having the following specifications:

   package ELEMENTARY_FUNCTIONS_EXCEPTIONS is
      ARGUMENT_ERROR : exception;
   end ELEMENTARY_FUNCTIONS_EXCEPTIONS;

   with ELEMENTARY_FUNCTIONS_EXCEPTIONS;
   generic
      type FLOAT_TYPE is digits <>;
   package GENERIC_ELEMENTARY_FUNCTIONS is
      subtype FLOAT_BASE is FLOAT_TYPE'BASE;
      function SQRT    (X : FLOAT_BASE) return FLOAT_BASE;
      function LOG     (X : FLOAT_BASE) return FLOAT_BASE;
      function LOG     (X, BASE : FLOAT_BASE) return FLOAT_BASE;
      function EXP     (X : FLOAT_BASE) return FLOAT_BASE;
      function "**"    (X, Y : FLOAT_BASE) return FLOAT_BASE;
      function SIN     (X : FLOAT_BASE) return FLOAT_BASE;
      function SIN     (X, CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function COS     (X : FLOAT_BASE) return FLOAT_BASE;
      function COS     (X, CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function TAN     (X : FLOAT_BASE) return FLOAT_BASE;
      function TAN     (X, CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function COT     (X : FLOAT_BASE) return FLOAT_BASE;
      function COT     (X, CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function ARCSIN  (X : FLOAT_BASE) return FLOAT_BASE;
      function ARCSIN  (X, CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function ARCCOS  (X : FLOAT_BASE) return FLOAT_BASE;
      function ARCCOS  (X, CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function ARCTAN  (Y : FLOAT_BASE;
                        X : FLOAT_BASE := 1.0) return FLOAT_BASE;
      function ARCTAN  (Y : FLOAT_BASE;
                        X : FLOAT_BASE := 1.0;
                        CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function ARCCOT  (X : FLOAT_BASE;
                        Y : FLOAT_BASE := 1.0) return FLOAT_BASE;
      function ARCCOT  (X : FLOAT_BASE;
                        Y : FLOAT_BASE := 1.0;
                        CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function SINH    (X : FLOAT_BASE) return FLOAT_BASE;
      function COSH    (X : FLOAT_BASE) return FLOAT_BASE;
      function TANH    (X : FLOAT_BASE) return FLOAT_BASE;
      function COTH    (X : FLOAT_BASE) return FLOAT_BASE;
      function ARCSINH (X : FLOAT_BASE) return FLOAT_BASE;
      function ARCCOSH (X : FLOAT_BASE) return FLOAT_BASE;
      function ARCTANH (X : FLOAT_BASE) return FLOAT_BASE;
      function ARCCOTH (X : FLOAT_BASE) return FLOAT_BASE;
      ARGUMENT_ERROR : exception
         renames ELEMENTARY_FUNCTIONS_EXCEPTIONS.ARGUMENT_ERROR;
   end GENERIC_ELEMENTARY_FUNCTIONS;

The specifications above are identical to the proposed separate ISO standard (DIS 11430) for the elementary functions (for Ada 83) except that the formal parameters and results of the elementary functions are of the unconstrained (base) subtype of the generic formal type, rather than the actual subtype.

It is intended that implementations of GENERIC_ELEMENTARY_FUNCTIONS conform to the various semantic requirements (regarding domains, ranges, exception handling, accuracy, prescribed results, etc.) presented in DIS 11430 and not repeated here, except that implementations must allow GENERIC_ELEMENTARY_FUNCTIONS to be instantiated with a range-constrained floating-point subtype, and the body must be immune to the potential effects of the range constraint; in other words, implementations are not allowed to impose a restriction (allowed by DIS 11430) that the generic actual subtype in an instantiation must be an unconstrained floating-point subtype.

In Ada 9X, the accuracy requirements are expressed in terms of FLOAT_TYPE'MODEL_EPSILON, since MODEL_EPSILON of a subtype is now that of its type. In DIS 11430, the accuracy requirements are expressed in terms of FLOAT_TYPE'BASE'EPSILON.

The ARCTAN and ARCCOT functions must exploit signed zeros, if present in the implementation (as indicated by the value of FLOAT_TYPE'SIGNED_ZEROS). In particular, when X is negative and Y is zero:

- if FLOAT_TYPE'SIGNED_ZEROS is TRUE, ARCTAN(Y, X, CYCLE) and ARCCOT(X, Y, CYCLE) must deliver -CYCLE/2.0 when Y is a negative zero and +CYCLE/2.0 when Y is a positive zero;

- if FLOAT_TYPE'SIGNED_ZEROS is FALSE, ARCTAN(Y, X, CYCLE) and ARCCOT(X, Y, CYCLE) deliver CYCLE/2.0.
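This branch selection by the sign of a zero is exactly the behavior of IEEE-style two-argument arctangents in other environments. As an illustration only (Python's math.atan2, in radians, so CYCLE corresponds to 2*pi; the cycle-relative helper is hypothetical):

```python
import math

# On a SIGNED_ZEROS machine, the sign of a zero Y selects the branch
# of the result when X is negative:
assert math.atan2(+0.0, -1.0) == math.pi    # corresponds to +CYCLE/2.0
assert math.atan2(-0.0, -1.0) == -math.pi   # corresponds to -CYCLE/2.0

# A version with a CYCLE parameter is a rescaling of the radian result:
def arctan_cycle(y, x, cycle):
    return math.atan2(y, x) * cycle / (2.0 * math.pi)

assert math.isclose(arctan_cycle(+0.0, -1.0, 360.0), 180.0)
assert math.isclose(arctan_cycle(-0.0, -1.0, 360.0), -180.0)
```

On an implementation without signed zeros, both calls would deliver the positive half-cycle, as the second bullet above prescribes.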
The behavior of the versions of ARCTAN and ARCCOT without a CYCLE parameter is similar in the above case (i.e., when X is negative and Y is zero), except that the result is then an appropriate approximation of plus or minus pi. In addition, the zero delivered by SIN, ARCSIN, SINH, ARCSINH, TAN, TANH, and ARCTANH when X is zero must have the same sign as X when FLOAT_TYPE'SIGNED_ZEROS is TRUE; similarly, the zero delivered by ARCTAN when X is positive and Y is zero must have the same sign as Y when FLOAT_TYPE'SIGNED_ZEROS is TRUE. (This requirement goes beyond DIS 11430, which did not specify the sign of the result in these cases.) The extent of the exploitation of signed zeros is left implementation defined in the many other contexts in which an elementary function can return a zero result.

L.4. Primitive Functions

Implementations shall provide in the core the following additional attributes:

   T'EXPONENT(X)
   T'FRACTION(X)
   T'DECOMPOSE(X, FRACTION, EXPONENT)
   T'COMPOSE(FRACTION, EXPONENT)
   T'SCALE(X, ADJUSTMENT)
   T'FLOOR(X)
   T'CEILING(X)
   T'ROUNDING(X)
   T'TRUNCATION(X)
   T'REMAINDER(X, Y)
   T'ADJACENT(X, TOWARDS)
   T'COPY_SIGN(VALUE, SIGN)
   T'LEADING_PART(X, RADIX_DIGITS)
   T'MIN(X, Y)
   T'MAX(X, Y)
   T'MODEL(X)
   T'MACHINE(X)

In the case of MIN and MAX, the prefix may denote any scalar type or subtype; for the other attributes, the prefix must denote a floating-point type or subtype. Implementations conforming to the Numerics Annex shall also extend the attributes T'SUCC(X) and T'PRED(X) to apply when T is a floating-point type or subtype. All of the above attributes except MIN, MAX, MODEL, and MACHINE correspond directly to subprograms in the GENERIC_PRIMITIVE_FUNCTIONS generic package proposed as a separate ISO standard (CD 11729) for Ada 83.
The ROUNDING and TRUNCATION attributes correspond to the ROUND and TRUNCATE functions in GENERIC_PRIMITIVE_FUNCTIONS; the latter names are proposed in the Information Systems Annex for entirely different attributes (see IS:ROUNDING_CONTROL). The SUCCESSOR and PREDECESSOR functions of GENERIC_PRIMITIVE_FUNCTIONS are provided by the extension of the existing SUCC and PRED attributes. MIN, MAX, MODEL, and MACHINE are new (not taken from GENERIC_PRIMITIVE_FUNCTIONS).

The type of the result yielded by all of the ``primitive function'' attributes except EXPONENT and DECOMPOSE is the type containing T; EXPONENT yields a result of type universal_integer, whereas DECOMPOSE is a procedure and thus yields no value (except through its second and third parameters, which are of mode out). The type of actual parameters corresponding to X, Y, FRACTION, TOWARDS, VALUE, and SIGN must be the type containing T. Actual parameters corresponding to EXPONENT, ADJUSTMENT, and RADIX_DIGITS may be of any integer type (i.e., the formal parameter has type universal_integer). The second and third actual parameters of DECOMPOSE must follow the usual rules for the association of actual parameters with formal parameters of mode out. The value of an actual parameter corresponding to RADIX_DIGITS must be positive. All the attributes that return values preserve staticness.

These attributes deliver results that are accurate to the level of machine numbers. Like T'FIRST and T'LAST, which also must deliver fully accurate results, they are not among the predefined operations covered by the replacement for RM 4.5.7 (see L.1.5).

Note: A decision has not yet been made about whether extra accuracy can be passed in to a primitive function, and whether that implies that the extra accuracy must be maintained during the operation and must affect the result.
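Most of these attributes have close counterparts in other languages' math libraries. The following Python sketch is an analogy only (it uses Python's normalization conventions, not the Ada definitions), illustrating the EXPONENT/FRACTION decomposition, COMPOSE/SCALE, COPY_SIGN, and the IEEE-style REMAINDER:

```python
import math

x = 6.5

# Analogue of T'DECOMPOSE(X, FRACTION, EXPONENT): split x into a
# fraction and a power-of-radix exponent (fraction in [0.5, 1.0) here)
fraction, exponent = math.frexp(x)
assert fraction == 0.8125 and exponent == 3

# Analogues of T'COMPOSE(FRACTION, EXPONENT) and T'SCALE(X, ADJUSTMENT):
# rebuild the value, or rescale it by a power of the radix
assert math.ldexp(fraction, exponent) == 6.5
assert math.ldexp(x, -1) == 3.25

# Analogue of T'COPY_SIGN(VALUE, SIGN): transfers the sign, including
# the sign of a zero, which an ordinary comparison cannot observe
assert math.copysign(2.0, -0.0) == -2.0

# Analogue of T'REMAINDER(X, Y): remainder relative to the nearest
# integer multiple of Y, so the result lies in [-Y/2, Y/2]
assert math.remainder(6.5, 4.0) == -1.5
```

All of these are exact operations on machine numbers, which is consistent with the machine-level accuracy the annex requires of the primitive functions.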
It is anticipated that these attributes will be defined as were the corresponding subprograms in GENERIC_PRIMITIVE_FUNCTIONS, subject to modifications when a decision is made on the role of extra precision. Their definitions are not repeated here. The definitions depend, in some cases, on the presence or absence of denormalized numbers and signed zeros, as reflected in the values of T'DENORM and T'SIGNED_ZEROS, respectively. The other attributes are defined below.

T'MACHINE(X) returns the value of X rounded or truncated to a neighboring machine number (see L.1.1) of the type T; i.e., extra precision beyond T'MACHINE_MANTISSA radix digits is discarded, and CONSTRAINT_ERROR may be raised if the value of X is sufficiently outside the range T'BASE'FIRST .. T'BASE'LAST that rounding or truncating it to the precision of the machine numbers cannot yield a result in this range (i.e., cannot yield the appropriate bound of this range).

T'MODEL(X) is defined as follows:

- if X is a model number of the type T (see L.1.3) in the range -T'MODEL_LARGE .. T'MODEL_LARGE, X is returned;

- if X lies between two consecutive model numbers of the type T in that range, one of those surrounding model numbers is returned; and

- if X lies outside that range, CONSTRAINT_ERROR is raised.

T'MIN(X, Y) and T'MAX(X, Y) return the minimum and the maximum of their two arguments, respectively.

L.5.
Complex Arithmetic

Implementations conforming to the Numerics Annex shall provide a predefined generic package called GENERIC_COMPLEX_TYPES having the following specification:

   with ELEMENTARY_FUNCTIONS_EXCEPTIONS;  -- see L.3
   generic
      type FLOAT_TYPE is digits <>;
   package GENERIC_COMPLEX_TYPES is
      subtype FLOAT_BASE is FLOAT_TYPE'BASE;
      type COMPLEX is
         record
            RE, IM : FLOAT_BASE;
         end record;
      function REAL_PART (X : COMPLEX) return FLOAT_BASE;
      function IMAG_PART (X : COMPLEX) return FLOAT_BASE;
      procedure SET_REAL_PART (X : in out COMPLEX;
                               REAL_PART : in FLOAT_BASE);
      procedure SET_IMAG_PART (X : in out COMPLEX;
                               IMAG_PART : in FLOAT_BASE);
      function COMPOSE_FROM_CARTESIAN (REAL_PART : FLOAT_BASE)
         return COMPLEX;
      function COMPOSE_FROM_CARTESIAN (REAL_PART : FLOAT_BASE;
                                       IMAG_PART : FLOAT_BASE)
         return COMPLEX;
      function MODULUS (X : COMPLEX) return FLOAT_BASE;
      function "abs" (X : COMPLEX) return FLOAT_BASE
         renames MODULUS;
      function ARGUMENT (X : COMPLEX) return FLOAT_BASE;
      function ARGUMENT (X : COMPLEX;
                         CYCLE : FLOAT_BASE) return FLOAT_BASE;
      function COMPOSE_FROM_POLAR (MODULUS : FLOAT_BASE;
                                   ARGUMENT : FLOAT_BASE)
         return COMPLEX;
      function COMPOSE_FROM_POLAR (MODULUS : FLOAT_BASE;
                                   ARGUMENT : FLOAT_BASE;
                                   CYCLE : FLOAT_BASE)
         return COMPLEX;
      function "+" (X : COMPLEX) return COMPLEX;
      function "-" (X : COMPLEX) return COMPLEX;
      function CONJUGATE (X : COMPLEX) return COMPLEX;
      function "+" (X, Y : COMPLEX) return COMPLEX;
      function "+" (X : FLOAT_BASE;
                    Y : COMPLEX) return COMPLEX;
      function "+" (X : COMPLEX;
                    Y : FLOAT_BASE) return COMPLEX;
      function "-" (X, Y : COMPLEX) return COMPLEX;
      function "-" (X : FLOAT_BASE;
                    Y : COMPLEX) return COMPLEX;
      function "-" (X : COMPLEX;
                    Y : FLOAT_BASE) return COMPLEX;
      function "*" (X, Y : COMPLEX) return COMPLEX;
      function "*" (X : FLOAT_BASE;
                    Y : COMPLEX) return COMPLEX;
      function "*" (X : COMPLEX;
                    Y : FLOAT_BASE) return COMPLEX;
      function "/"
                   (X, Y : COMPLEX) return COMPLEX;
      function "/" (X : FLOAT_BASE;
                    Y : COMPLEX) return COMPLEX;
      function "/" (X : COMPLEX;
                    Y : FLOAT_BASE) return COMPLEX;
      function "**" (X : COMPLEX;
                     N : INTEGER) return COMPLEX;
      ARGUMENT_ERROR : exception
         renames ELEMENTARY_FUNCTIONS_EXCEPTIONS.ARGUMENT_ERROR;
   end GENERIC_COMPLEX_TYPES;

The specification above is an adaptation of the standard under development by the WG9 NRG (to be proposed as a separate ISO standard for Ada 83); it differs in the following ways:

- the components of the COMPLEX type (and some parameters and results of the subprograms) are of the unconstrained (base) subtype of the generic formal type, rather than the actual subtype;

- neither vectors and matrices of COMPLEX components nor operations on them are included.

It is intended that implementations of GENERIC_COMPLEX_TYPES conform to the various semantic requirements (regarding domains, ranges, exception handling, prescribed results, etc.) presented in the separate standard under development and not repeated here, except that implementations must allow GENERIC_COMPLEX_TYPES to be instantiated with a range-constrained floating-point subtype, and the body must be immune to the potential effects of the range constraint; in other words, implementations are not allowed to impose a restriction (allowed by the separate standard under development) that the generic actual subtype in an instantiation must be an unconstrained floating-point subtype.
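As an informal illustration of the polar subprograms (Python used as a neutral notation; the CYCLE parameter is modeled as a rescaling of a full turn of 2*pi, and the helper names are hypothetical):

```python
import cmath
import math

def argument(x, cycle=2.0 * math.pi):
    # Analogue of ARGUMENT(X, CYCLE): the angle of x, rescaled so that
    # one full turn equals CYCLE instead of 2*pi
    return cmath.phase(x) * cycle / (2.0 * math.pi)

def compose_from_polar(modulus, argument, cycle=2.0 * math.pi):
    # Analogue of COMPOSE_FROM_POLAR(MODULUS, ARGUMENT, CYCLE)
    return cmath.rect(modulus, argument * 2.0 * math.pi / cycle)

# A quarter turn with CYCLE = 360.0 is an argument of 90.0:
z = compose_from_polar(2.0, 90.0, 360.0)
assert abs(z - 2.0j) < 1e-14
assert math.isclose(argument(2.0j, 360.0), 90.0)

# MODULUS is the renamed "abs" of the record's Cartesian components:
assert math.isclose(abs(1.0 + 1.0j), math.sqrt(2.0))
```

The degree-style CYCLE of 360.0 shown here is only one use; a CYCLE of 1.0 gives arguments in fractions of a revolution.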
Implementations conforming to the Numerics Annex are also required to provide standard instantiations of GENERIC_COMPLEX_TYPES, i.e., instantiations having standardized names, for all predefined floating-point types, as follows:

- an instantiation called COMPLEX_TYPES, with FLOAT_TYPE => FLOAT, must always be provided;

- if SHORT_FLOAT is defined in STANDARD, an instantiation called SHORT_COMPLEX_TYPES, with FLOAT_TYPE => SHORT_FLOAT, must be provided (note that UI-0048 requires that SHORT_FLOAT be used for a type having 'SIZE near 32 bits);

- if LONG_FLOAT is defined in STANDARD, an instantiation called LONG_COMPLEX_TYPES, with FLOAT_TYPE => LONG_FLOAT, must be provided (note that UI-0048 requires that LONG_FLOAT be used for a type having 'SIZE near 64 bits);

- instantiations having the obvious names must be provided for any shorter or longer floating-point formats defined in STANDARD (e.g., if LONG_LONG_FLOAT is defined there, then an instantiation called LONG_LONG_COMPLEX_TYPES, with FLOAT_TYPE => LONG_LONG_FLOAT, must be provided, etc.).

In Ada 9X, the accuracy requirements are expressed in terms of FLOAT_TYPE'MODEL_EPSILON, since MODEL_EPSILON of a subtype is now that of its type. In the separate standard under development, the accuracy requirements are expressed in terms of FLOAT_TYPE'BASE'EPSILON.

The ARGUMENT function must exploit signed zeros, if present in the implementation (as indicated by the value of FLOAT_TYPE'SIGNED_ZEROS). In particular, when X.RE is negative and X.IM is zero:

- if FLOAT_TYPE'SIGNED_ZEROS is TRUE, ARGUMENT(X, CYCLE) must deliver -CYCLE/2.0 when X.IM is a negative zero and +CYCLE/2.0 when X.IM is a positive zero;

- if FLOAT_TYPE'SIGNED_ZEROS is FALSE, ARGUMENT(X, CYCLE) delivers CYCLE/2.0.
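This is the same branch-cut rule as for ARCTAN, and it can be observed directly with Python's cmath.phase (radians, i.e. a CYCLE of 2*pi; an illustration only, not the Ada definition):

```python
import cmath

# When the real part is negative, the sign of a zero imaginary part
# selects the branch of the argument:
assert cmath.phase(complex(-1.0, +0.0)) == cmath.pi    # +CYCLE/2.0
assert cmath.phase(complex(-1.0, -0.0)) == -cmath.pi   # -CYCLE/2.0
```

Without signed zeros, only the positive half-cycle result is available, as the second bullet above prescribes.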
The behavior of the version of ARGUMENT without a CYCLE parameter is similar in the above case (i.e., when X.RE is negative and X.IM is zero), except that the result is then an appropriate approximation of plus or minus pi.

In addition, when FLOAT_TYPE'SIGNED_ZEROS is TRUE, the zero delivered by ARGUMENT when X.RE is positive and X.IM is zero must have the same sign as X.IM; similarly, the zero imaginary part of the result delivered by COMPOSE_FROM_POLAR(MODULUS, ARGUMENT, CYCLE) when ARGUMENT = +/-CYCLE/2.0, or by COMPOSE_FROM_POLAR (with or without a CYCLE parameter) when ARGUMENT is zero, must have the same sign as ARGUMENT when MODULUS is positive, and the opposite sign when MODULUS is negative. (These requirements go beyond the separate standard under development.)

Implementations conforming to the Numerics Annex shall also provide a predefined generic package called GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS having the following specification:

   with ELEMENTARY_FUNCTIONS_EXCEPTIONS;  -- see L.3
   with GENERIC_COMPLEX_TYPES;
   generic
      with package COMPLEX_TYPES is
         new GENERIC_COMPLEX_TYPES(<>);
   package GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS is
      use COMPLEX_TYPES;
      function SQRT (X : COMPLEX) return COMPLEX;
      function LOG  (X : COMPLEX) return COMPLEX;
      function EXP  (X : COMPLEX) return COMPLEX;
      function "**" (X, Y : COMPLEX) return COMPLEX;
      function "**" (X : COMPLEX;
                     Y : FLOAT_BASE) return COMPLEX;
      function "**" (X : FLOAT_BASE;
                     Y : COMPLEX) return COMPLEX;
      function SIN (X : COMPLEX) return COMPLEX;
      function COS (X : COMPLEX) return COMPLEX;
      function TAN (X : COMPLEX) return COMPLEX;
      function COT (X : COMPLEX) return COMPLEX;
      function ARCSIN (X : COMPLEX) return COMPLEX;
      function ARCCOS (X : COMPLEX) return COMPLEX;
      function ARCTAN (X : COMPLEX) return COMPLEX;
      function ARCCOT (X : COMPLEX) return COMPLEX;
      function SINH (X : COMPLEX) return COMPLEX;
      function COSH (X : COMPLEX) return COMPLEX;
      function TANH (X : COMPLEX) return COMPLEX;
      function COTH (X : COMPLEX) return COMPLEX;
      function ARCSINH (X : COMPLEX) return COMPLEX;
      function ARCCOSH (X : COMPLEX) return COMPLEX;
      function ARCTANH (X : COMPLEX) return COMPLEX;
      function ARCCOTH (X : COMPLEX) return COMPLEX;
      ARGUMENT_ERROR : exception
         renames ELEMENTARY_FUNCTIONS_EXCEPTIONS.ARGUMENT_ERROR;
   end GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS;

The specification above is identical to the standard under development by the WG9 NRG (to be proposed as a separate ISO standard for Ada 83), except that instead of importing the complex type as a private type, together with a minimally sufficient set of operations on the type, the generic package imports an instantiation of GENERIC_COMPLEX_TYPES, which provides all the necessary types and operations.

It is intended that implementations of GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS conform to the various semantic requirements (regarding domains, ranges, principal values, branch cuts, exception handling, prescribed results, etc.) presented in the separate standard under development and not repeated here. However, implementations are not required to satisfy the accuracy requirements contained therein, because those requirements are incomplete and currently in a state of flux. Accuracy requirements will be included only if ongoing research provides usable results in the appropriate time frame. The treatment of signed zeros in GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS is also still under development.

L.6. Interface to Fortran

Implementations conforming to the Numerics Annex shall support a language name of FORTRAN in the IMPORT, EXPORT, and LANGUAGE pragmas (see 13.9), if Fortran (meaning either Fortran 77 or Fortran 90) is available in the same environment.
If more than one implementation of Fortran is available, and they have different calling conventions or data layout conventions, then a language name identifying the implementation shall be provided for each.

In addition, Numerics Annex conforming implementations shall provide a predefined package called FORTRAN_TYPES if Fortran is available in the same environment. The package must at a minimum have the following contents:

   with GENERIC_COMPLEX_TYPES;  -- see L.5
   package FORTRAN_TYPES is
      type INTEGER is implementation-defined;
      type REAL is implementation-defined;
      type DOUBLE_PRECISION is implementation-defined;
      package SINGLE_PRECISION_COMPLEX_TYPES is
         new GENERIC_COMPLEX_TYPES (REAL);
      type COMPLEX is
         new SINGLE_PRECISION_COMPLEX_TYPES.COMPLEX;
      type LOGICAL is implementation-defined;
      subtype CHARACTER_SET is implementation-defined;
      type CHARACTER is array (POSITIVE range <>) of
         CHARACTER_SET;
      pragma PACK(CHARACTER);
   end FORTRAN_TYPES;

The types INTEGER, REAL, DOUBLE_PRECISION, COMPLEX, LOGICAL, and CHARACTER must have representations matching those of the default representations of the corresponding Fortran scalar types. (Implementations may include pragmas, as appropriate, to accomplish this.) CHARACTER_SET must be a character subtype whose values correspond to the character set used by the Fortran implementation for the default representation of character strings. The package also must contain declarations of certain other types, as discussed below; must provide a minimum set of conversion operations, as discussed below; and optionally may provide other operations.

If more than one implementation of Fortran is available in the environment, a package analogous to FORTRAN_TYPES shall be provided for each, with the package name identifying (in an implementation-defined manner) the specific implementation of Fortran.
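The representation matching that FORTRAN_TYPES requires is the same discipline other languages apply when binding to Fortran. As a rough analogy in Python's ctypes, assuming the common case in which default Fortran REAL maps to a C float and default LOGICAL to a full integer word (an assumption about a typical Fortran implementation, not a requirement of the annex):

```python
import ctypes

class FortranComplex(ctypes.Structure):
    # Fortran COMPLEX is two default REALs, real part first --
    # the same layout as the RE, IM record components of the Ada type
    _fields_ = [("re", ctypes.c_float), ("im", ctypes.c_float)]

# The record occupies exactly its two components (no padding here)
assert ctypes.sizeof(FortranComplex) == 2 * ctypes.sizeof(ctypes.c_float)

# Fortran LOGICAL is typically a full integer word, not a single byte,
# which is why an Ada binding may have to derive LOGICAL from an
# integer type and supply explicit conversions to and from BOOLEAN:
FortranLogical = ctypes.c_int

def to_boolean(l):
    return l != 0

assert to_boolean(1) and not to_boolean(0)
```

An Ada implementation would achieve the same effect with representation pragmas on the FORTRAN_TYPES declarations rather than with a foreign-layout library.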
Collectively, these packages are referred to below as ``versions of FORTRAN_TYPES.''

A version of FORTRAN_TYPES for an implementation of Fortran that provides types INTEGER*n, REAL*n, LOGICAL*n, and COMPLEX*n shall also define corresponding types INTEGER_n, REAL_n, LOGICAL_n, and COMPLEX_n. (The Fortran type CHARACTER*n corresponds to the Ada constrained array subtype CHARACTER(1..n).) A version of FORTRAN_TYPES for an implementation of Fortran 90 that provides multiple ``kinds'' of the intrinsic types, e.g. INTEGER (KIND=n), REAL (KIND=n), LOGICAL (KIND=n), COMPLEX (KIND=n), and CHARACTER (KIND=n), shall also define corresponding types INTEGER_KIND_n, REAL_KIND_n, LOGICAL_KIND_n, COMPLEX_KIND_n, and CHARACTER_KIND_n.

At a minimum, FORTRAN_TYPES must support conversion, using either normal type conversion or explicitly declared conversion functions, between:

- INTEGER and an appropriate predefined integer type of Ada;

- REAL and an appropriate predefined floating-point type of Ada;

- DOUBLE_PRECISION and an appropriate predefined floating-point type of Ada;

- LOGICAL and the predefined BOOLEAN type of Ada; and

- CHARACTER and the predefined STRING type of Ada.

A user who requires an Ada complex type that is compatible with a complex type defined by FORTRAN_TYPES should derive from the latter type (or use it directly).

To the extent possible, the types defined by FORTRAN_TYPES should be derived from an appropriate Ada type. This provides for free not only the conversion operations, but other useful operations as well. In some implementations, LOGICAL may have to be derived from an integer type instead of from BOOLEAN, and the conversion operations between LOGICAL and BOOLEAN will then have to be provided explicitly. If explicit conversion functions are provided, their names are implementation defined.

L.7.
Random Number Generators

In the next revision, a facility for the generation of pseudo-random numbers will be proposed in this section. The package or generic package to be proposed will be a part of the ``Ada Standard Library'' and, as such, must be provided by all implementations of Ada.

S. Rationale

S.L. Numerics Annex (Rationale)

This section provides the rationale for language features proposed in the Numerics Annex of the Mapping Specification. These language features include models of floating-point and fixed-point arithmetic, a predefined generic package of elementary functions, a predefined generic package providing a complex type and associated operations, a predefined generic package of complex elementary functions, facilities for interfacing to Fortran, and attributes comprising a collection of ``primitive'' floating-point manipulation functions. In addition, a placeholder for a facility for the generation of pseudo-random numbers is included. The models of floating-point and fixed-point arithmetic provide the semantics of operations of the real types and are placed in the Numerics Annex for presentation purposes only; all implementations must conform to the models. The elementary functions are currently described in the Numerics Annex but are relevant to such a wide variety of applications that they will be moved to a chapter of the core on ``Ada Standard Libraries.'' The primitive functions were once thought to be germane only to the development of mathematical software libraries, but the realization that they have uses in the implementation of I/O formatting and other applications has resulted in a desire to move them to the core. The pseudo-random number facility also serves a wide variety of applications, not all of them numerically intensive, and it too will be described in the Ada Standard Libraries chapter of the core.
The features that need be provided only by implementations conforming to the Numerics Annex include the generic package of complex types and operations, the generic package of complex elementary functions, and the Fortran interface facilities.

The treatment of numerics is simplified by

- retaining, for floating-point types, only one of the concepts of model numbers and safe numbers;

- eliminating, for fixed-point types, both the model numbers and the safe numbers;

- eliminating attributes that are no longer needed.

The simplification provides the following direct benefits:

- real types become somewhat easier to describe;

- at least one common misapprehension (that the safe numbers and model numbers of a floating-point type differ only in range, i.e., that they have the same precision) loses its basis;

- fixed-point types become more intuitive, with no loss of functionality.

Conceptually, for floating-point types it is the model numbers that are being eliminated and the safe numbers that are being kept. However, in the process of doing so, the properties of the latter are being changed slightly, and they are being called model numbers instead of safe numbers. One may prefer to think that the model numbers have been retained (with some changes) and the safe numbers eliminated, but the surviving concept is much closer to that of Ada 83's safe numbers. The name change is motivated by the broad connection of the resulting concept to the semantics of floating-point arithmetic, in contrast to the much more limited connotations of ``safe numbers.'' If, with the advent of Ada 9X, one only talks about ``model numbers'' in the context of their definition in Ada 9X, no confusion should arise.
The changes in the surviving concept provide these secondary benefits:

- the model of floating-point arithmetic becomes more useful to numerical analysts because, as a descriptive tool, it reflects the properties of the underlying hardware more closely;

- the ``4*B Rule'' is recast in a way that does not penalize the properties of any predefined type;

- implementations of floating point on decimal hardware become practical;

- a few anomalies are eliminated.

In general, the changes will have little impact on implementations; in particular, currently generated floating-point code should, in the main, remain valid.

S.L.1. Semantics of Floating-Point Arithmetic

Floating-point semantics in Ada 83 are tied to the concepts of model numbers and safe numbers. Effectively, the safe numbers define, for a given implementation, the accuracy required of the predefined arithmetic operators and the conditions under which overflow is and is not possible. Numerical analysts have used characteristics of the safe numbers to make claims about the actual performance of their programs in the underlying environment, and they have used attributes of the safe numbers to tailor the behavior of their programs to the numerical properties of the underlying environment. The model numbers, in contrast, can be said to represent the worst-case properties of the safe numbers over all conceivable conforming implementations, and therefore the worst acceptable numerical performance of an Ada 83 program. Numerical analysts have generally not exploited the model numbers or their attributes for any purpose, because they prefer to focus on the actual performance of a program in the underlying environment. The attributes of the safe numbers permit one to reason about that performance in a uniform, symbolic way over all implementations. Since the model numbers of Ada 83 have generally not been put to practical use, they are eliminated in Ada 9X.
The concept of safe numbers as the determinant of the actual numeric quality in the underlying environment survives, but in their incarnation in Ada 9X the former Ada 83 safe numbers are called model numbers. At the same time, their definition has been modified slightly to allow them to fit more closely to the actual numeric characteristics of the environment, making them more useful for the purpose for which they were intended. In their new role, they correspond exactly to what Brown in [Brown 81], which is the basis of Ada's model of floating-point arithmetic, called model numbers. The changes in the floating-point model are in line with those addressed by Study Topic S11.1-B(1).

S.L.1.1. Floating-Point Machine Numbers

Ada 83 includes a characterization of the underlying machine representation of a floating-point type, based on an interpretation of the canonical form [RM 3.5.7(4)] in which the constraints on mantissa, radix, and exponent are those dictated by certain representation attributes [RM 13.7.3(5-9)]. This amounts to the definition of a set of numbers which we are calling, in Ada 9X, the machine numbers of a floating-point type. We define the machine numbers of a type T to be those capable of being represented to full accuracy in the storage representation of T. Some machines have ``extended registers'' with a wider range and greater precision than the corresponding (or, sometimes, any) storage format. Thus, in the course of computation with a type T, values having wider range or greater precision than the machine numbers of T can be generated, as allowed by the model of floating-point arithmetic. They can also be assigned to variables of T in Ada 9X (if T is an unconstrained floating-point subtype), since a variable may be temporarily, or even permanently, held in a register. There is no guarantee, however, that such extended range or precision can be exploited, and consequently no attempt is made to characterize it.
Note: In this connection, Ada 83 allows values of extended precision, but not extended range, to be assigned to variables. The benefits of keeping variables in extended registers are partially negated in Ada 83 by the need to perform range checks on assignment, even when the variable is of an unconstrained floating-point subtype (having, therefore, an implementation-defined range). These benefits are fully realizable in Ada 9X due to the fact that range checks are no longer performed on assignment to a variable of an unconstrained numeric subtype, on the passing of an argument to a formal parameter of such a subtype, and on returning a value of such a subtype from a function. Overflow may still be detected in such contexts, i.e., when the expression on the right-hand side of an assignment statement, in an actual parameter, or in a return statement performs an operation whose result exceeds the hardware's overflow threshold, but that is a separate semantic issue. This is discussed further in S.L.1.5.

Consideration was given to eliminating the characterization of machine numbers and retaining only that of the model numbers, thereby simplifying the discussion of floating-point matters even further. However, the characteristics of the machine numbers (that is, the storage representation of a floating-point type) are needed to define the meaning of certain attributes, viz. the ``primitive functions'' (see S.L.4), as well as the meaning of UNCHECKED_CONVERSION when its source type is a floating-point type. In addition, occasionally it is appropriate to design a numerical algorithm so as to exploit the characteristics of the machine representation as much as possible, even though in some contexts the hardware might not allow the full benefit of such an attempt to be achieved.

S.L.1.2.
Attributes of Floating-Point Machine Numbers

The Ada 83 representation attributes of floating-point types (T'MACHINE_EMIN, T'MACHINE_EMAX, T'MACHINE_MANTISSA, and T'MACHINE_RADIX, which return values of type universal_integer, and the Boolean-valued T'MACHINE_ROUNDS and T'MACHINE_OVERFLOWS) have been retained in Ada 9X, and two new Boolean-valued attributes (T'DENORM and T'SIGNED_ZEROS) have been defined. It has never been particularly clear whether and how the Ada 83 representation attributes accommodate denormalized numbers, if the implementation happens to have them. This situation is improved in Ada 9X in two ways. Implementations that generate and use denormalized numbers for a floating-point type T, as defined in [IEEE 85], will be distinguished by having T'DENORM = TRUE; otherwise, T'DENORM = FALSE. (Besides being useful to programmers, this new attribute plays a role in the definitions of some of the primitive-function attributes.) In addition, denormalized numbers are accommodated as machine numbers by clarifying the meaning of T'MACHINE_EMIN and relaxing the requirement that the leading digit of the mantissa, in the canonical form of machine numbers, always be nonzero. The clarification is that T'MACHINE_EMIN gives the smallest value of exponent (in the canonical form) for which every combination of sign, exponent, and mantissa yields a machine number, i.e., a value capable of being represented to full accuracy in the storage representation of T. This effectively means that T'MACHINE_EMIN is the exponent of the smallest normalized machine number whose negation is also a machine number (which has relevance to implementations featuring ``radix-complement'' representation) and that, in implementations for which T'DENORM is TRUE, it is also the exponent of all of the denormalized numbers.
A similar clarification for T'MACHINE_EMAX means that it is the exponent of the largest machine number whose negation is also a machine number; it is not the exponent of the most negative number on radix-complement machines. An alternative clarification of T'MACHINE_EMIN and T'MACHINE_EMAX was considered, namely, that they yield the minimum and maximum values of exponent for which some combination of sign, exponent, and mantissa yields a machine number. This would have allowed denormalized numbers to be accommodated without relaxing the requirement that the leading digit of mantissa be nonzero, and it would allow us to omit an observation, which we expect to include when we write the complete definition for the primitive function T'EXPONENT(X), as currently proposed, that this function can yield a result less than T'MACHINE_EMIN or greater than T'MACHINE_EMAX. Despite the apparent desirability of this alternative, it was judged to be too much of a departure from current practice and therefore too likely to cause compatibility problems. The new attribute T'SIGNED_ZEROS is provided to indicate whether the hardware distinguishes the sign of floating-point zeros, as described by [IEEE 85]. This attribute, along with the T'COPY_SIGN ``primitive function'' attribute, allows the numerical programmer to extend the treatment of signed zeros to the higher-level abstractions he or she creates, much in the manner of the elementary functions ARCTAN and ARCCOT (see S.L.3). It is expected that implementations that distinguish the sign of zeros will do so in a way consistent with relevant external standards (e.g., [IEEE 85]) to the extent that such standards apply to operations of Ada, and in appropriate and consistent (but implementation-defined) ways otherwise; thus, no attempt is made in Ada 9X to prescribe the sign of every possible zero result, or the behavior of every operation receiving an operand of zero. 
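The observable effect of signed zeros, and the way a copy-sign operation lets a program extend their treatment to higher-level abstractions such as ARCTAN, can be sketched with Python's math module standing in for T'SIGNED_ZEROS hardware and the T'COPY_SIGN attribute (the mapping is our illustration, not the document's):

```python
import math

# On T'SIGNED_ZEROS = TRUE hardware, the two zeros compare equal but
# their signs are observable; copy-sign (the analogue of T'COPY_SIGN)
# is the portable way to observe and propagate the sign.
neg_zero = -0.0
print(neg_zero == 0.0)                # True: the zeros compare equal
print(math.copysign(1.0, neg_zero))   # -1.0: the sign is observable

# Extending the treatment to a higher-level abstraction, in the manner
# of ARCTAN: atan2 distinguishes the sign of a zero ordinate, yielding
# +pi or -pi on the branch cut.
print(math.atan2(0.0, -1.0))          # pi
print(math.atan2(-0.0, -1.0))         # -pi
```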
The two new attributes T'DENORM and T'SIGNED_ZEROS describe properties that an implementation may exhibit independently of any other support for IEEE arithmetic. Some implementations of Ada 83 do feature denormalized numbers and signed zeros (because they come for ``free'' with the hardware), but no other features of IEEE arithmetic. S.L.1.3. Floating-Point Model Numbers The primary changes that distinguish Ada 9X model numbers from Ada 83 safe numbers are these:

1. the length of the mantissa (in the canonical form) is no longer ``quantized,'' but is as large as possible consistent with satisfaction of the accuracy requirements;
2. the radix (in the canonical form) is no longer always two, but is the same (for a type T) as T'MACHINE_RADIX;
3. the model numbers form an infinite set;
4. the maximum non-overflowing exponent is no longer bounded below by a function of the mantissa length;
5. the minimum exponent is no longer required to be the negation of the maximum non-overflowing exponent, but is given (for a type T) by an independent attribute.

The Ada 83 safe numbers have mantissa lengths that are a function of the DIGITS attribute of the underlying predefined type, giving them a quantized length chosen from the list (5, 8, 11, 15, 18, 21, 25, ...). Thus, on binary hardware having T'MACHINE_MANTISSA = 24, which is a common mantissa length of the single-precision floating-point hardware type, the last three bits of the machine representation exceed the precision of the safe numbers; as a consequence, even when the machine arithmetic is fully accurate (at the machine-number level), one cannot deduce that Ada arithmetic operations deliver full machine-number accuracy. With the first change enumerated above (freeing the mantissa length from quantization), tighter accuracy claims will be provable on many machines. As an additional consequence of this change, in Ada 9X the two types declared as follows

   type T1 is digits D;
   type T2 is digits D range T1'FIRST .. T1'LAST;

are eligible to be implemented in terms of the same predefined type when hardware characteristics do not require parameter penalties. This matches one's intuition, since the declaration of T2 requests neither more precision than that of T1 nor more range. In Ada 83, the chosen predefined types almost always differ, with T2'BASE'DIGITS > T1'BASE'DIGITS, for reasons having nothing to do with hardware considerations. (Note that this artificial example is not intended to illustrate how one should declare two different types with the same representation.) The second change enumerated above (non-binary radix) has two effects:

- it permits practical implementations on decimal hardware (which, though not currently of commercial significance for mainstream computers, is permitted by IEEE Std. 854 [IEEE 87]; is appealing for embedded computers in consumer electronics; and is used in at least one such application, an HP calculator);
- on hexadecimal hardware, it allows more machine numbers to be classed as model numbers (and therefore to be proven to possess special properties, such as being exactly representable, contributing no error in certain arithmetic operations, etc.).

As an example of the latter effect, note that T'LAST will become a model number on most hexadecimal machines. Also, on hexadecimal hardware, a 64-bit double-precision type having 14 hexadecimal (or 56 binary) digits in the hardware mantissa, as on many IBM machines, has safe numbers with a mantissa length of 51 binary bits in Ada 83, and thus no machine number of this type with more than 51 bits of significance is a safe number; in Ada 9X, such a type would have a mantissa length of 14 hexadecimal digits, with the consequence that every machine number with 53 bits of significance is now a model number, as are some with even more.
(Why does the type under discussion not have Ada 83 safe numbers with 55 bits in the mantissa, the next possible quantized length and a length that is less than that of the machine mantissa? Because some machine numbers with 54 or 55 bits of significance do not yield exact results when divided by two and cannot therefore be safe numbers. This is a consequence of their hexadecimal normalization, and it gives rise to the phenomenon known as ``wobbling precision.'') The third change enumerated above (extending the model numbers to an infinite set) is intended to fill a gap in Ada 83 wherein the results of arithmetic operations are not formally defined when they exceed the modeled overflow threshold but an exception is not raised. Some of the reasons why this can happen are as follows: - the quantization of mantissa lengths may force the modeled overflow threshold to be lower than the actual hardware threshold; - arithmetic anomalies of one operation may require the attributes of model and safe numbers to be conservative, with the result that other operations exceed the minimum guaranteed performance; - the provision and use of extended registers in some machines moves the overflow threshold of the registers used to hold arithmetic results well away from that of the storage representation; - the positive and negative actual overflow thresholds may be different, as on radix-complement machines. The extension of the model numbers to an infinite range fills a similar gap in Ada 83 wherein no result is formally defined for an operation receiving an operand exceeding the modeled overflow threshold, when an exception was not raised during its prior computation. The change means, of course, that one can no longer say that the model numbers of a type are a subset of the machine numbers of the type; one may say instead that the model numbers of a type T in the range -T'MODEL_LARGE .. T'MODEL_LARGE are a subset of the machine numbers of T. 
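Both phenomena discussed above can be checked numerically. The sketch below (ours, not the document's) first reproduces the Ada 83 quantized mantissa lengths from the formula B = ceiling(D * log(10)/log(2) + 1), and then simulates hex-normalized representability with a 14-hex-digit mantissa to show why halving a machine number with 55 bits of significance loses exactness, i.e., wobbling precision:

```python
import math
from fractions import Fraction

# Ada 83 quantized binary mantissa lengths for D = 1, 2, 3, ...
def ada83_mantissa_bits(d):
    return math.ceil(d * math.log(10) / math.log(2) + 1)

quantized = [ada83_mantissa_bits(d) for d in range(1, 8)]
print(quantized)   # the list quoted above: [5, 8, 11, 15, 18, 21, 25]

# Hex-normalized exactness test: is v representable as 0.h1...h14 * 16**e
# (14 hex digits = 56 bits, leading hex digit nonzero), as on IBM hardware?
def hex_exact(v):
    e = 0
    while v >= 16 ** e:                    # find e with v in [16**(e-1), 16**e)
        e += 1
    while v < Fraction(16) ** (e - 1):     # (only needed for v < 1)
        e -= 1
    mantissa = v * Fraction(16) ** (14 - e)
    return mantissa.denominator == 1       # exact iff the scaled mantissa is integral

x = Fraction(2 ** 54 + 1)   # 55 bits of significance
print(hex_exact(x))         # exactly representable
print(hex_exact(x / 2))     # halving it is NOT exact: wobbling precision
```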
The fourth change enumerated above (freeing the maximum exponent from dependence on the mantissa length) is equivalent to the dropping of the ``4*B Rule'' as it applies to the predefined types; a version of the rule still affects the implementation's implicit selection of an underlying representation for a user-declared floating-point type lacking a range specification, providing in that case a guaranteed range tied to the requested precision. The change in the application of the 4*B Rule allows all hardware representations to be accommodated as predefined types with attributes that accurately characterize their properties. Such types are available for implicit selection by the implementation when their properties are compatible with the precision and range requested by the user, but they remain unavailable for implicit selection in exactly those situations in which, in the absence of an explicit range specification, the 4*B Rule of Ada 83 acted to preclude their selection. Compatibility considerations related to the 4*B Rule are further discussed in S.L.1.7. The 4*B Rule was necessary in Ada 83 in order to define the model numbers of a type entirely as a function of a single parameter (the requested precision). By its nature, the rule potentially precludes the implementation of Ada in some (hypothetical) environments, as if to say that such environments are not suitable for the language or applications written in it; in other (actual) environments, it artificially penalizes the reported properties of some hardware types so strongly that they have only marginal utility as predefined types available for implicit selection and may end up being ignored by the vendor. Such matters are best left to the judgment of the marketplace and not dictated by the language. The particular minimum range required in Ada 83 (as a function of precision) is furthermore about twice that deemed minimally necessary for numeric applications [Brown 81]. 
Among current implementations of Ada, the only predefined types whose characteristics are affected by the relaxation of the 4*B Rule are DEC VAX D-format and IBM Extended Precision, both of which have a narrow exponent range in relation to their precision. In the case of VAX D-format, even though the hardware type provides the equivalent of 16 decimal digits of precision, its narrow exponent range requires that 'DIGITS for this type be severely penalized and reported as 9 in Ada 83; 'MANTISSA is similarly penalized and reported as 31, and the other model attributes follow suit. In Ada 9X, in contrast, this predefined type would have a 'DIGITS of 16, a 'MODEL_MANTISSA of 56, and other model attributes accurately reflecting the type's actual properties. A user-declared floating-point type requesting more than 9 digits of precision does not select D-format as the underlying representation in Ada 83, but instead selects H-format; in Ada 9X, it still cannot select D-format if it lacks a range specification (because of the analog of the 4*B Rule that has been built into the equivalence rule), but it can select D-format if it includes an explicit range specification with sufficiently small bounds. The compatibility issues associated with these changes are discussed in S.L.1.7. The IBM Extended Precision hardware type has an actual decimal precision of 32, but the 4*B Rule requires its 'DIGITS to be severely penalized and reported as 18, only three more than that of the double-precision type. Supporting this type allows an Ada 83 implementation to increase SYSTEM.MAX_DIGITS from 15 to 18, a marginal gain; perhaps this is the reason why it is rarely supported (it is supported by Alsys but not by the other vendors that have implementations for IBM System/370s). In Ada 9X, on the other hand, such an implementation can support Extended Precision with a 'DIGITS of 32, though SYSTEM.MAX_DIGITS must still be 18. 
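The D-format penalty can be reproduced from the formulas given later in this chapter for the attribute-determination processes. In the sketch below (ours), the exponent bound SAFE_EMAX = 127 for D-format is our assumption, but the derived figures match the 'DIGITS of 9 and 'MANTISSA of 31 cited above:

```python
import math

def quantized_B(d):
    # Ada 83: binary mantissa length corresponding to D decimal digits
    return math.ceil(d * math.log(10) / math.log(2) + 1)

# VAX D-format: 56-bit binary mantissa; SAFE_EMAX = 127 is assumed here.
MMAX, SAFE_EMAX = 56, 127

# Ada 83: the largest D whose B fits the mantissa AND satisfies the
# 4*B Rule, 4*B <= SAFE_EMAX.
d83 = max(d for d in range(1, 40)
          if quantized_B(d) <= MMAX and 4 * quantized_B(d) <= SAFE_EMAX)

# Ada 9X: the 4*B constraint no longer penalizes the predefined type.
d9x = max(d for d in range(1, 40) if quantized_B(d) <= MMAX)

print(d83, quantized_B(d83))   # the penalized Ada 83 'DIGITS and 'MANTISSA
print(d9x)                     # the Ada 9X 'DIGITS
```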
Although a floating-point type declaration lacking a range specification cannot request more than 18 digits, those including an explicit range specification with sufficiently small bounds can do so and can thereby select Extended Precision. The fifth change enumerated above (separate attribute for the minimum exponent) removes another compromise made necessary by the desire, in Ada 83, to define the model numbers of a type in terms of a single parameter. The minimum exponent of the model or safe numbers of a type in Ada 83 is required to be the negation of the maximum exponent (thereby tying it to the precision implicitly). One consequence of this is that the maximum exponent may need to be reduced simply to avoid having the smallest positive safe number lie inside the implementation's actual underflow threshold; if it is needed, such a reduction provides another way to obtain values in excess of the modeled overflow threshold without raising an exception. Another is that the smallest positive safe number may have a value unnecessarily greater than the actual underflow threshold. With the fifth change, as with some of the others, more of the machine numbers will be recognized as numbers having special properties, i.e., as model numbers. Consideration was given to eliminating the model numbers and retaining only the machine numbers. While this would simplify the semantics of floating-point arithmetic further, it would not eliminate the interval orientation of the accuracy requirements (see S.L.1.5) if variations in rounding mode from one implementation to another and the use of extended registers are both to be tolerated. It would simply substitute the machine numbers and intervals of machine numbers for the model numbers and intervals of model numbers in those requirements, but their qualitative form would remain the same.
However, rephrasing the accuracy requirements in terms of machine numbers and intervals thereof cannot be realistically considered, since many platforms on which Ada has been implemented and might be implemented in the future could not conform to such stringent requirements. If an implementation has appropriate characteristics, its model numbers up to the modeled overflow threshold will in fact coincide with its machine numbers, and an analysis of a program's behavior in terms of the model numbers will not only have the same qualitative form as it would have if the accuracy requirements were expressed in terms of machine numbers, but it will have the same quantitative implications as well. On the other hand, if an implementation lacks guard digits, employs radix-complement representation, or has genuine anomalies, its model numbers up to the modeled overflow threshold will be a subset of its machine numbers having less precision, a narrower exponent range, or both, and accuracy requirements expressed in the same qualitative form, albeit in terms of the machine numbers, would be unsatisfiable. S.L.1.4. Attributes of Floating-Point Model Numbers Although some of the attributes of model numbers in Ada 9X are closely related to those of the safe numbers in Ada 83, they all bear new names of the form T'MODEL_xxx. Certainly this is necessary for Ada 83's T'MANTISSA; the new version, T'MODEL_MANTISSA, is conceptually equivalent to T'BASE'MANTISSA in Ada 83 but is now interpreted as the number of radix-digits in the mantissa. Thus, at a minimum the value of this attribute will be roughly quartered on hexadecimal machines, even if there is no reason to take advantage of the other freedoms now permitted. A new name is certainly also necessary for T'SAFE_EMAX; the new version, T'MODEL_EMAX, is now interpreted as a power of the hardware radix, and not necessarily as a power of two. 
For hexadecimal machines, the value of this attribute will be quartered, all other things being equal. T'MODEL_EMIN is a new attribute. T'MODEL_LARGE is conceptually equivalent to Ada 83's T'SAFE_LARGE. It is defined in terms of more fundamental attributes, as was true of T'LARGE in Ada 83, with the result that the changes in the radix of the model numbers ``cancel out'' in the definition of this attribute; its value will change little, if at all, and then only to reflect the unquantization of mantissa lengths of model numbers. The same can be said about T'MODEL_SMALL, which is conceptually equivalent to Ada 83's T'SAFE_SMALL, and about T'MODEL_EPSILON, which is conceptually equivalent to Ada 83's T'BASE'EPSILON. The values of these attributes will be determined by how well the implementation can satisfy the accuracy requirements, with the primary determinant being the quality of the hardware's arithmetic. On ``clean'' machines, for which the model numbers up to the modeled overflow threshold coincide with the machine numbers, T'MODEL_MANTISSA, T'MODEL_EMAX, and T'MODEL_EMIN will yield the same values as T'MACHINE_MANTISSA, T'MACHINE_EMAX, and T'MACHINE_EMIN, respectively, though in general T'MODEL_MANTISSA and T'MODEL_EMAX may be smaller than their machine counterparts, and T'MODEL_EMIN may be larger. It is illuminating to contrast the processes by which the values of the model attributes of the predefined types are determined in Ada 83 and Ada 9X. For this purpose, we restate the process for Ada 9X first, then we present the similar Ada 83 process in an unconventional but comparable form. For a predefined type P in Ada 9X, the process is as follows:

- Determine simultaneously the minimum and maximum exponents (EMIN and EMAX) and the maximum mantissa length (MMAX) for which the accuracy requirements, expressed in terms of the resulting set of model numbers, are satisfied.
EMIN may be as small as P'MACHINE_EMIN, but hardware anomalies in the nature of premature underflow may cause it to be larger. EMAX may be as large as P'MACHINE_EMAX, but hardware anomalies in the nature of premature overflow may cause it to be smaller. MMAX may be as large as P'MACHINE_MANTISSA, but lack of guard digits, or hardware anomalies in the nature of inaccurate arithmetic, may cause it to be smaller.

- Set P'MODEL_EMIN = EMIN.
- Set P'MODEL_EMAX = EMAX.
- Let DMAX be the maximum value of D for which ceiling(D * log(10)/log(P'MACHINE_RADIX) + 1) <= MMAX.
- Set P'DIGITS = DMAX.
- Set P'MODEL_MANTISSA = MMAX.
- Set P'MODEL_EPSILON = P'MACHINE_RADIX ** (1 - P'MODEL_MANTISSA).
- Set P'MODEL_SMALL = P'MACHINE_RADIX ** (P'MODEL_EMIN - 1).
- Set P'MODEL_LARGE = P'MACHINE_RADIX ** P'MODEL_EMAX * (1.0 - P'MACHINE_RADIX ** (-P'MODEL_MANTISSA)).

In comparable terms, the same process for Ada 83 may be stated as follows:

- Determine simultaneously the minimum and maximum binary-equivalent exponents (EMIN and EMAX) and the maximum binary mantissa length (MMAX) for which the accuracy requirements, expressed in terms of the resulting set of model numbers, are satisfied. EMIN may be as small as P'MACHINE_EMIN * log(P'MACHINE_RADIX)/log(2), but hardware anomalies in the nature of premature underflow may cause it to be larger. EMAX may be as large as P'MACHINE_EMAX * log(P'MACHINE_RADIX)/log(2), but hardware anomalies in the nature of premature overflow may cause it to be smaller. MMAX may be as large as (P'MACHINE_MANTISSA - 1) * log(P'MACHINE_RADIX)/log(2) + 1, but lack of guard digits, or hardware anomalies in the nature of inaccurate arithmetic, may cause it to be smaller.
- Set P'SAFE_EMAX = min(EMAX, -EMIN).
- For each D, define a corresponding value of B as follows: B = ceiling(D * log(10)/log(2) + 1). Let DMAX be the maximum value of D for which the corresponding value of B <= MMAX and for which 4*B <= P'SAFE_EMAX.
Call the corresponding value of B BMAX.

- Set P'DIGITS = DMAX.
- Set P'MANTISSA = BMAX.
- Set P'EMAX = 4 * P'MANTISSA.
- Set P'EPSILON = 2.0 ** (1 - P'MANTISSA).
- Set P'SMALL = 2.0 ** (-P'EMAX - 1).
- Set P'LARGE = 2.0 ** P'EMAX * (1.0 - 2.0 ** (-P'MANTISSA)).
- Set P'SAFE_SMALL = 2.0 ** (-P'SAFE_EMAX - 1).
- Set P'SAFE_LARGE = 2.0 ** P'SAFE_EMAX * (1.0 - 2.0 ** (-P'MANTISSA)).

Strictly comparable Ada 83 and Ada 9X statements of the equivalence rule, by which an implementation implicitly selects a predefined type to represent a user-declared floating-point type, are given in S.L.1.6. By examining those in conjunction with the attribute-determination rules just given, one can see readily that the 4*B Rule of Ada 83 is entirely encapsulated in the attribute-determination rules, while its analog in Ada 9X is entirely encapsulated in the equivalence rule. The set of attributes having both T'MODEL_xxx and T'MACHINE_xxx versions could conceivably be enlarged to add further intuitive strength and uniformity to the naming convention, but we have resisted adding attributes which meet no identifiable need. The attributes whose Ada 83 counterparts returned results of the type universal_integer continue to do so; these are T'MODEL_MANTISSA and T'MODEL_EMAX. The new attribute T'MODEL_EMIN also yields a value of this type. The attributes whose Ada 83 counterparts returned results of the type universal_real still do; these are T'MODEL_LARGE, T'MODEL_SMALL, and T'MODEL_EPSILON. Although there is no particular reason why T'MODEL_LARGE and T'MODEL_SMALL cannot return values of the type containing T, neither is there a compelling reason to make what would be a gratuitous change. It is our plan to establish a catalog of the fundamental model parameters for all known implementations of Ada at some future time.
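For a concrete instance of the Ada 9X process, the sketch below (ours, not the document's) computes the model attributes of a ``clean'' IEEE double-precision type, for which EMIN, EMAX, and MMAX coincide with the machine attributes; the results agree with the parameters of Python's own float:

```python
import math
import sys

# A "clean" binary machine: model parameters equal machine parameters.
RADIX, MMAX, EMIN, EMAX = 2, 53, -1021, 1024

# P'DIGITS: the largest D with ceiling(D*log(10)/log(RADIX) + 1) <= MMAX.
DMAX = max(d for d in range(1, 40)
           if math.ceil(d * math.log(10) / math.log(RADIX) + 1) <= MMAX)

model_epsilon = float(RADIX) ** (1 - MMAX)
model_small   = float(RADIX) ** (EMIN - 1)
# P'MODEL_LARGE = RADIX**EMAX * (1 - RADIX**(-MMAX)); computed here with
# the exponent pulled down by one so the intermediate does not overflow.
model_large   = float(RADIX) ** (EMAX - 1) * (RADIX - float(RADIX) ** (1 - MMAX))

print(DMAX)                                     # P'DIGITS
print(model_epsilon == sys.float_info.epsilon)  # P'MODEL_EPSILON
print(model_small == sys.float_info.min)        # P'MODEL_SMALL
print(model_large == sys.float_info.max)        # P'MODEL_LARGE
```

The computed 'DIGITS of 15 and the coincidence of the model attributes with the hardware's actual extreme values illustrate the claim above that, on clean machines, the model numbers up to the overflow threshold are exactly the machine numbers.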
The renaming of the model attributes is intended to avoid one set of compatibility problems, wherein programs remain valid but change their effect as the result of changes in the values of attributes, but of course it introduces another: such programs become invalid. To avoid this, implementations are encouraged to continue to provide the obsolescent attributes, in fact with their Ada 83 values, as is discussed more fully in S.L.1.7. S.L.1.5. Accuracy of Floating-Point Operations The accuracy requirements for certain predefined operations (arithmetic operators, relational operators, and the basic operation of conversion, all of which are referred to in this section simply as ``predefined arithmetic operations'') of floating-point types other than root_real are still expressed in terms of model intervals, for reasons explained earlier. It is clarified that they do not apply to all such operations, however. For example, they do not apply to any attribute that yields a result of a specific floating-point type; such an attribute yields a machine number, which must be exact. The accuracy requirements for exponentiation are relaxed in accord with AI-00868. The weaker rules no longer require that exponentiation be implemented as repeated multiplication; special cases can be recognized and implemented more efficiently by, for example, repeated squaring, even when accuracy is sacrificed by doing so. The implementation model for exponentiation by a negative exponent continues to be exponentiation by its absolute value, followed by reciprocation. Thus, the rule continues to allow for the possibility of overflow in this case, despite the counterintuitive nature of such an overflow. The WG9 Numerics Rapporteur Group recommended this treatment, so as to allow for the most efficient implementations, recognizing, of course, that the user who is concerned with the possibility of overflow can express the desired computation differently and thereby avoid it. 
AI-00868 did not address the possibility of overflow in the intermediate results, for negative exponents. One of the goals for Ada 9X is to allow for and legitimize the typical kinds of optimizations that increase execution efficiency or numerical quality. One of these is the use, for the results of the predefined arithmetic operations of a type, of hardware representations having higher precision or greater range than those of the storage representation of the type. On some machines, this is not an option; arithmetic is performed in ``extended registers,'' there being no registers having exactly the precision or range of the storage cells used for variables of the type. Thus, we must allow the results of arithmetic operations to exceed the precision and range of the underlying type; avoiding that is intolerably expensive on some machines. A second common optimization is the retention of a variable's value in a register after its assignment, with subsequent references to the variable being fulfilled by using the register. This avoids load operations (i.e., the cost of memory references); it may, in many cases, even avoid the store into the storage location for the variable that would normally be generated for the assignment operation. One implication of legitimizing the use of extended registers is the need to define the result of an operation that could overflow but doesn't, as well as the result of a subsequent arithmetic operation that uses such a value. This is the motivation for the extension of the model numbers to an infinite range and for the rewrite of RM 4.5.7(7). 
The new rules describe behavior that is consistent with the assumption that an operation of a type T that successfully delivers or uses results beyond the modeled overflow threshold of the type T is actually performed by an operation corresponding to a type with higher precision and/or wider range than that of the type T, whose overflow threshold is not exceeded, and whose accuracy is no worse than that of the original operation of the type T. Ada 83 does not permit a value outside the range T'FIRST .. T'LAST to be propagated beyond the point where it is assigned to a variable of a type T, or is passed as an actual parameter to a formal parameter of type T, or is returned from a function of type T; a range check is required in these contexts to prevent it. Nothing prevents the carrying of excess precision beyond such a point, however. Thus, keeping a value in an extended register beyond such a point is permitted in Ada 83, whether or not it is also stored, provided that the range check is satisfied. The range check may be performed by a pair of comparisons of the source value to the bounds of the range, when those bounds are arbitrary; but in the case of an unconstrained floating-point subtype, the check may be a free by-product of the store. For example, on hardware conforming to IEEE arithmetic [IEEE 85], storing an extended register into a shorter storage format will signal an overflow if the source value exceeds the range of the destination format. If the propagation has no need for an actual store, because the value is to be propagated in the register, then a store into a throw-away temporary, just to see if overflow occurs, may be the cheapest way to perform the range check.
If the check succeeds, all subsequent uses of the value in the extended register are valid and safe, including any potential need to store it into storage, such as when the value is about to be passed as an actual parameter and the implementation prefers to pass parameters in storage, or even merely because of the need for register spilling at an arbitrary place not connected with a use of the entity currently in the register. The loss of precision that occurs at that point does not matter, because it is consistent with the perturbations allowed when the value, had it not been shortened, is subsequently used as the operand of an operation. For an assignment to a variable of an unconstrained numeric subtype, actual code to perform the range check is not always needed; it can be omitted if the implementation can deduce that the check must succeed. The generation of code to perform a range check is necessary only when extended registers are being used, and then only when the source expression is other than just a primary, that is, contains at the top level a predefined arithmetic operation that can give rise to a value outside the range. As an example, consider

   X := Y * Z;

in which X, Y, and Z are assumed to be of some unconstrained floating-point subtype T. If there are no parameter penalties, T'MODEL_LARGE = T'LAST = -T'FIRST. If extended registers are not being used, then the multiplication cannot generate a value outside the range T'FIRST .. T'LAST (since the attempt to do so would overflow) and the range check can therefore be omitted. On the other hand, if extended registers are being used, a value exceeding T'MODEL_LARGE can be produced in the register, because the multiplication may no longer overflow, and a range check will be needed to preclude the propagation of a value outside T'FIRST .. T'LAST.
When the source expression is simply the value of a variable, a formal parameter, or the result of a function call, as in

   X := Y;

no actual range check is necessary, since the value (of Y, in the example) can be presumed to have passed an earlier range check in the first propagation away from the point where it was generated. When the source expression does contain a predefined arithmetic operation at the top level, the formal definition immediately precedes the range check on assignment with an overflow check (on the multiplication, in the first example above) that is at least as stringent (it is more stringent if there are parameter penalties causing T'MODEL_LARGE to be less than T'LAST). Because it is at least as stringent as the range check, the overflow check ought to subsume the range check, but in practice it does not, since the actual overflow threshold, when extended registers are used, is even higher. It is unfortunate that the availability and use of extended registers sometimes require extra code to be generated for assignments in Ada 83. We discuss next a possible way to improve on this situation in Ada 9X. It is not what we have actually done, but it motivates our actual approach, which is described afterwards. We could proceed by making the range check optional at any propagation point (assignment statement, parameter passing, or return statement) when the target subtype is an unconstrained numeric subtype. This would allow an out-of-range value to be propagated when no actual store is needed, and it would also permit an exception to be raised for an out-of-range value when the implementation does require that the propagation be performed by an actual store (for example, some implementations might never pass by-copy parameters or function results in registers).
In general, it would permit an out-of-range value to survive in an extended register through an arbitrary number of propagations (in particular, those that don't require stores), only to give rise to an exception when a propagation point requiring a store is reached. We would also have to clarify that passing an actual parameter to an attribute that is a function, and returning a value from such an attribute, are considered propagations, since the attribute may be implemented in the same way as a function and may require its arguments or result to be passed in storage. Thus, a range check would optionally be performed at those places when the parameter or result is of an unconstrained numeric subtype and the source value can be out of range. Finally, we would also have to clarify that presenting an operand to a predefined arithmetic operation, and returning a result from a predefined arithmetic operation, are also considered propagations, since some implementations may implement some such operations in the same way as function calls, requiring operands and results to be passed in storage. Thus, a range check would optionally be performed at those places, too, when the subtype of an operand or that of the operation's result is an unconstrained numeric subtype. Actually, this goes a bit too far: there is no need for the range check on the result of a predefined operation, since the more stringent overflow check already there subsumes it and accounts for any necessary raising of CONSTRAINT_ERROR at that point. That approach comes close to accomplishing what we need. The only problem with it is that it leaves untouched many contexts in which an out-of-range value in an extended register could be used without an opportunity for raising CONSTRAINT_ERROR, as might be required by the particular context.
For example, simple variables used as the bounds of ranges, as discriminants, as subscripts, as specifiers of various sorts in declarations, as generic actual parameters, as case expressions, in delay statements, as choices in numerous constructs, and undoubtedly in other contexts would not be subject to a range check, because these are not propagation contexts. Implementations would have to be prepared to deal in these contexts with values having the range implied by the extended registers that are available, rather than the range implied by the unconstrained subtype of the type associated with the context at hand. What we have actually done, instead of including the optional range check in the semantics of propagation, is to include it in the semantics of read-references for certain categories of primary, specifically name and function_call. (This does not appear in the Mapping Specification for the Numerics Area, except in the form of a Note, since it is assumed to be a feature of the core.) The range check in the three main propagation contexts of Ada 83 (assignment statement, by-copy parameter passing, and return statement) is entirely eliminated when the target subtype is an unconstrained numeric subtype. We shall now show that even in its absence there is an opportunity to raise CONSTRAINT_ERROR in those propagation contexts, when the target subtype is an unconstrained numeric subtype and the source value exceeds the subtype's range, and in all other contexts in which such a value might be used. Indeed, the propagation contexts are just a subset of the general contexts, so they need not be considered separately. Every value used in a read (fetch) context, including those in propagation contexts, is denoted by the category expression or one of its descendants.
Those expressions, or constituents of expressions, involving predefined arithmetic operations (including the implicit conversions inherent in references to numeric literals) already provide an opportunity to raise CONSTRAINT_ERROR when they yield a value of an unconstrained numeric subtype that is outside the range of the subtype, because the operations perform an overflow check on the result that, being more stringent than the desired range check, subsumes it. The opportunity to raise a CONSTRAINT_ERROR for a parenthesized expression or a qualified expression, as an expression or a component thereof, is provided by the evaluation of the expression that is its immediate component. This leaves only names and function calls. Therefore, all the necessary opportunities for raising the desired CONSTRAINT_ERROR are covered by adding an optional range check to the semantics of read-references for names and function calls (i.e., after the return), when the subtype denoted by the name or function call is an unconstrained numeric subtype. Note that only names denoting non-static objects are affected, since the evaluation of static expressions is both exact and not limited as to range. We stress that all of the new range checks we are introducing are optional; that is, either the check is optionally performed and raises CONSTRAINT_ERROR if it fails, or it is always performed and optionally raises CONSTRAINT_ERROR if it fails. Thus, the checks will not require any code to be generated unless an actual shortening (storing of an extended register) does need to occur at one of these places. Furthermore, as we indicated earlier, even then the range check may come for free (as on IEEE hardware). A simple example will illustrate what can be gained. Consider this typical inner product:

    SUM := 0.0;
    for I in A'RANGE loop
        SUM := SUM + A(I) * B(I);
    end loop;
    F(SUM);

Assume that SUM is a variable of an unconstrained floating-point subtype.
We would like to keep SUM in an extended register during the loop, and in fact not even store the register into SUM during the loop. In Ada 83, we are formally obligated to perform a range check upon the assignment inside the loop, to prevent the propagation of a value outside SUM's range; in IEEE systems, the cheapest way to do this would be by storing into SUM after all, or into a throw-away temporary. Thus, a store (or some other means of checking) is executed each time through the loop. In Ada 9X, on the other hand, no range check is performed on that assignment, allowing an out-of-range value to be propagated, and justifying the complete omission of stores of the extended register containing SUM within the loop. The CONSTRAINT_ERROR that Ada 83 would have raised on some assignment in the loop might instead occur during the passing of SUM to F. It is allowed by the new optional range check in the semantics of the variable reference inherent in the parameter association, and whether or not it occurs there depends on whether parameters are passed in storage or in registers and, in the former case, whether the value of SUM is out of range at that point. With this change, it is true that exceptions can occur in places where they did not occur in Ada 83. However, whenever this happens, one can point to a different place in the program where the exception would have occurred, earlier, in its interpretation according to Ada 83 semantics. It may also be that the exception never occurs in the Ada 9X interpretation; in the example above, it may be that SUM remains forever in an extended register and is never stored, or it may be that its value has been brought back within range by the time it is stored. We should probably have noted much earlier that this treatment of unconstrained numeric subtypes applies to all of them, not just floating-point subtypes.
It allows integer and fixed-point values to be held in, for example, 32-bit general registers, in which integer arithmetic is performed, even when the storage format of the types involved has only 16 or 8 bits. Also, although omitting certain range checks appears to conflict with the safety goals of range checking, it must be remembered that the bounds of unconstrained numeric subtypes are implementation dependent anyway, so that whether a particular source value can be assigned to a target of an unconstrained numeric subtype already depends (i.e., in Ada 83) on properties of the implementation. Furthermore, all user-declared integer and fixed-point types necessarily involve range specifications and will therefore be subject to range checking; only floating-point types declared without a range specification will escape it. Of course, all predefined numeric types are unconstrained and will escape range checking (which is consistent with their implementation-dependent ranges). Other than by using floating-point types declared without a range specification, or by using predefined types, or by going out of one's way to use T'BASE as a type mark, one will not escape range checking. As was explained above, even Ada 83 allowed and explained the loss of precision that can occur in shortening, when the propagation of a value held in an extended register requires it. Actually, there is one exception to this: if shortening is allowed to take place on a value being passed to an instantiation of UNCHECKED_CONVERSION (and it certainly seems that shortening is expected in that context), then nothing in the definition of UNCHECKED_CONVERSION, or anywhere else, currently allows or explains the shortening, in regard to the contrast between the possibly extra-precise value going in and the presumably shortened value coming out.
In Ada 9X, the primitive functions (see S.L.4) introduce several additional contexts in which shortening can occur and yet the accompanying loss of precision is potentially unexplained. We need to introduce some rules that explain the possibility of loss of accuracy in those contexts where it is not currently explained. (The primitive functions, being attributes, are not operations to which 4.5.7 applies.) It seems likely that these rules will be specific to the contexts involved, though we have not yet resolved how best to accomplish that. The accuracy requirements for floating-point arithmetic operations are, for the time being, expressed separately from those for fixed-point operations, since the latter do not need the full generality of the interval-based model appropriate for floating-point operations (see S.L.2). Nevertheless, the rules may ultimately be recombined into a uniform set of rules for all real types purely for presentation purposes; if so, it would be made clear that some of the freedoms permitted in the floating-point case do not apply in the fixed-point case.

S.L.1.6. Floating-Point Type Declarations

The restatement of the ``equivalence rule'' for user-declared floating-point type declarations, which explains how an implementation selects a predefined floating-point type on which to base the representation of the declared type, is a natural extension of its form in Ada 83 that accommodates the changes described earlier. This rule is the basis for the intuitive (and informal) observation that floating-point types provide for approximate computations with real numbers so as to guarantee that the relative error of an operation that yields a result of type T is bounded by 10.0 ** (-T'DIGITS). [Note: A more formally complete version of this observation can be obtained from a theorem of [Brown 81].]
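The informal bound just stated is easy to check empirically. The following sketch (in Python rather than Ada, and purely illustrative) checks it for IEEE double precision, which supports 15 decimal digits:

```python
from decimal import Decimal

# IEEE double precision supports T'DIGITS = 15, so the informal guarantee
# bounds the relative error of a single operation by 10.0 ** (-15).
DIGITS = 15

# 0.1 + 0.2 rounds to 0.30000000000000004...; the mathematically exact
# result is 0.3.  Converting the double result to Decimal captures it
# exactly, so the relative error can be measured without further rounding.
exact = Decimal("0.3")
computed = Decimal(0.1 + 0.2)
rel_err = float(abs(computed - exact) / exact)

assert 0.0 < rel_err <= 10.0 ** (-DIGITS)   # ~1.5e-16, within the bound
```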
The Ada 9X analog of Ada 83's 4*B Rule exerts its effect during the application of the ``equivalence rule,'' by which an implementation implicitly selects a predefined type to represent a user-declared floating-point type. Consider the declaration

    type T is digits D [range L .. R];

Restated (see L.1.6), the Ada 9X equivalence rule says that this is equivalent to

    type floating_point_type is new P;
    subtype T is floating_point_type
        [range floating_point_type(L) .. floating_point_type(R)];

where floating_point_type is an anonymous type, and where P is a predefined floating-point type implicitly selected by the implementation so that it satisfies the following requirements:

- P'DIGITS >= D.

- If a range L .. R is specified, then P'MODEL_LARGE >= max(abs(L), abs(R)); otherwise, P'MODEL_LARGE >= 10.0 ** (4*D).

The effect of the analog of Ada 83's 4*B Rule, known in Ada 9X as the 4*D Rule, is to ensure that a user-declared type without a range specification provides adequate range; in fact, in all existing implementations of Ada, it precludes a predefined type from being selected if and only if the type would be precluded from selection by Ada 83's 4*B Rule. To see this, note that Ada 83's equivalence rule can be stated (unconventionally) in the same form, except that the conditions that the predefined type P must satisfy are as follows:

- P'DIGITS >= D.

- If a range L .. R is specified, then P'SAFE_LARGE >= max(abs(L), abs(R)).

When the 4*D Rule precludes the selection of a type P in Ada 9X, it is necessarily the case that the value of P'DIGITS is penalized in Ada 83 by the 4*B Rule, and thus it is the first of the two conditions above that precludes the selection of P in Ada 83. The value of P'DIGITS is not penalized in Ada 9X. When a type declaration includes a range specification whose bounds are sufficiently small, the Ada 9X equivalence rule potentially permits the selection of a predefined type precluded (e.g., by the first condition) in Ada 83.
However, among current implementations this occurs in only one instance (see S.L.1.7). The equivalence rule has been formulated in this way in Ada 9X to emphasize the role of the range specification in expressing the minimum range needed for computations with the type. An alternative was considered, in which the conditions for the selection of a predefined type P can be stated as follows:

- P'DIGITS >= D.

- P'MODEL_LARGE >= 10.0 ** (4*D).

- If a range L .. R is specified, then in addition P'MODEL_LARGE >= max(abs(L), abs(R)).

This alternative would result in complete equivalence between the Ada 9X selections and those of Ada 83 for all user-declared floating-point types, while still permitting hardware types that satisfy Ada 83's 4*B Rule only with a precision penalty to be supported without penalty, but such types could not always be selected when they provide adequate precision and range, relative to the precision and range requested. The alternative was rejected because it too severely restricts the utility of hardware types that can be supported as unpenalized predefined types in Ada 9X. The change in the interpretation of the named number SYSTEM.MAX_DIGITS is necessitated by the shift from the 4*B Rule to the new 4*D Rule. This attribute is typically used to declare an unconstrained floating-point subtype with maximum precision. The change in its interpretation ensures that such a use will have the same effect in Ada 9X, i.e., will result in the selection of the same underlying representation as in Ada 83. One way in which our changes do not go quite as far as possible in reflecting the actual properties of the machine has to do with the use of T'MODEL_LARGE in describing when overflow can occur and when it cannot.
On radix-complement machines, the negative overflow threshold does not coincide in magnitude with the positive overflow threshold, but this is not reflected in T'MODEL_LARGE, which is conservative (that is, it characterizes the less extreme threshold). While this is not of any particular consequence as far as the rewrite of 4.5.7 goes (after all, there are many reasons why a value exceeding T'MODEL_LARGE in magnitude might not overflow), it does interact in an undesirable way with the equivalence rule for floating-point type declarations. If a floating-point type declaration specifies a lower bound exactly coinciding with the most negative floating-point number of some predefined type P, as can happen when P'FIRST is used as the lower bound of the requested range, then P will be ineligible as the representation of the type being declared (on radix-complement machines, and even when no other arithmetic anomalies are present). This suggests that T'MODEL_LARGE ought to be abandoned in favor of two attributes, say T'MODEL_FIRST and T'MODEL_LAST, that characterize the negative and positive ``safe'' (i.e., overflow-free) limits separately. The way that T'MODEL_EMAX is defined would have to change; presumably it could be the maximum of the exponents of T'MODEL_FIRST and T'MODEL_LAST in the canonical form. T'MODEL_FIRST and T'MODEL_LAST could then be defined for all numeric types (for integer and fixed-point types they would be equal to T'BASE'FIRST and T'BASE'LAST, respectively), and they could be used in a type-independent statement of when overflow can and cannot occur (allowing its removal from 4.5.7). This is an attractive idea and may be explored in the future.

S.L.1.7. Compatibility Considerations

In this section we analyze the impact of the potential sources of incompatibility resulting from the changes in the model of floating-point arithmetic.
We argue that actual incompatibilities will arise rarely in practice, and that strategies for minimizing their effect are available. Actual incompatibilities have been reduced since the previous version of the Mapping Specification by the inclusion of the 4*D Rule (see S.L.1.6). The explicit use of model attributes of a floating-point type is rather rare and usually restricted to expertly crafted numeric applications. Thus, the elimination of some of the model attributes, in favor of new attributes with somewhat different definitions and new names, will probably not be noticed by the vast majority of existing Ada programs. We have already recommended (see NM:FLOATMODATTR and S.L.1.4) that vendors continue to support the obsolescent attributes as implementation-defined attributes, with their Ada 83 definitions, for the purpose of providing a smooth transition for those that are affected. Detected use of such attributes should evoke a warning message from the compiler, recommending that the references to obsolescent attributes be replaced by appropriate references to the new attributes, or by other appropriate expressions, when convenient. In most cases, the substitution is expected to be straightforward, but some analysis will be required to ascertain this. At least, by continuing to provide the obsolescent attributes as implementation-defined attributes, a vendor can provide continuity for programs affected by this change. A user-declared floating-point type declaration specifying an explicit range whose bounds are small in relation to the requested precision may select an underlying representation that, while providing the requested range, nevertheless provides less range than in Ada 83. (This is because the predefined type selected as the representation was required to satisfy the 4*B Rule in Ada 83 but is not required to do so in Ada 9X.) As a consequence, overflow in the computation of an intermediate result may occur where it did not previously.
However, among current implementations this occurs only in DEC VAX implementations, when the requested precision exceeds 9 and the requested range has relatively small bounds, and when use of D-format, rather than G-format, for the LONG_FLOAT predefined type is explicitly enabled by the appropriate pragma. That is, in Ada 9X, D-format can be selected, whereas in Ada 83, D-format is precluded and H-format is selected. DEC VAX D-format is used only rarely and is being de-emphasized in newer systems. DEC VAX compilers that are affected by this change can issue a warning message when D-format is selected in a situation in which H-format would have been selected in Ada 83. The message can indicate that removing the range specification from the type declaration, and placing it instead on a subtype declaration, will (necessarily) result in the same selection for the underlying representation as in Ada 83. Alternatively, the compiler can avoid selecting D-format, even though it is allowed to. The language continues to express no preference for the selection of an underlying representation when multiple representations are eligible. Similar problems do not arise with IBM Extended Precision in the Alsys implementation for IBM 370. There is no larger type that is currently selected when the requested precision exceeds 18 decimal digits and the requested range has appropriately small bounds; thus, the selection of Extended Precision in such a case in Ada 9X represents a valid interpretation for what was previously an invalid program. In all other cases of which we are aware, the supported hardware types are such that a type providing the requested precision will always provide a range that satisfies Ada 83's 4*B Rule, resulting in no further incompatibilities. 
A user-declared floating-point type declaration not specifying an explicit range poses no compatibility problems, because the predefined type chosen to represent the declared type must satisfy the new 4*D Rule (see S.L.1.6), which in this case provides for compatibility with Ada 83. Among current implementations, the 4*D Rule will exert an effect only in DEC VAX and Alsys IBM 370 implementations; in the former, it will preclude the selection of D-format when the Ada 83 4*B Rule would have precluded it, and force the selection of H-format instead, whereas in the latter, it will preclude the selection of Extended Precision when it would not have been selected in Ada 83, and continue to make the program invalid. The 4*D Rule, newly added to the Mapping Specification (see L.1.6), should not, however, be viewed strictly as a concession to compatibility. Rather, it should properly be viewed as providing for uniformity among future implementations, by implying a minimum range in a context where no minimum range is specified. It coincidentally provides some additional compatibility when, without it, VAX D-format would be selectable. When VAX D-format or IBM Extended Precision is selected in contexts in which it would also have been selected in Ada 83 to represent a type T, the value of T'BASE'DIGITS, which was 9 for the former and 18 for the latter in Ada 83, will now be 16 or 32, respectively. If this attribute is used correctly, i.e., to tailor a computation to the actual precision provided by the underlying type, then the computation should adapt itself naturally to the new value. Nevertheless, in the few circumstances in which use of this attribute is detected by an affected compiler, a warning message can be issued. The very few remaining situations in which a different underlying representation is selected in Ada 9X (for example, the one illustrated in S.L.1.3) are considered true anomalies genuinely worth correcting.
In any case, they have rather artificial characteristics and are thus extremely unlikely to occur in practice.

S.L.2. Semantics of Fixed-Point Arithmetic

Various problems have been identified with fixed-point types in Ada 83:

- They can be counterintuitive. The values of a fixed-point type are not always integer multiples of the declared delta (they are instead integer multiples of the small, which may be specified or defaulted, and which in either case need not be the same as, or even a submultiple of, the delta), and they do not always exhaust the declared range, even when the bounds of the declared range are integer multiples of the small (we are thinking about the case where a bound of the range is a power of two times the small). These surprises are responsible for some of the confusion with fixed-point types (although some programmers do understand and correctly exploit the fact that the high bound need not be representable).

- The model used to define the accuracy requirements for operations of fixed-point types is much more complicated than it needs to be, and many of its freedoms have never, in fact, been exploited. The accuracy achieved by operations of fixed-point types in a given implementation is ultimately determined, in Ada 83, by the safe numbers of the type, just as for floating-point types, and indeed the safe numbers can, and in some implementations do, have more precision than the model numbers. However, the model in Ada 83 allows the values of a real type (either fixed or float) to have arbitrarily greater precision than the safe numbers, i.e., to lie between safe numbers on the real number axis; implementations of fixed point typically do not exploit this freedom. Thus, the opportunity to perturb an operand value within its operand interval, although allowed, does not arise in the case of fixed point, since the operands are safe numbers to begin with.
In a similar way, the opportunity to select any result within the result interval is not exploited by current implementations, which we believe always produce a safe number; furthermore, in many cases (i.e., for some operations) the result interval contains just a single safe number anyway, given that the operands are safe numbers, and it ought to be more readily apparent that the result is exact in these cases.

- Support for fixed-point types is spotty, due to the difficulty of dealing accurately with multiplications and divisions having ``incompatible smalls'' as well as fixed-point multiplications, divisions, and conversions yielding a result of an integer or floating-point type. Algorithms have been published in [Hilfinger 90], but these are somewhat complicated and do not quite cover all cases, leading to implementations that do not support representation clauses for SMALL and that, therefore, only support binary smalls. These problems are partly the result of trying to make fixed-point types serve several needs and several application areas, none of which are served perfectly and all of which are compromised somewhat, as discussed below.

- One of the intended applications for fixed-point types is sensor-based applications, where the representations of scaled physical quantities are transmitted over ports as binary integers. Digital signal processing is a related application area with a similar focus on manipulating scaled binary integers. These needs are met fairly well, because either no representation clauses for SMALL are used (the delta already being a power of two) or representation clauses for universally accepted values of SMALL are used.

- Fixed-point types are intended, or at least they have been considered, for applications in the Information Systems area, i.e., to represent financial quantities that are typically integer multiples of decimal fractions of some monetary denomination.
This need is not met well, since extra precision is generally intolerable in such applications, rounding needs to be controlled, and there is no guarantee that decimal scaling factors are supported (because they require the use of representation clauses). Many fixed-point implementations limit ranges to the equivalent of about ten decimal digits, which is inadequate for some IS applications. In addition, IS applications often need multiple representations of decimal data, e.g., for computation versus display. The fixed-point model in Ada 83 is heavily biased towards an internal representation of fixed-point types as a binary integer, and this bias strongly affects the range of the unconstrained (base) subtype of the type. Specifying a range as, for example, -9_999_999.99 .. 9_999_999.99 is considered cumbersome; in any case, it does not guarantee protection against exceeding the range in computations of the type.

- Finally, fixed-point types are often embraced as a kind of cheap floating point, suitable on hardware lacking true floating point when the application manipulates values from a severely restricted range. This need may be met well, in the sense that efficient performance may be expected when the small is allowed to default and the user holds no expectations that multiples of the delta are exactly represented, but it has influenced the design of the facility too heavily and it compromises the quality of what can be offered in the other application areas. It is not clear that this application of fixed point is much used or needed.

Our solution to these problems is to remove some of the freedoms of the interval-based accuracy requirements that have never been exploited and to relax the accuracy requirements so as to encourage wider support for fixed point.
Applications that use binary scaling and/or carefully matched (``compatible'') scale factors in multiplications and divisions, which is typical of sensor-based and other embedded applications, will see no loss of accuracy or efficiency. It is not our intention to meet the special needs of IS applications; they are addressed by the new decimal fixed-point types defined in the Information Systems area of the Special Needs Annex (see Section IS:ALL), although undemanding applications in this area may coincidentally be served marginally better by ordinary fixed-point types than they were in Ada 83. While the revamped fixed-point facility removes and relaxes requirements that have generally not been exploited, it does not go as far as we had hoped in substituting intuitive behavior for the surprises of the past. Version 4.1 of the Numerics Annex proposed to eliminate the concept of small as distinct from delta, making the values of a user-declared fixed-point type integer multiples of the declared delta. Although this proposal had significant support, it was judged by others to represent too radical a change and to produce too many incompatibilities. It would have caused programs using fixed-point types with a delta that is not a power of two and a default small to substitute different sets of values for those types and to perform scaling by multiplication and division instead of shifting. By retaining the concept of small as distinct from delta, as well as a default rule for small that is analogous to the Ada 83 rule, we have in the present version of the Numerics Annex avoided the need for any fixed-point type to change its behavior. The default small in Ada 9X is an implementation-defined power of two less than or equal to the delta, whereas in Ada 83 it was defined to be the largest power of two less than or equal to the delta. 
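The two default rules can be contrasted in a short sketch (Python rather than Ada, purely illustrative):

```python
import math

def ada83_default_small(delta):
    """Ada 83 rule: the default small is the largest power of two <= delta."""
    return 2.0 ** math.floor(math.log2(delta))

assert ada83_default_small(0.1) == 0.0625   # 2**(-4) <= 0.1 < 2**(-3)
assert ada83_default_small(1.0) == 1.0

# Ada 9X requires only *some* implementation-defined power of two <= delta;
# an implementation that uses extra representation bits for precision might
# instead choose, say, 2.0 ** (-6) for a delta of 0.1.
assert 2.0 ** (-6) <= 0.1
```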
The purpose of this change is merely to allow implementations that previously used extra bits in the representation of a fixed-point type for increased precision rather than for increased range, giving the safe numbers more precision than the model numbers, to continue to do so. An implementation that does so must, however, accept the minor incompatibility represented by the fact that the type's default small will differ from its value in Ada 83. Implementations that used extra bits for extra range have no reason to change their default choice of small, even though Ada 9X allows them to do so. Note that our simplification of the accuracy requirements, i.e., expressing them directly in terms of certain sets of integer multiples of the result type's small rather than in terms of model or safe intervals, removes the need for some of the attributes of model and safe numbers of fixed-point types. To the extent that these attributes are used in Ada 83 programs, the elimination of these attributes poses a potential incompatibility problem. As we did for floating-point types, we recommend that implementations continue to provide these attributes as implementation-defined attributes, with their Ada 83 values, and that implementations produce warning messages upon detecting their use. We had hoped to go so far as to remove, in support of Requirement R2.2-B(1), the potential surprise when a range bound that is a power of two times the small is not included within the range of a fixed-point type, by including within the range of the type all the integer multiples of the small lying within the declared bounds, arguing that declarations that change their meaning as a result can be rewritten to achieve the desired effect. But we could not argue that few programs would be affected by this change; in other words, even though this property of Ada 83 has the potential for surprise with some programs, it is used correctly by far more.
The feature remains, but it is reflected by a different mechanism now that the concepts of model numbers and their mantissas have been dropped from the fixed-point description. Some of the accuracy requirements, i.e., those for the adding operators and comparisons, now simply say that the result is exact. This was always the case in Ada 83, assuming operands are always safe numbers there, and yet it is not clear from the model-interval form of the accuracy requirements that comparison of fixed-point quantities is, in practice, deterministic and need not be otherwise. Other accuracy requirements are now expressed in terms of small sets of allowable results, called ``perfect result sets'' or ``close result sets'' depending on the amount of accuracy that it is practical to require; these sets always contain consecutive integer multiples of the result type's small (or of a ``virtual'' small of 1.0 in the case of multiplication or division with an integer result type). In some cases, the sets are seen to contain a single such multiple or a pair of consecutive multiples; this clearly translates into a requirement that the result be exact, if possible, but never off by more than one rounding error or truncation error. The cases in which this occurs are the fixed-point multiplications and divisions in which the operand and result smalls are ``compatible,'' meaning that the product or quotient of the operand smalls (depending on whether the operation is a multiplication or a division) is either an integer multiple of the result small, or vice versa. (These cases cover much of the careful matching of types typically exhibited by sensor-based and other embedded applications, which are intended to produce exact results for multiplications and at-most-one-rounding-error results for divisions, with no extra code for scaling; they can produce the same results in Ada 9X, and with the same efficient implementation.
Our definition of ``compatible'' is more general than required just to cover those cases of careful matching of operand and result types, permitting some multiplications that require scaling of the result by at worst a single integer division, with an error no worse than one rounding error.) For other cases (when the smalls are ``incompatible''), the accuracy requirements are relaxed, in support of Requirement R2.2-A(1); in fact, they are left implementation defined. Implementations need not go so far as to use the Hilfinger algorithms [Hilfinger 90], though they may of course do so. An Ada 9X implementation could, for instance, perform all necessary scaling on the result of a multiplication or division by a single integer multiplication or division (or shifting). That is, the efficiency for the cases of incompatible smalls need not be less than that for the cases of compatible smalls. This relaxation of the requirements is intended to encourage support for a wider range of smalls. Indeed, we considered making support for all smalls mandatory on the grounds that the relaxed requirements removed all barriers to practical support for arbitrary smalls, but we rejected it because it would make many existing implementations instantly nonconforming. A recent change in the definition of ``perfect result set'' requires positive results to be truncated (rounded toward zero), unless 'MACHINE_ROUNDS is TRUE for the target type. Previously, the rounding could be in either direction in this case. It has been argued that this is the behavior that most systems exhibit anyway, and it achieves the desirable goal of rendering most calculations with the DURATION predefined fixed-point type deterministic. DEC rounds in this case, and it can continue to do so if 'MACHINE_ROUNDS is TRUE for the target type.
There continues to be no | preference for the direction of rounding in the case of negative results, if | 'MACHINE_ROUNDS is FALSE for the target type, since the direction will be | dependent on whether the implementation uses a radix-complement | representation or some other representation. | Ada 9X allows an operand of fixed-point multiplication or division to be a real literal, named number, or attribute. Since the value V of that operand can always be factored as an integer multiple of a compatible small, the operation must be performed with no more than one rounding error and will cost no more than one integer multiplication or division for scaling. Note: That V can always be factored in this way follows from the fact that it, and the smalls of the other operand and the result, are necessarily all rational quantities. The accuracy requirements for fixed-point multiplication, division, and conversion to a floating-point target are left implementation defined | (except when the operands' smalls are powers of the target's machine radix) | because the implementation techniques described in [Hilfinger 90] rely on the availability of several extra bits in typical floating-point representations beyond those belonging to the Ada 83 safe numbers; with the revision of the floating-point model, in particular the elimination of the quantization of the mantissa lengths of model numbers, those bits are now likely gone. Except when the operands' smalls are powers of the target's | machine radix, requiring model-number accuracy for these operations would | demand implementation techniques that are more exacting, expensive, and complicated than those in [Hilfinger 90], or it would result in penalizing the mantissa length of the model numbers of a floating-point type just to recover those bits for this one relatively unimportant operation. 
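The Note's factoring claim can be made concrete: since V and the smalls are rational, a suitable small for the literal operand can always be constructed. A sketch of one such construction (the function and its name are ours):

```python
from fractions import Fraction

def literal_small(v, s1, s3):
    """Factor a nonzero rational literal operand v as v = k * sv, with sv
    chosen so that the smalls are compatible: the result small s3 comes
    out an exact integer multiple of s1 * sv (s1 being the other operand's
    small).  Names are ours."""
    ratio = v * s1 / s3                    # rational; reduces to p/q
    p, q = ratio.numerator, ratio.denominator
    sv = v / p                             # then v = p * sv exactly ...
    assert v == p * sv
    assert s3 == q * (s1 * sv)             # ... and s1 * sv divides s3 exactly
    return p, sv

# The literal 0.3 combined with an operand of small 2**-4 and result small 2**-5:
k, sv = literal_small(Fraction(3, 10), Fraction(1, 16), Fraction(1, 32))
assert (k, sv) == (3, Fraction(1, 10))
```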
An | implementation may use the simple techniques in [Hilfinger 90] for fixed-point multiplication, division, and conversion to a floating-point target; the accuracy achieved will be exactly as in Ada 83, but will simply not be categorizable as model-number accuracy, unless the operands' smalls | are powers of the target's hardware radix. Furthermore, in the latter case, | even simpler algorithms are available. | We have abandoned an idea we first put forth in Version 4.0 of the Mapping Specification, namely, that of allowing the radix of the representation of an ordinary (i.e., non-decimal) fixed-point type to be specified as ten, by a representation clause (attribute definition clause in Ada 9X); the current bias towards a radix of two would persist as the default. We abandoned this idea because it benefits primarily IS applications, which are now addressed by separate features. This feature would integrate well with the rest of our proposal, were it to be restored. Its primary semantic effect would be to exclude range bounds that are powers of ten from necessarily being included within the bounds of the type. It would permit bounds like 999_999_999.99 in the declaration of a type whose small is .01 to be written instead as 1_000_000_000.00 or even as 0.01E+12, which is close to the ``digits 11'' shorthand provided by IS:DECIMAL_FIXED_POINT. Since it would be of no particular benefit outside of IS applications, and since it is only a minuscule part of the totality of additional support required in the IS area, it is not worth adding to ordinary fixed point. A suggestion for a new representation attribute, T'MACHINE_SATURATES, has been made. Some digital signal processors do not signal a fault or wrap around upon overflow but instead saturate at the most positive or negative value of a type. It could be useful to detect and describe that behavior by means of the suggested attribute. 
We leave this as a subject for future exploration, perhaps in conjunction with similar refinements of T'MACHINE_OVERFLOWS suggested (for floating-point types) by IEEE arithmetic.

S.L.3. Elementary Functions

For a general rationale for the elementary functions, the reader is referred to [Dritz 91a]. These functions are critical to a wide variety of scientific and engineering applications written in Ada. They have been widely provided in the past as vendor extensions with no standardized interface and with no guarantee of accuracy. These impediments to portability and to analysis of programs are removed by their inclusion in the Numerics Area features of Ada 9X, in support of Requirement R11.1-A(1). The elementary functions are provided in Ada 9X by a new predefined generic package, GENERIC_ELEMENTARY_FUNCTIONS, which is a very slight variation of that proposed in ISO DIS 11430, ``Proposed Standard for a Generic Package of Elementary Functions for Ada.'' The Ada 9X version capitalizes on a feature of Ada 9X (use of T'BASE as a type mark in declarations) not available in the environment (Ada 83) to which the DIS is targeted. The feature has been used here to declare the formal parameter types and result types of the elementary functions to be the unconstrained (base) subtype of the generic | formal type, eliminating the possibility of range violations at the interface. The same feature can be used for local variables in the body of GENERIC_ELEMENTARY_FUNCTIONS (if it is programmed in Ada) to avoid spurious exceptions caused by range violations on assignments to local variables of the generic formal type. Thus, there is no longer a need to allow implementations to impose the restriction that the generic actual subtype in | an instantiation must be an unconstrained subtype; implementations must | allow a range-constrained subtype as the generic actual subtype, and they | must be immune to the potential effects of the range constraint. 
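The benefit of declaring the interfaces with T'BASE can be sketched in executable form (Python stands in for Ada here, and all names are ours): an interface that imposes the actual subtype's range constraint can fail spuriously on perfectly good calls, while the base-subtype interface cannot.

```python
import math

def make_subtype(lo, hi):
    """Model an Ada range-constrained subtype: conversion checks the bounds."""
    def check(x):
        if not (lo <= x <= hi):
            raise ValueError("CONSTRAINT_ERROR")
        return x
    return check

# e.g. the generic actual: subtype S is FLOAT range -1.0 .. 1.0
unit_interval = make_subtype(-1.0, 1.0)

def sin_constrained(x):
    # If the parameter were of the actual subtype, this good call is rejected.
    return math.sin(unit_interval(x))

def sin_base(x):
    # Declared with the base subtype: no range check at the interface.
    return math.sin(x)

assert abs(sin_base(2.0 * math.pi)) < 1.0e-12   # fine: result is in range anyway
try:
    sin_constrained(2.0 * math.pi)               # spurious CONSTRAINT_ERROR
    raised = False
except ValueError:
    raised = True
assert raised
```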
| Some hardware implementations of the elementary functions do not provide the accuracy required by ISO DIS 11430, whose requirements are more stringent than what such hardware achieves. (They do not, for example, use a sufficiently accurate representation of pi for trigonometric argument reduction.) Even though the specification of GENERIC_ELEMENTARY_FUNCTIONS will be in the core, we considered not including the accuracy requirements of ISO DIS 11430 there, but rather retaining them only in the Numerics Annex. This would have allowed Ada implementations that elect, for reasons of efficiency, to compute the elementary functions in hardware to conform nonetheless. We concluded, however, that it was better to require all conforming implementations of the elementary functions to meet the accuracy requirements, which are not particularly burdensome. (It has been observed that most vendors of serious mathematical libraries, including the hardware vendors, are now committing themselves to implementations that are fully accurate throughout the domain, since practical software techniques for achieving that accuracy are becoming more widely known.) An implementation that could not meet the accuracy requirements with hardware implementations of the elementary functions could choose to provide non-conforming hardware implementations in addition to conforming software implementations.
|
| An implementation that accommodates signed zeros (i.e., one for which FLOAT_TYPE'SIGNED_ZEROS is TRUE) is required to exploit them in several important contexts, in particular the signs of the zero results from the ``odd'' functions SIN, TAN, and their inverses and hyperbolic analogs, at the origin, and the sign of the half-cycle result from ARCTAN and ARCCOT; this follows a recommendation, in [Kahan 87], that provides important benefits for complex elementary functions built upon the real elementary functions, and for applications in conformal mapping. 
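The required behavior at the origin can be observed directly on an implementation with IEEE arithmetic; a typical IEEE math library already behaves this way:

```python
import math

# Behavior of a typical IEEE (SIGNED_ZEROS = TRUE) math library: the odd
# functions preserve the sign of a zero argument at the origin.
neg_zero = -0.0
assert math.copysign(1.0, math.sin(neg_zero)) == -1.0    # SIN(-0.0) is -0.0
assert math.copysign(1.0, math.tan(neg_zero)) == -1.0    # TAN(-0.0) is -0.0
assert math.copysign(1.0, math.atan(neg_zero)) == -1.0   # ARCTAN(-0.0) is -0.0
assert math.copysign(1.0, math.sin(0.0)) == 1.0          # SIN(+0.0) is +0.0
```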
Exploitation of signed zeros at the many other places where elementary functions can return zero results is left implementation defined, since no obvious guidelines exist for these cases.

S.L.4. Primitive Functions

For a general rationale for the primitive functions, the reader is referred to [Dritz 91b]. They are required for high-quality, portable, efficient mathematical software such as is provided in libraries of special-function routines, and some are of value even for more mundane uses, like I/O conversions and software testing. The primitive functions are provided in support of Requirement R11.1-A(1). The casting of the primitive functions as attributes, rather than as functions in a generic package (e.g., GENERIC_PRIMITIVE_FUNCTIONS, as defined for Ada 83 in ISO CD 11729, ``Proposed Standard for a Generic Package of Primitive Functions for Ada''), befits their primitive nature and allows them to be used as components of static expressions, when the arguments are static. MAX and MIN are particularly useful in this regard, since they are sometimes needed in expressions in numeric type declarations, for example to ensure that a requested precision is limited to the maximum allowed. The functionality of SUCCESSOR and PREDECESSOR, from the proposed GENERIC_PRIMITIVE_FUNCTIONS standard, is provided by extending the existing attributes SUCC and PRED to floating-point types. Note that T'SUCC(0.0) returns the smallest positive number, which is a denormalized number if T'DENORM is TRUE and a normalized number if T'DENORM is FALSE; this is equivalent to the ``fmin'' derived constant of LCAS [ISO/IEC 91]. Most of the other constants and operations of LCAS are provided either as primitive functions or other attributes in Ada 9X; those that are absent can be reliably defined in terms of existing attributes. 
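Taking IEEE double precision as a stand-in for T (so that T'DENORM would be TRUE), the identification of T'SUCC(0.0) with LCAS's ``fmin,'' and the fact that the exponent of a denormalized number lies below T'MACHINE_EMIN, can be checked numerically:

```python
import math, sys

# IEEE double precision (binary64) as a stand-in for type T:
succ_zero = math.ulp(0.0)          # T'SUCC(0.0), the smallest positive number
assert succ_zero == 2.0 ** -1074   # a denormalized number (LCAS's "fmin")
assert succ_zero / 2.0 == 0.0      # nothing lies between it and zero

# Its EXPONENT-style exponent lies below 'MACHINE_EMIN, which any bounds
# on the results of an EXPONENT-like function must accommodate:
_, exponent = math.frexp(succ_zero)
machine_emin = sys.float_info.min_exp        # -1021 for binary64
assert exponent == -1073
assert exponent < machine_emin
```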
The proposed separate standard for GENERIC_PRIMITIVE_FUNCTIONS stated that the primitive functions accept and deliver machine numbers, which implies that they never receive arguments in extended registers. Conceptually, that requirement could be removed in Ada 9X, though we are by no means certain that it is wise to do so, and we are still investigating the issue. If the primitive functions always receive machine numbers, then, for example, the result of T'EXPONENT(X) can be assumed to be in the range

   T'MIN(T'EXPONENT(T'PRED(0.0)), T'EXPONENT(T'SUCC(0.0))) ..
      T'MAX(T'EXPONENT(T'BASE'FIRST), T'EXPONENT(T'BASE'LAST))

and an integer type with that range can be declared to hold any value that can be returned by T'EXPONENT(X). (These bounds accommodate the fact that T'EXPONENT of a denormalized number returns a value less than T'MACHINE_EMIN, and they also accommodate implementations that may use radix-complement representation.) However, if we define the primitive functions so that they must accept the range of arguments that they might receive in extended registers, then we cannot bound the results of T'EXPONENT(X) by properties of the implementation, since the range of extended registers is nowhere reflected in such properties. In that case, one would be advised to construct an integer type of widest available range (SYSTEM.MIN_INT .. SYSTEM.MAX_INT) for the type of a variable used to hold values delivered by the EXPONENT attribute. If extended range and precision are allowed in the arguments of the primitive functions, T'SUCC, T'PRED, and T'ADJACENT will, nevertheless, deliver machine numbers of the type T. One primitive function that will be allowed to receive an argument in an extended register is T'MACHINE(X), an attribute that was not represented by a function in GENERIC_PRIMITIVE_FUNCTIONS. 
This attribute exists specifically to give the programmer a way to discard excess precision if the implementation happens to be using it, and if the details of an algorithm are sensitive to its use. It also has the side effect of guaranteeing that a value outside the range T'BASE'FIRST .. T'BASE'LAST is not propagated. The attribute is a no-op in implementations that do not use extended registers. Its definition allows efficient implementations on representative hardware. Thus, on IEEE hardware, it may be implemented simply by storing an extended register into the shorter storage format of the target type T; on implementations having types with extra precision but not extra exponent range, it may be implemented by storing the high-order part of a register pair into storage. Overflow may occur in the former case but cannot occur in the latter; in both cases, values slightly outside the range T'BASE'FIRST .. T'BASE'LAST can escape overflow by being rounded to an endpoint of the range. (This actually happens on IEEE hardware.) The related primitive function T'MODEL(X) also accepts its argument in an extended register and shortens the result to a machine number. In this case, however, the loss of low-order digits is potentially more severe. The result is guaranteed to be a model number within the range -T'MODEL_LARGE .. T'MODEL_LARGE. This function returns its floating-point argument perturbed to a nearby model number (if it is not already a model number) in the same way that is allowed for operands and results of the predefined arithmetic operations (see L.1.5), so it introduces no more error than what is already allowed. By forcing a quantity to a nearby model number, it guarantees that subsequent arithmetic operations and comparisons with the number will experience no further perturbation and will therefore produce predictable and consistent results. For example, suppose we have a situation like

   if X > 1.0 then
      ... -- several references to X ...
   end if;

in which X can be extremely close to 1.0. If X is in the first model interval above 1.0, the semantics of floating-point arithmetic allow the references to X inside the if statement to behave as if they had the value 1.0, seemingly contradicting the condition that allows entry there, and multiple references could behave as if they yielded slightly different values. If this is intolerable, then one can write

   Y := T'MODEL(X);
   if Y > 1.0 then
      ... -- several references to Y ...
   end if;

The value of Y can be no worse than some value already allowed for the result of the operation that produced X. If the if statement is entered, we are guaranteed that Y exceeds 1.0 and that all references to it yield the same value. If X has a value slightly exceeding 1.0, the if statement might not be entered, but that was also true in the earlier example. In implementations in which the model numbers coincide with the machine numbers, T'MODEL reduces to T'MACHINE, and if in that case extended registers are not being used, both are no-ops.

S.L.5. Complex Arithmetic
|
| Many numerical application areas depend on the use of complex arithmetic. Complex fast Fourier transforms are used, for example, in conjunction with radar, sonar, and electro-optical sensors; conformal mapping uses complex arithmetic in fluid-flow problems, including the analysis of velocity fields around radomes, torpedo noses, and airfoils; and a/c circuit analysis is classically modeled in terms of complex exponentials.
|
| Two generic packages that support applications of complex arithmetic are defined in the Numerics Annex. One, called GENERIC_COMPLEX_TYPES, defines a complex type and the usual arithmetic operations on objects of the type, including appropriate operations on combinations of real and complex operands. The other, called GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS, provides a complete set of complex elementary functions. 
| | These generic packages are based on the work of the SIGAda Numerics Working | Group, the Ada-Europe Numerics Working Group, and the WG9 Numerics | Rapporteur Group, which is expected to lead to secondary ISO standards (for | Ada 83) in this area. | | The specification for GENERIC_COMPLEX_TYPES is an adaptation of that | proposed as a separate ISO standard for Ada 83. The minor differences | exploit new features of Ada 9X. The major differences amount to a | subsetting of that proposal: complex vector and matrix types and associated | operations have been omitted, resulting in a much smaller generic package; a | related generic package of real types, which is primarily valued for its | real vector and matrix types and associated operations, has been omitted; | and a generic package of mixed real/complex vector and matrix operations has | been omitted. Ignoring the vector and matrix types and operations is | justified on the grounds that the language features for supporting vectors | and matrices are not as well developed in Ada 9X as in Fortran 90, lessening | the incentive to ``compete'' in this area; furthermore, the structure and | relationship of the various generic packages proposed in the separate ISO | standard, which are mostly concerned with vector and matrix issues, are | controversial and have not yet been evaluated by the WG9/SC22/JTC1 | hierarchy, making their adoption for Ada 9X premature. No controversies | surround the ``scalar'' complex type. | | For a general rationale for the generic package of complex types, the reader | is referred to [Hodgson 91] until a revision is published. | | As explained there, the COMPLEX type is defined as a visible type, rather | than a private type, so that arbitrary complex constants can be expressed as | aggregates in a style that is familiar from other languages. 
| As in GENERIC_ELEMENTARY_FUNCTIONS (see S.L.3), the Ada 9X version of GENERIC_COMPLEX_TYPES capitalizes on the use of T'BASE as a type mark to declare the formal parameter types and result types of the included operations, where they are real (i.e., not complex), to be of the unconstrained (base) subtype of the generic formal type. The analogous benefit for the complex formal parameter types and result types is achieved by using the unconstrained (base) subtype of the generic formal type for the subtype of the components of the COMPLEX type.
|
| The two-parameter composition function COMPOSE_FROM_CARTESIAN constructs a complex value from the given real and imaginary parts. The one-parameter form of this composition function ``promotes'' a real value to a complex value with a zero imaginary part. The two variants could easily have been combined into a single function using a default value of zero for the second formal parameter (corresponding to the imaginary part). The one-parameter form is provided separately, however, as the complex equivalent of the real (i.e., not complex) unary adding operator "+". A general (real or complex) generic package for, say, the solution of linear equations, part of whose generic formal part is given by
|
|    type FLOAT_TYPE is digits <>;
|    type SCALAR_TYPE is private;
|    with function CONVERT (X : FLOAT_TYPE) return SCALAR_TYPE;
|
| can then be instantiated with CONVERT getting either "+" or COMPOSE_FROM_CARTESIAN. In the former case, both FLOAT_TYPE and SCALAR_TYPE would get the same floating-point type; in the latter case, SCALAR_TYPE would get an appropriate complex type.
|
| The Cartesian representation was chosen over a polar representation for the complex type to avoid canonicalization problems and because it is more commonly encountered in practice. An explicit choice of representation is required in any case to give meaning to the accuracy requirements. 
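The instantiation pattern can be mimicked in executable form (a Python sketch with our own names standing in for the Ada generic): a conversion parameter plays the role of the generic formal CONVERT, bound either to the identity (Ada's unary "+") or to a promotion into the complex domain (the one-parameter form of COMPOSE_FROM_CARTESIAN).

```python
def compose_from_cartesian(re, im=0.0):
    """Two-parameter form; with the default, the one-parameter form that
    promotes a real value to a complex value with a zero imaginary part."""
    return complex(re, im)

def dot(xs, ys, convert):
    """Toy 'generic' numeric routine usable at a real or a complex type;
    'convert' stands in for the generic formal function CONVERT."""
    acc = convert(0.0)
    for x, y in zip(xs, ys):
        acc += x * y
    return acc

# Real "instantiation": CONVERT gets the identity (unary "+").
real_result = dot([1.0, 2.0], [3.0, 4.0], convert=lambda r: r)
assert real_result == 11.0

# Complex "instantiation": CONVERT gets the promotion function.
complex_result = dot([1 + 2j, 2.0], [3.0, 4 - 1j], convert=compose_from_cartesian)
assert complex_result == 11 + 4j
```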
The | functions MODULUS (renamed abs), ARGUMENT, and COMPOSE_FROM_POLAR provide | for the decomposition of a complex value in Cartesian form into polar | components and the composition of a complex value in Cartesian form from | polar components. | | Mixed-mode (i.e., complex/real) operations are provided to allow | implementations to take advantage of the fact that one operand has a zero | imaginary part, which reduces the number of predefined operations that must | be performed. | | Implementations are required to provide standard instantiations of | GENERIC_COMPLEX_TYPES (e.g., COMPLEX_TYPES, with FLOAT_TYPE => FLOAT; | SHORT_COMPLEX_TYPES, with FLOAT_TYPE => SHORT_FLOAT; LONG_COMPLEX_TYPES, | with FLOAT_TYPE => LONG_FLOAT; etc.) so that applications can effectively | treat single-precision complex, double-precision complex, etc. as predefined | types with the same convenience that is provided for the predefined | floating-point types (or, more importantly, so that independent libraries | assuming the existence and availability of such types without the use of | generics can be constructed and freely used in applications). Note: UI-0048 | requires that SHORT_FLOAT be used for a floating-point type with a 'SIZE | near 32 bits, and that LONG_FLOAT be used for a floating-point type with a | 'SIZE near 64 bits; these are the customary single- and double-precision | sizes, respectively. | | The specification of GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS is an adaptation | of that proposed as a separate ISO standard for Ada 83. The only difference | is in the generic formal part, which in Ada 9X specifies a generic formal | package instantiation (i.e., of GENERIC_COMPLEX_TYPES) as the sole generic | formal parameter. 
| Although this ties GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS more tightly to GENERIC_COMPLEX_TYPES in Ada 9X, making it more difficult for the user to couple the former with his or her own definition of COMPLEX, it provides for more flexibility and efficiency by not obliging implementations of GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS to perform all their computations in the real domain after decomposing complex arguments into their real and imaginary components.
|
| For a general rationale for the generic package of complex elementary functions, the reader is referred to [Squire 91] until a revision is published.
|
| We have resisted the temptation to scale back the set of functions provided in GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS (for example, to the much smaller set of complex elementary functions provided by Fortran) on the grounds that uniformity with GENERIC_ELEMENTARY_FUNCTIONS is important (i.e., the contents of the former can be more easily remembered if they are essentially the same as those of the latter).

S.L.6. Interface to Fortran
|
| In the past, much mathematical software has been programmed in Fortran, and this is likely to continue into the future. It is only fitting, therefore, that the Ada 9X Numerics Annex contain a provision to facilitate mixed-language programming in Fortran and Ada. The features for interfacing to Fortran take the form of support for a language name of FORTRAN in the IMPORT, EXPORT, and LANGUAGE pragmas and a predefined package of Fortran-compatible types called FORTRAN_TYPES, both of which must be provided by Numerics Annex conforming implementations if Fortran is available in the same environment.
|
| The types declared in FORTRAN_TYPES are those corresponding to the default representations of the Fortran scalar types. 
| It is intended that objects or values of such types be capable of being passed between Ada and Fortran without conversion; thus, the representation in Ada must be the same as that used by Fortran in the same environment. The specification of FORTRAN_TYPES may be supplemented by pragmas as necessary to achieve this goal.
|
| The complex type defined in FORTRAN_TYPES is derived from the complex type exported by an appropriate instantiation of GENERIC_COMPLEX_TYPES (see S.L.5).
|
| The Numerics Annex requires that a version of FORTRAN_TYPES be provided for each implementation of Fortran that is available in the same environment; if more than one version is provided, then the names of the packages must identify the implementation of Fortran in an implementation-defined manner. Furthermore, each version must contain the definitions of any additional types that are provided by the corresponding Fortran implementation, named as follows: for an implementation of Fortran (most likely Fortran 77) that provides types like INTEGER*n, LOGICAL*n, etc., the additional types shall be named INTEGER_n, LOGICAL_n, and so forth; for an implementation of Fortran 90 that provides multiple ``kinds'' of types, like INTEGER (KIND=n), LOGICAL (KIND=n), etc., the additional types shall be named INTEGER_KIND_n, LOGICAL_KIND_n, and so forth. This convention does not extend to CHARACTER, however; a type compatible with the Fortran type CHARACTER*n is obtained as the constrained array subtype FORTRAN_TYPES.CHARACTER(1..n).
|
| FORTRAN_TYPES and its versions are inherently implementation defined. It should be clear that this feature is provided not to make it easier for Fortran programmers to write portable numerical software in Ada, but only to make it easier to interface to existing Fortran code within a given environment.

References

[Brown 81] W. S. Brown. A simple but realistic model of floating-point computation. 
TOMS 7(4):445-480, December, 1981.

[Dritz 91a] K. W. Dritz. Rationale for the Proposed Standard for a Generic Package of Elementary Functions for Ada. Ada Letters XI(7):47-65, Fall, 1991.

[Dritz 91b] K. W. Dritz. Rationale for the Proposed Standard for a Generic Package of Primitive Functions for Ada. Ada Letters XI(7):82-90, Fall, 1991.

[Hilfinger 90] P. N. Hilfinger. Implementing Ada Fixed-point Types Having Arbitrary Scales. Technical Report No. UCB/CSD 90/582, University of California, Berkeley, CA, June, 1990.

[Hodgson 91] G. S. Hodgson. Rationale for the Proposed Standard for Packages of Real and Complex Type Declarations and Basic Operations for Ada (including Vector and Matrix Types). Ada Letters XI(7):131-139, Fall, 1991.

[IEEE 85] Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std. 754-1985; ISO/IEC 559:1989 edition, IEEE, 1985.

[IEEE 87] Standard for Radix-Independent Floating-Point Arithmetic, ANSI/IEEE Std. 854-1987 edition, IEEE, 1987.

[ISO/IEC 91] Information technology -- Programming languages -- Language compatible arithmetic, DIS 10967 edition, ISO/IEC, 1991.

[Kahan 87] W. Kahan. Branch Cuts for Complex Elementary Functions, or Much Ado About Nothing's Sign Bit. In The State of the Art in Numerical Analysis, Clarendon Press, 1987, Chapter 7.

[Squire 91] J. S. Squire. Rationale for the Proposed Standard for a Generic Package of Complex Elementary Functions. Ada Letters XI(7):166-179, Fall, 1991. 
Index

Canonical form
  definition L-1
  of denormalized floating-point machine numbers L-2
  of floating-point model numbers L-2
  of normalized floating-point machine numbers L-1
Complex arithmetic L-5
Complex elementary functions L-6
DENORM (new predefined attribute) L-2
Denormalized numbers L-2
Elementary functions L-4
ELEMENTARY_FUNCTIONS_EXCEPTIONS (predefined package) L-4
Fixed point L-3
Fixed-point
  accuracy requirements L-3
  arithmetic model L-3
  attributes L-3
  attributes of model numbers eliminated L-4
  model numbers eliminated L-4
  type declarations L-3
  values L-3
Floating-point L-1
  accuracy requirements L-2
  arithmetic model L-1
  attributes of machine numbers L-1
  attributes of model numbers L-2
  denormalized machine numbers L-2
  machine numbers L-1
  model numbers L-2
  type declarations L-3
Fortran interface L-6
FORTRAN_TYPES (predefined package) L-6
GENERIC_COMPLEX_ELEMENTARY_FUNCTIONS (predefined generic package) L-6
GENERIC_COMPLEX_TYPES (predefined generic package) L-5
GENERIC_ELEMENTARY_FUNCTIONS (predefined generic package) L-4
Interface to Fortran L-6
Model numbers L-2
Primitive functions (new predefined floating-point attributes) L-5
Random number generators L-7
Signed zeros L-2, L-5, L-6
SIGNED_ZEROS (new predefined attribute) L-2