!topic LSN-1076 on Wraparound (i.e. modular/unsigned) types !key LSN-1076 on Wraparound (i.e. modular/unsigned) types !reference RM9X-3.5.4;4.0 !reference RM9X-4.6(28-33);4.0 !reference RM9X-4.2(9);4.0 !reference 93-3322.a Randy Brukardt 93-11-18 modular conversion !reference 93-3324.a Norman Cohen 93-11-19 modular conversion !reference 93-3325.a Tucker Taft 93-11-19 modular conversion !reference 93-3327.a Norman Cohen 93-11-19 modular conversion !reference 93-3426.a Randy Brukardt 93-12-08 modular not !reference 93-3428.a Ted Baker 93-12-09 nonbinary modulus !reference 93-3429.a Tucker Taft 93-12-09 nonbinary modulus !reference 93-3431.a Tucker Taft 93-12-09 modular not !reference 93-3450.a Randy Brukardt 93-12-10 modular not !reference 93-3472.a Philippe Kruchten 93-12-16 nonbinary modulus !reference 93-3476.a Norman Cohen 93-12-16 nonbinary modulus !reference 93-3486.a Randy Brukardt 93-12-17 array indexing !reference 94-3613.a Robert Eachus 94-01-05 nonbinary modulus !reference 94-3620.a Norman Cohen 94-01-06 nonbinary modulus !reference 94-3622.a Robert Eachus 94-01-06 nonbinary modulus !reference 94-3625.a Norman Cohen 94-01-07 nonbinary modulus !reference 94-3629.a Robert Eachus 94-01-07 nonbinary modulus !reference 94-3630.a Randy Brukardt 94-01-07 nonbinary modulus !reference 94-3661.d Swiss Delegation 94-01-13 subranges !from Tucker Taft $Date: 94/02/14 18:18:37 $ $Revision: 1.3 $ !discussion As illustrated by the set of references of this LSN, there have been a number of comments relating to the rules for "modular" types, including suggestions for changing their syntax and name. To avoid prejudicing the discussion, I will refer in this LSN to types with "wraparound" semantics as "wraparound" types. This LSN covers the following issues: a) conversion to/from other numeric types; static/dynamic b) array indexing and subranges c) 1's complement issues d) nonbinary moduli i) implementability, etc. ii) definition of "not", "and", "or", "xor" iii) usefulness/elegance/etc. e) overall model (modular congruence vs. wraparound operators) For those who like to peek ahead to the conclusion, this LSN makes the following recommendations: a) conversion to and/or from wraparound types should raise Constraint_Error if the value, prior to any "wrapping around," is outside the range of the target subtype; that is, no wraparound is performed as part of conversion. b) the rules for null string literals should be altered to specify that Constraint_Error is raised if the low bound equals the low bound of the base range of the index type, rather than talking about the "predecessor." Subranges should still be supported, to allow wraparound subtypes to be used for indexing into arrays. c) 1's complement machines should be accommodated by giving explicit permission to have a wraparound type where its arithmetic operators wrap around at 2**N-1, even though the type can represent 2**N distinct bitpatterns. d) nonbinary moduli should be supported, retaining the current advice of Integer'Last as the Max_Nonbinary_Modulus; as pointed out in various messages, this is a useful generalization of unsigned, and is consistent with Ada's other value-oriented (rather than implementation-oriented) numeric models. e) The model should be simplified, by expressing it in terms of wraparound rules for specific arithmetic operations, rather than in terms of equivalence classes based on modular congruence; that is, the term "principal" value would be dropped -- there would be no "secondary" values. --------------------------------- CONVERSION RULES In RM9X-4.6(28-30), wraparound types are handled specially with respect to conversion. Wraparound is performed both on conversion to a wraparound type, and conversion from a wraparound type to a signed integer type. Note that wraparound is not performed on conversion to or from a floating point type (RM9X-4.6(32-33), nor on implicit conversion from a static universal integer (since RM9X-4.6(36) says it is illegal to be outside the base range of the expected type). Several comments have indicated discomfort with the rule of wrapping around on conversion from a wraparound type to a signed type, particularly because whether to wraparound is determined by the target *sub*type, rather than the target type. The rationale for the current rule is given in 93-3325.a and also in AARM-4.6(30.a-30.c);4.0. Given the discomfort with the current rule, it seems reasonable to consider changing the rule to require Constraint_Error when going from wraparound to signed if the value (prior to any adjustment) is outside the target subtype. However, it seems inconsistent (to the MRT) to happily convert -32768 to +32768 when going from signed to wraparound (mod 2**16), and then not convert back to -32768 when going the other way if the target signed subtype is only 16 bits. Hence, we would recommend that if Constraint_Error is expected going one way, it should also be raised going the other way. Furthermore, in the December DR/XRG meeting, the minutes report that "the general model of type conversion has always been that the converted value was the same as the resulting value after the expected necessary rounding/sliding." This "general model" would seem to apply to both directions, if it applies to either direction of conversion between wraparound and signed. We therefore recommend making the rule symmetrical, namely Constraint_Error is raised on conversion to and/or from a wraparound type based on whether the value being converted is in the range of the target subtype. No adjustment (other than rounding if converting from real) is performed prior to the check. This preserves the nice property that on conversion of an IN OUT parameter, if Constraint_Error is not raised going "in" to the procedure, and the procedure doesn't update the parameter, one can reasonably expect no Constraint_Error on the way out (real<->integer conversion of an IN OUT parameter could still break this rule in some weird edge cases due to rounding). Unchecked conversion, or the explicit use of "mod" would be necessary to do a "conversion" that does wraparound. ARRAY INDEXING AND SUBRANGES Randy Brukardt in 93-3486.a points out a couple of problems relating to indexing arrays with wraparound types: i) null slices where the low bound = the Wraparound_Type'First. ii) null string literals where the low bound = Wraparound_Type'First. As an example of (i): type Nibble is . type Nibble_String is array(Nibble range <>) of Character; Nibble_Names : Nibble_String(Nibble) := ('0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F'); X_Len : constant := ... complicated expression ...; X : constant Nibble_String := Nibble_Names(Nibble'First .. Nibble'First + X_Len - 1); -- ***** If X_Len = 0, then X'Last is (-1) mod 2**4 = 15 -- ***** and hence X'Length = 16, not 0! As an example of (ii): if X = "" then ... -- **** What is high bound of null string literal? -1 = 15 mod 16? Problem (ii) is relatively easily "fixed" by saying that the evaluation of a null string literal raises Constraint_Error if the low bound of the literal is equal to the low bound of the base range of the index type. Currently the rule in 4.2(10) says that a check is made that the low bound "has a predecessor in the base range of the index type." This is easily changed to say a check is made that the low bound is not equal to the low bound of the base range of the index type. Of course, the user will now get a Constraint_Error (and hopefully a warning at compile-time), rather than a null string literal. Problem (i) is not easily fixed, and we don't propose to do so. Perhaps a NOTE in the RM warning of this problem would be useful. There are actually two possible outcomes. If the bounds of the array go from the low bound to the high bound of the index type (as in this example), then no Constraint_Error is raised, but the slice results in the whole array, rather than a null slice. The other possible outcome is when the index subtype does not cover the whole base range of the index type, in which case Constraint_Error would be raised since index_subtype'First-1 would not be in index_subtype'Range, even if it wraps around. Since null slices are relatively rare, and arrays whose index covers the full base range are also rare, we expect a null slice of such an array to be exceedingly rare. Conceivably a very friendly compiler could produce a warning when something like this might occur. In any case, we don't see any viable solution to problem (i), so we have to hope it will be rare! The Swiss delegation wondered whether range constraints on wraparound types were sufficiently useful to allow them. However, we certainly expect users to use wraparound types for array indexing and for-loops. Both of these contexts presume the use of range constraints to define the index subtype of the array (or a slice thereof), and the range of the for-loop. Disallowing named subtypes of wraparound types is possible, but that would preclude the use of named subtypes to define the index subtype of an array indexed by a wraparound type. Hence, we recommend that range constraints be allowed for wraparound types. Our expectation is that by simplifying the conversion rules (see above) and the general "wraparound" model (see below), the underlying desire for simplification will be achieved. ONE'S COMPLEMENT MACHINES [Note -- this discussion is largely academic. Many readers should just skip down to "NONBINARY MODULI" at this point.] Although one's complement machines are becoming rarer and rarer, it seems worth spending a little time examining how they relate to wraparound types. One goal of providing wraparound types is to be able to match the "typical" hardware support for "unsigned" types. For 2's complement machines, the "unsigned" operations generally correspond to wraparound versions of the "signed" operations, with the wraparound at 2**N, where N is the number of bits in the word. For 1's complement machines, the arithmetic operations actually wraparound modulo (2**N)-1, where N is the number of bits (N=36 for the Unisys 1100/2200 line of machines). Nevertheless, there are still 2**N different bit patterns -- both all-bits-on and all-bits-off correspond arithmetically to zero. Given that we have also provided bit-wise "and", "or", "xor", and "not" for wraparound types, 1's complement machines create a quandary. Clearly all-bits-on and all-bits-off behave completely differently with respect to the bit-wise logical operators, whereas they are pretty much indistinguishable with respect to the arithmetic operators. One way to look at the issue is in terms of T'Last vs. T'Modulus for a wraparound type. Normally, T'Last = T'Modulus-1. However, for a 1's complement machine, the "full word" wraparound type would have T'Last = T'Modulus = 2**N - 1. As you might expect, when used with an arithmetic operator, T'Last for such a type behaves much like zero because it is equal to T'Modulus. If we were to support a wraparound type where T'Last >= T'Modulus, we have to decide what happens with the various categories of operations: a) arithmetic (+, -, *, /) b) relational (both ordering and equality) c) bit-wise logical. d) membership in a range Clearly we want the bit-wise logical operators to operate on the bits, and not wraparound. Clearly we want the arithmetic operators to wraparound in some way, though it is not obvious whether wraparound should happen when a result is produced (meaning that the result of an arithmetic operator is always in 0..T'Modulus-1), or instead wraparound happens only as necessary to keep the result in 0..T'Last. Unfortunately, one's complement hardware differs to some degree in when "minus zero"s can be produced. This implies that we might do best to leave all of this implementation-defined. However, if we do feel the need to specify a rule, the most consistent rule seems to be that the only time the modulus matters is when the high bound (T'Last) is overflowed by an arithmetic operator (or 'Succ), at which point the "wraparound" occurs by subtracting the modulus. Alternatively, if subtraction (or 'Pred) underflows, then the modulus is added in to produce the wraparound. This implies that on addition, positive zero is never produced (except for +zero + +zero), or on subtraction, negative zero is never produced (except for -zero - +zero). Unary minus, as in "-X", would be considered equivalent to -zero - X, and so "- +zero" would produce -zero, whereas "- -zero" would produce +zero. Multiplication would presumably follow the rules for addition, and only produce +zero as a result when one of the operands is +zero. Note that in the above, although we are referring to all-ones as "-zero", we are presuming that for such an unsigned type, the relational operators would treat "-zero" as the largest value of the unsigned type (i.e. T'Last), and hence greater than all other values. It just happens that due to wraparound, adding T'Last is generally equivalent to adding zero, and subtracting T'Last is generally equivalent to subtracting zero. But that doesn't change the fact that T'Last compares greater than all other values. To summarize, if we want to accommodate full-word 1's complement unsigned types, then we have a situation where T'Last equals T'Modulus, instead of equaling T'Modulus-1. The most consistent rule seems to be to ignore the wraparound until the normal (non-wraparound) arithmetic result would exceed T'Last, or be less than zero, at which point the result would be canonicalized by adding or subtracting T'Modulus to bring the value back into range. We could instead leave the rule for "canonicalizing" the result of arithmetic operators implementation-defined. In any case, it seems that the (unsigned) relational, logical, and membership operations should perform no wraparound, and treat T'Last (all ones) as the largest value of the type (and hence *not* equal to +zero). NONBINARY MODULI Although 1's complement machines introduce the possibility of nonbinary moduli for fullword wraparound types, we consider 1's complement machines sufficiently rare to ignore for the purposes of deciding whether it is appropriate to provide general support for nonbinary moduli. Any proposal for wraparound types will probably need some special case to deal with the T'Last >= T'Modulus case introduced by 1's complement machines, so there seems no point in using support for 1's complement machines as any sort of evaluation criteria for a given wraparound-type proposal. Hence, in the following discussion, we will make no further mention of 1's complement machines, with the faith that if it is deemed necessary to support them, they can be accommodated in any proposal with an appropriate set of "Implementation Permissions." There seem to be various issues associated with whether Ada 9X allows nonbinary moduli for wraparound types: i) Implementation efficiency and the "WYSIWYG" principle; ii) Definition of bit-wise logical operators; iii) Usefulness, elegance, alternative syntaxes, etc. Implementation Issues, WYSIWYG, etc. ----------------------------------- The WYSIWYG (what-you-see-is-what-you-get) principle as it relates to implementation efficiency is that a language designer should not bury an inherently expensive operation inside a language operation that "looks" inexpensive. There is some concern that the arithmetic operations of a wraparound type will violate this WYSIWYG principle if the modulus of the type is not a power of 2 (what we will call a "nonbinary modulus"). However, as attested by certain implementors (93-3472.a and 94-3630.a), addition and subtraction do not really suffer from this problem, since the necessary wraparound can be performed by a conditional subtraction or addition of the modulus, rather than a full divide-with-remainder operation. This leaves multiplication, division, and exponentiation. Division imposes no overhead, since the result is guaranteed not to wraparound. Exponentiation is generally implemented as a series of squarings and multiplications, so this reduces to multiplication. So the fundamental WYSIWYG issue is that multiplication of a wraparound type with a nonbinary moduli may involve both a multiplication and a division (though probably some clever algorithm exists that uses multiplication by a precomputed inverse followed by a fixup instead of division). However, this is exactly analogous to the situation with fixed-point types, where nonbinary "smalls" may result in multiplication requiring a division, whereas binary "smalls" requires only multiplication and shifts. Note also that exponentiation for floating point involves a final divide if the exponent is negative. Hence, there is precedent for the "*" (and the "**") operator already including a final division as part of their dynamic semantics, depending on the characteristics of the type (nonbinary smalls) or the values of the operands (negative exponent). Furthermore, note that this must be traded off against an alternative where the programmer is forced to put in explicit "mod"s, which might result in poorer code quality for the presumably more common addition and subtraction operations, since the ranges of the operands are no longer guaranteed to be 0..modulus-1. Finally, we must recognize that the "mod" operator already has significantly different performance characteristics, depending on the value of the right operand. If the right operand is a power of 2, then the mod operation reduces to a simple "and". If the right operand is not a power of 2, then "mod" generally requires a divide-with-remainder, and possibly a fixup as well to deal properly with signs (this latter fixup is not an issue for unsigned types). Hence, we conclude that there is a bit of a WYSIWYG issue w.r.t. the "*" operator for nonbinary-modulus wraparound types, but that it is consistent with the level of variability in efficiency already associated with "*" (and other multiplying operators) for existing types. Bit-Wise Logical Operators -------------------------- The presumed purposes for using bit-wise logical operators with wraparound types that have a nonbinary modulus include the following: a) "and" -- conditional subtract (i.e. clear a bit if set) b) "or" -- conditional add (i.e. set a bit if clear) c) "xor" -- performing some kind of hash d) "not" -- "modular" complement Given these presumptions, it seems appropriate that these operations wraparound if the result is greater than T'Last (or in the case of "not", that the result be defined by "not X = T'Last - X"). [Note: By specifying T'Last rather than T'Modulus-1 as the overflow point for wraparound, and in the definition of "not", we accommodate the 1's complement case discussed in the previous section where T'Last = T'Modulus rather than T'Modulus-1, and ensure that "not" does a bit-wise complement for a fullword 1's complement wraparound type.] Usefulness, Elegance, Alternative Syntaxes, Etc... --------------------------- Some of the uses of non-binary modular types are discussed in 93-3429.a and 94-3613.a. LSN-1055 illustrates a usage in a subtract-with-borrow Fibonacci random number generator. In this example, the algorithm uses a cyclic array of floating point values, and the indices into the array are of a wraparound type. Here, the choice of generator determines the modulus for the wraparound, and there is no particular expectation that it would be a power of 2 (typical values for the modulus are 25, 28, etc.). In this example, the wraparound indices are simply decremented, so there is no need for a divide-with-remainder at run-time, only a conditional add. One of the features of Ada numerics is that the characteristics of the type are determined by the requirements of the particular algorithm. In this sense, it seems natural to accommodate non-binary moduli when that is what the algorithm implies. When considering imposing a restriction against nonbinary moduli, we could either leave the syntax as is: type T is mod ; and simply impose an "arbitrary" restriction that the value of the be a power of 2. Currently, we allow an implementation to impose a different upper bound on nonbinary moduli and binary moduli, so this is equivalent to requiring the upper bound on nonbinary moduli to be zero for all implementations. It is a bit uncomfortable introducing a feature using a preexisting reserved word such as "mod" that is general, in that it supports any integer value as a modulus in a "mod" operation, and then imposing what appears to be an arbitrary restriction that in the new usage context, the modulus must be a power of 2. Hence, we have considered alternative syntaxes that do not have the same appearance of imposing an arbitrary restriction: type T is bits ; or type T is unsigned ; Both of these syntaxes would define an unsigned, wraparound type, with range 0..2**-1. Except in certain full-word 1's complement situations, the modulus would be 2**. [For full-word 1's complement situations (that is when the is, say, 36 on a 1's complement, 36-bit word machine), presumably the modulus would be, say, 2**36-1, even though all-bits-on would be producable as the result of a predefined operator.] There are a few uncomfortable aspects to these alternative syntaxes. Both imply the addition of a new reserved word (either "bits" or "unsigned"). It seems unfortunate to have to add a reserved word as part of reducing user-level functionality. Neither reserved word conveys the complete story. The reserved word "bits" implies a certain number of bits for bit-wise operations, but does not convey that the type is signed or unsigned, wraparound or overflowing, nor even that the type is numeric. On the other hand, "unsigned" conveys that the type is unsigned (and presumably numeric), but not that it wraps around rather than overflows. It is also not obvious that the number after "unsigned" is a count of bits -- it might be a count of bytes (similar to the old Fortran "Integer*4"). Finally, there is an unpleasant imbalance between the declaration for overflowing signed types, and that for wraparound unsigned types using the alternative syntaxes. This can best be seen by comparing the three considered syntaxes for wraparound types against the signed integer syntax: type Signed_Byte is range -128 .. 127; type Unsigned_Byte is mod 256; type Signed_Byte is range -128 .. 127; type Unsigned_Byte is bits 8; type Signed_Byte is range -128 .. 127; type Unsigned_Byte is unsigned 8; The signed integer type syntax defines the type by specifying values relevant to the type ('First and 'Last). The current wraparound-type syntax using "mod" remains in the "value" domain, by specifying the value that acts as the modulus for the type. On the other hand, the alternative syntaxes switch into the "bit count" domain, which is distinctly different from all other numeric type declarations, where bit counts appear only in representation clauses. Given the above, we see that there seems to be user value to allowing nonbinary moduli, that the WYSIWYG violations for multiplying operators are in the same order of magnitude as some other numeric types (e.g. fixed point with nonbinary smalls), and that defining a syntax that inherently disallows nonbinary moduli is somewhat awkward. Given this, we recommend that wraparound types follow the lead of fixed point types, where the syntax and the semantics allows generality in the use of the feature, but that implementations may impose different restrictions on the binary vs. nonbinary cases to ease implementations and ensure acceptable efficiency. OVERALL MODEL The model for wraparound types in RM9X;4.0 is based on the concept of an equivalence class. Although this model is mathematically "correct," it seems more obscure than necessary. Furthermore, the preexisting "mod" operator of Ada 83 is defined in terms of returning a particular value, not just any value in an equivalence class. Hence, we propose that the model be changed so that it is deterministic. Wraparound for a wraparound type T would take place only as the final step of a predefined operator, and only if the result of the operator would otherwise be outside the range of 0 .. T'Last (normally T'Modulus-1). Wraparound consists of applying the "mod" operator to the result of the operator, using T'Modulus. [Note that certain predefined operators will never wraparound. In particular the "/" operator, the "not" operator (presuming "not X" is defined as T'Last - X), the "abs" operator, and the "and" operator will not wraparound. The "/" operator will still raise Constraint_Error if the right hand operand is zero.] As mentioned above, we recommend that wraparound *not* be performed as part of conversion. Instead, if the value is outside the range of the target subtype (prior to wraparound), Consraint_Error would be raised, just as for signed integer types. As part of this, there would never be wraparound on *input* to any operation. In particular, membership test (or range check) would not "normalize" its inputs. CONCLUSION Based on various comments and further analysis, we recommend that the model of wraparound types be changed somewhat: a) Conversion should not wraparound, but should raise Constraint_Error instead; b) Subranges should be supported for wraparound types, to support their use as array indices and in "for" loops. Furthermore, with conversion raising Constraint_Error instead of wrapping around, and with no wraparound on a membership test or range check, the semantics of a subrange check should be straightforward to understand. One other required change is for the constraint check on a null string to be based on whether the low bound is T'Base'First, rather than being based on whether the low bound has a "predecessor." c) 1's complement machines should be supported by allowing T'Base'Last to be T'Modulus, rather than T'Modulus-1, in certain circumstances -- in particular, for a full-word wraparound type on a 1's complement machine. This will allow the bit-wise operators to operate as appropriate, and let the 1's complement hardware be used directly. d) We recommend that nonbinary moduli continue to be allowed, with the current syntax. We also recommend that the implementation advice of Integer'Last as the Max_Nonbinary_Modulus be retained, to ensure that implementation remains straightforward. It is natural and useful to allow the generalization to nonbinary moduli, and the programmer is the best one to make the decision whether a binary or nonbinary modulus is appropriate to a given application. e) The overall model for wraparound types should be simplified. Wraparound should be defined as a final step of the various predefined operators (with 'Succ and 'Pred still defined in terms of +1 and -1). The "equivalence class" model (or principal/secondary value model) should be dropped. The term "principal" value itself can be dropped as well.