!topic LSN-1076 on Wraparound (i.e. modular/unsigned) types
!key LSN-1076 on Wraparound (i.e. modular/unsigned) types
!reference RM9X-3.5.4;4.0
!reference RM9X-4.6(28-33);4.0
!reference RM9X-4.2(9);4.0
!reference 93-3322.a Randy Brukardt 93-11-18 modular conversion
!reference 93-3324.a Norman Cohen 93-11-19 modular conversion
!reference 93-3325.a Tucker Taft 93-11-19 modular conversion
!reference 93-3327.a Norman Cohen 93-11-19 modular conversion
!reference 93-3426.a Randy Brukardt 93-12-08 modular not
!reference 93-3428.a Ted Baker 93-12-09 nonbinary modulus
!reference 93-3429.a Tucker Taft 93-12-09 nonbinary modulus
!reference 93-3431.a Tucker Taft 93-12-09 modular not
!reference 93-3450.a Randy Brukardt 93-12-10 modular not
!reference 93-3472.a Philippe Kruchten 93-12-16 nonbinary modulus
!reference 93-3476.a Norman Cohen 93-12-16 nonbinary modulus
!reference 93-3486.a Randy Brukardt 93-12-17 array indexing
!reference 94-3613.a Robert Eachus 94-01-05 nonbinary modulus
!reference 94-3620.a Norman Cohen 94-01-06 nonbinary modulus
!reference 94-3622.a Robert Eachus 94-01-06 nonbinary modulus
!reference 94-3625.a Norman Cohen 94-01-07 nonbinary modulus
!reference 94-3629.a Robert Eachus 94-01-07 nonbinary modulus
!reference 94-3630.a Randy Brukardt 94-01-07 nonbinary modulus
!reference 94-3661.d Swiss Delegation 94-01-13 subranges
!from Tucker Taft $Date: 94/02/14 18:18:37 $ $Revision: 1.3 $
!discussion
As illustrated by the set of references of this LSN, there have been
a number of comments relating to the rules for "modular" types,
including suggestions for changing their syntax and name. To
avoid prejudicing the discussion, I will refer in this LSN to
types with "wraparound" semantics as "wraparound" types.
This LSN covers the following issues:
a) conversion to/from other numeric types; static/dynamic
b) array indexing and subranges
c) 1's complement issues
d) nonbinary moduli
i) implementability, etc.
ii) definition of "not", "and", "or", "xor"
iii) usefulness/elegance/etc.
e) overall model (modular congruence vs. wraparound operators)
For those who like to peek ahead to the conclusion, this LSN
makes the following recommendations:
a) conversion to and/or from wraparound types should raise
Constraint_Error if the value, prior to any "wrapping around,"
is outside the range of the target subtype; that is, no
wraparound is performed as part of conversion.
b) the rules for null string literals should be altered to
specify that Constraint_Error is raised if the low bound
equals the low bound of the base range of the index type,
rather than talking about the "predecessor."
Subranges should still be supported, to allow wraparound
subtypes to be used for indexing into arrays.
c) 1's complement machines should be accommodated by giving
explicit permission to have a wraparound type where its
arithmetic operators wrap around at 2**N-1, even though the
type can represent 2**N distinct bitpatterns.
d) nonbinary moduli should be supported, retaining the current
advice of Integer'Last as the Max_Nonbinary_Modulus; as pointed
out in various messages, this is a useful generalization of
unsigned, and is consistent with Ada's other value-oriented
(rather than implementation-oriented) numeric models.
e) The model should be simplified, by expressing it in terms
of wraparound rules for specific arithmetic operations,
rather than in terms of equivalence classes based on
modular congruence; that is, the term "principal" value
would be dropped -- there would be no "secondary" values.
---------------------------------
CONVERSION RULES
In RM9X-4.6(28-30), wraparound types are handled specially with
respect to conversion. Wraparound is performed both on conversion
to a wraparound type, and conversion from a wraparound type to
a signed integer type. Note that wraparound is not performed
on conversion to or from a floating point type (RM9X-4.6(32-33),
nor on implicit conversion from a static universal integer (since
RM9X-4.6(36) says it is illegal to be outside the base range of
the expected type).
Several comments have indicated discomfort with the rule of wrapping
around on conversion from a wraparound type to a signed type, particularly
because whether to wraparound is determined by the target *sub*type,
rather than the target type. The rationale for the current rule
is given in 93-3325.a and also in AARM-4.6(30.a-30.c);4.0.
Given the discomfort with the current rule, it seems reasonable
to consider changing the rule to require Constraint_Error when
going from wraparound to signed if the value (prior to any adjustment)
is outside the target subtype. However, it seems inconsistent
(to the MRT) to happily convert -32768 to +32768 when going
from signed to wraparound (mod 2**16), and then
not convert back to -32768 when going the other
way if the target signed subtype is only 16 bits.
Hence, we would recommend that if Constraint_Error is expected
going one way, it should also be raised going the other way.
Furthermore, in the December DR/XRG meeting, the minutes
report that "the general model of type conversion has always been that
the converted value was the same as the resulting value after the
expected necessary rounding/sliding." This "general model" would
seem to apply to both directions, if it applies to either direction
of conversion between wraparound and signed.
We therefore recommend making the rule symmetrical, namely Constraint_Error
is raised on conversion to and/or from a wraparound type based on whether
the value being converted is in the range of the target subtype.
No adjustment (other than rounding if converting from real) is performed
prior to the check. This preserves the nice property that on
conversion of an IN OUT parameter, if Constraint_Error is not
raised going "in" to the procedure, and the procedure doesn't
update the parameter, one can reasonably expect no Constraint_Error
on the way out (real<->integer conversion of an IN OUT parameter
could still break this rule in some weird edge cases due to rounding).
Unchecked conversion, or the explicit use of "mod" would be
necessary to do a "conversion" that does wraparound.
ARRAY INDEXING AND SUBRANGES
Randy Brukardt in 93-3486.a points out a couple of problems
relating to indexing arrays with wraparound types:
i) null slices where the low bound = the Wraparound_Type'First.
ii) null string literals where the low bound = Wraparound_Type'First.
As an example of (i):
type Nibble is .
type Nibble_String is array(Nibble range <>) of Character;
Nibble_Names : Nibble_String(Nibble) :=
('0', '1', '2', '3', '4', '5', '6', '7',
'8', '9', 'A', 'B', 'C', 'D', 'E', 'F');
X_Len : constant := ... complicated expression ...;
X : constant Nibble_String :=
Nibble_Names(Nibble'First .. Nibble'First + X_Len - 1);
-- ***** If X_Len = 0, then X'Last is (-1) mod 2**4 = 15
-- ***** and hence X'Length = 16, not 0!
As an example of (ii):
if X = "" then ...
-- **** What is high bound of null string literal? -1 = 15 mod 16?
Problem (ii) is relatively easily "fixed" by saying that the evaluation
of a null string literal raises Constraint_Error if the low bound
of the literal is equal to the low bound of the base range of the
index type. Currently the rule in 4.2(10) says that a check is made
that the low bound "has a predecessor in the base range of the index type."
This is easily changed to say a check is made that the low bound is
not equal to the low bound of the base range of the index type.
Of course, the user will now get a Constraint_Error (and hopefully
a warning at compile-time), rather than a null string literal.
Problem (i) is not easily fixed, and we don't propose to do so.
Perhaps a NOTE in the RM warning of this problem would be useful.
There are actually two possible outcomes. If the bounds of the array
go from the low bound to the high bound of the index type (as in
this example), then no Constraint_Error is raised, but the slice
results in the whole array, rather than a null slice.
The other possible outcome is when the index subtype does not cover
the whole base range of the index type, in which case Constraint_Error
would be raised since index_subtype'First-1 would not be in
index_subtype'Range, even if it wraps around.
Since null slices are relatively rare, and arrays whose index covers
the full base range are also rare, we expect a null slice of such
an array to be exceedingly rare. Conceivably a very friendly compiler
could produce a warning when something like this might occur.
In any case, we don't see any viable solution to problem (i), so
we have to hope it will be rare!
The Swiss delegation wondered whether range constraints on wraparound
types were sufficiently useful to allow them. However, we certainly
expect users to use wraparound types for array indexing and for-loops.
Both of these contexts presume the use of range constraints to define
the index subtype of the array (or a slice thereof), and the
range of the for-loop. Disallowing named subtypes of wraparound types
is possible, but that would preclude the use of named subtypes to define
the index subtype of an array indexed by a wraparound type.
Hence, we recommend that range constraints be allowed for wraparound types.
Our expectation is that by simplifying the conversion rules (see above) and
the general "wraparound" model (see below), the underlying desire for
simplification will be achieved.
ONE'S COMPLEMENT MACHINES
[Note -- this discussion is largely academic. Many readers should
just skip down to "NONBINARY MODULI" at this point.]
Although one's complement machines are becoming rarer and rarer,
it seems worth spending a little time examining how they relate
to wraparound types. One goal of providing wraparound types
is to be able to match the "typical" hardware support for "unsigned"
types. For 2's complement machines, the "unsigned" operations generally
correspond to wraparound versions of the "signed" operations, with the
wraparound at 2**N, where N is the number of bits in the word.
For 1's complement machines, the arithmetic operations actually wraparound
modulo (2**N)-1, where N is the number of bits (N=36 for the Unisys
1100/2200 line of machines). Nevertheless, there are still 2**N different
bit patterns -- both all-bits-on and all-bits-off correspond arithmetically
to zero. Given that we have also provided bit-wise "and", "or", "xor",
and "not" for wraparound types, 1's complement machines create a quandary.
Clearly all-bits-on and all-bits-off behave completely differently with
respect to the bit-wise logical operators, whereas they are pretty much
indistinguishable with respect to the arithmetic operators.
One way to look at the issue is in terms of T'Last vs. T'Modulus for
a wraparound type. Normally, T'Last = T'Modulus-1. However, for
a 1's complement machine, the "full word" wraparound type would
have T'Last = T'Modulus = 2**N - 1. As you might expect, when
used with an arithmetic operator, T'Last for such a type behaves much
like zero because it is equal to T'Modulus.
If we were to support a wraparound type where T'Last >= T'Modulus, we
have to decide what happens with the various categories of operations:
a) arithmetic (+, -, *, /)
b) relational (both ordering and equality)
c) bit-wise logical.
d) membership in a range
Clearly we want the bit-wise logical operators to operate on the bits,
and not wraparound. Clearly we want the arithmetic operators to
wraparound in some way, though it is not obvious whether wraparound
should happen when a result is produced (meaning that the result of
an arithmetic operator is always in 0..T'Modulus-1), or instead
wraparound happens only as necessary to keep the result in 0..T'Last.
Unfortunately, one's complement hardware differs to some degree
in when "minus zero"s can be produced. This implies that we
might do best to leave all of this implementation-defined. However,
if we do feel the need to specify a rule, the most consistent
rule seems to be that the only time the modulus matters
is when the high bound (T'Last) is overflowed by an arithmetic
operator (or 'Succ), at which point the "wraparound" occurs
by subtracting the modulus. Alternatively, if subtraction (or 'Pred)
underflows, then the modulus is added in to produce the
wraparound. This implies that on addition, positive zero is
never produced (except for +zero + +zero), or on subtraction,
negative zero is never produced (except for -zero - +zero).
Unary minus, as in "-X", would be considered equivalent to -zero - X,
and so "- +zero" would produce -zero, whereas "- -zero" would produce +zero.
Multiplication would presumably follow the rules for addition, and
only produce +zero as a result when one of the operands is +zero.
Note that in the above, although we are referring to all-ones as "-zero",
we are presuming that for such an unsigned type, the relational operators
would treat "-zero" as the largest value of the unsigned type (i.e.
T'Last), and hence greater than all other values. It just happens
that due to wraparound, adding T'Last is generally equivalent to
adding zero, and subtracting T'Last is generally equivalent to subtracting
zero. But that doesn't change the fact that T'Last compares greater
than all other values.
To summarize, if we want to accommodate full-word 1's complement
unsigned types, then we have a situation where T'Last equals T'Modulus,
instead of equaling T'Modulus-1. The most consistent rule seems to
be to ignore the wraparound until the normal (non-wraparound) arithmetic
result would exceed T'Last, or be less than zero, at which point the
result would be canonicalized by adding or subtracting T'Modulus
to bring the value back into range. We could instead leave the rule for
"canonicalizing" the result of arithmetic operators implementation-defined.
In any case, it seems that the (unsigned) relational, logical, and membership
operations should perform no wraparound, and treat T'Last (all ones)
as the largest value of the type (and hence *not* equal to +zero).
NONBINARY MODULI
Although 1's complement machines introduce the possibility of nonbinary
moduli for fullword wraparound types, we consider 1's complement
machines sufficiently rare to ignore for the purposes of deciding whether
it is appropriate to provide general support for nonbinary moduli.
Any proposal for wraparound types will probably need some special case
to deal with the T'Last >= T'Modulus case introduced by 1's complement
machines, so there seems no point in using support for 1's complement
machines as any sort of evaluation criteria for a given wraparound-type
proposal. Hence, in the following discussion, we will make no further
mention of 1's complement machines, with the faith that if it is deemed
necessary to support them, they can be accommodated in any proposal with
an appropriate set of "Implementation Permissions."
There seem to be various issues associated with whether
Ada 9X allows nonbinary moduli for wraparound types:
i) Implementation efficiency and the "WYSIWYG" principle;
ii) Definition of bit-wise logical operators;
iii) Usefulness, elegance, alternative syntaxes, etc.
Implementation Issues, WYSIWYG, etc.
-----------------------------------
The WYSIWYG (what-you-see-is-what-you-get) principle as it relates
to implementation efficiency is that a language designer should not
bury an inherently expensive operation inside a language operation
that "looks" inexpensive. There is some concern that the arithmetic
operations of a wraparound type will violate this WYSIWYG principle
if the modulus of the type is not a power of 2 (what we will call a
"nonbinary modulus"). However, as attested by certain implementors
(93-3472.a and 94-3630.a), addition and subtraction do not really suffer
from this problem, since the necessary wraparound can be performed
by a conditional subtraction or addition of the modulus, rather than
a full divide-with-remainder operation. This leaves multiplication,
division, and exponentiation. Division imposes no overhead, since the
result is guaranteed not to wraparound. Exponentiation is generally
implemented as a series of squarings and multiplications, so this
reduces to multiplication.
So the fundamental WYSIWYG issue is that multiplication
of a wraparound type with a nonbinary moduli may involve
both a multiplication and a division (though probably some clever
algorithm exists that uses multiplication by a precomputed inverse followed
by a fixup instead of division). However, this is exactly analogous
to the situation with fixed-point types, where nonbinary "smalls" may
result in multiplication requiring a division, whereas binary "smalls"
requires only multiplication and shifts. Note also that exponentiation
for floating point involves a final divide if the exponent is negative.
Hence, there is precedent for the "*" (and the "**") operator already
including a final division as part of their dynamic semantics, depending
on the characteristics of the type (nonbinary smalls) or the values
of the operands (negative exponent).
Furthermore, note that this must be traded off against an alternative
where the programmer is forced to put in explicit "mod"s, which might
result in poorer code quality for the presumably more common addition
and subtraction operations, since the ranges of the operands are no
longer guaranteed to be 0..modulus-1.
Finally, we must recognize that the "mod" operator already has significantly
different performance characteristics, depending on the value of the
right operand. If the right operand is a power of 2, then the mod
operation reduces to a simple "and". If the right operand is not a power
of 2, then "mod" generally requires a divide-with-remainder, and possibly
a fixup as well to deal properly with signs (this latter fixup
is not an issue for unsigned types).
Hence, we conclude that there is a bit of a WYSIWYG issue w.r.t. the
"*" operator for nonbinary-modulus wraparound types, but that it is
consistent with the level of variability in efficiency already associated
with "*" (and other multiplying operators) for existing types.
Bit-Wise Logical Operators
--------------------------
The presumed purposes for using bit-wise logical operators
with wraparound types that have a nonbinary modulus include
the following:
a) "and" -- conditional subtract (i.e. clear a bit if set)
b) "or" -- conditional add (i.e. set a bit if clear)
c) "xor" -- performing some kind of hash
d) "not" -- "modular" complement
Given these presumptions, it seems appropriate that these operations
wraparound if the result is greater than T'Last (or in the case
of "not", that the result be defined by "not X = T'Last - X").
[Note: By specifying T'Last rather than T'Modulus-1 as the overflow
point for wraparound, and in the definition of "not", we accommodate
the 1's complement case discussed in the previous section where
T'Last = T'Modulus rather than T'Modulus-1, and ensure that
"not" does a bit-wise complement for a fullword 1's complement
wraparound type.]
Usefulness, Elegance, Alternative Syntaxes, Etc...
---------------------------
Some of the uses of non-binary modular types are discussed
in 93-3429.a and 94-3613.a.
LSN-1055 illustrates a usage in a subtract-with-borrow
Fibonacci random number generator. In this example,
the algorithm uses a cyclic array of floating point
values, and the indices into the array are of a wraparound type.
Here, the choice of generator determines the modulus for the
wraparound, and there is no particular expectation that
it would be a power of 2 (typical values for the modulus
are 25, 28, etc.). In this example, the wraparound indices
are simply decremented, so there is no need for a divide-with-remainder
at run-time, only a conditional add.
One of the features of Ada numerics is that the characteristics
of the type are determined by the requirements of the particular
algorithm. In this sense, it seems natural to accommodate
non-binary moduli when that is what the algorithm implies.
When considering imposing a restriction against nonbinary moduli,
we could either leave the syntax as is:
type T is mod ;
and simply impose an "arbitrary" restriction that the value
of the be a power of 2. Currently, we allow an implementation
to impose a different upper bound on nonbinary moduli and binary moduli,
so this is equivalent to requiring the upper bound on nonbinary
moduli to be zero for all implementations.
It is a bit uncomfortable introducing a feature using a preexisting
reserved word such as "mod" that is general, in that it supports
any integer value as a modulus in a "mod" operation, and then imposing
what appears to be an arbitrary restriction that in the new
usage context, the modulus must be a power of 2. Hence, we have
considered alternative syntaxes that do not have the same appearance
of imposing an arbitrary restriction:
type T is bits ;
or
type T is unsigned ;
Both of these syntaxes would define an unsigned, wraparound type,
with range 0..2**-1. Except in certain full-word
1's complement situations, the modulus would be 2**.
[For full-word 1's complement situations (that is when the
is, say, 36 on a 1's complement, 36-bit word machine), presumably
the modulus would be, say, 2**36-1, even though all-bits-on would be
producable as the result of a predefined operator.]
There are a few uncomfortable aspects to these alternative syntaxes.
Both imply the addition of a new reserved word (either "bits" or
"unsigned"). It seems unfortunate to have to add a reserved word as part
of reducing user-level functionality. Neither reserved word conveys
the complete story. The reserved word "bits" implies a certain number of bits
for bit-wise operations, but does not convey that the type is
signed or unsigned, wraparound or overflowing, nor even that the
type is numeric. On the other hand, "unsigned" conveys that the
type is unsigned (and presumably numeric), but not that it wraps
around rather than overflows. It is also not obvious that
the number after "unsigned" is a count of bits -- it might be
a count of bytes (similar to the old Fortran "Integer*4").
Finally, there is an unpleasant imbalance between the declaration
for overflowing signed types, and that for wraparound unsigned types using
the alternative syntaxes. This can best be seen by comparing the
three considered syntaxes for wraparound types against the
signed integer syntax:
type Signed_Byte is range -128 .. 127; type Unsigned_Byte is mod 256;
type Signed_Byte is range -128 .. 127; type Unsigned_Byte is bits 8;
type Signed_Byte is range -128 .. 127; type Unsigned_Byte is unsigned 8;
The signed integer type syntax defines the type by specifying values
relevant to the type ('First and 'Last).
The current wraparound-type syntax using "mod" remains in
the "value" domain, by specifying the value that acts as the
modulus for the type. On the other hand, the alternative
syntaxes switch into the "bit count" domain, which
is distinctly different from all other numeric type declarations,
where bit counts appear only in representation clauses.
Given the above, we see that there seems to be user value to allowing
nonbinary moduli, that the WYSIWYG violations for multiplying operators
are in the same order of magnitude as some other numeric types
(e.g. fixed point with nonbinary smalls), and that defining a syntax
that inherently disallows nonbinary moduli is somewhat awkward.
Given this, we recommend that wraparound types follow the lead
of fixed point types, where the syntax and the semantics allows
generality in the use of the feature, but that implementations
may impose different restrictions on the binary vs. nonbinary
cases to ease implementations and ensure acceptable efficiency.
OVERALL MODEL
The model for wraparound types in RM9X;4.0 is based on the
concept of an equivalence class. Although this model is
mathematically "correct," it seems more obscure than necessary.
Furthermore, the preexisting "mod" operator of Ada 83 is
defined in terms of returning a particular value, not just
any value in an equivalence class.
Hence, we propose that the model be changed so that it is
deterministic. Wraparound for a wraparound type T would take place only as
the final step of a predefined operator, and only if the result of the
operator would otherwise be outside the range of 0 .. T'Last
(normally T'Modulus-1). Wraparound consists of applying the "mod"
operator to the result of the operator, using T'Modulus.
[Note that certain predefined operators will never
wraparound. In particular the "/" operator, the "not" operator
(presuming "not X" is defined as T'Last - X), the "abs" operator,
and the "and" operator will not wraparound. The "/" operator will
still raise Constraint_Error if the right hand operand is zero.]
As mentioned above, we recommend that wraparound *not* be performed
as part of conversion. Instead, if the value is outside the range
of the target subtype (prior to wraparound), Consraint_Error would be
raised, just as for signed integer types. As part of this, there
would never be wraparound on *input* to any operation. In particular,
membership test (or range check) would not "normalize" its inputs.
CONCLUSION
Based on various comments and further analysis, we recommend
that the model of wraparound types be changed somewhat:
a) Conversion should not wraparound, but should raise Constraint_Error
instead;
b) Subranges should be supported for wraparound types, to support their
use as array indices and in "for" loops. Furthermore, with conversion
raising Constraint_Error instead of wrapping around, and with
no wraparound on a membership test or range check, the semantics of a
subrange check should be straightforward to understand.
One other required change is for the constraint check
on a null string to be based on whether the low bound is
T'Base'First, rather than being based on whether the low bound
has a "predecessor."
c) 1's complement machines should be supported by allowing T'Base'Last
to be T'Modulus, rather than T'Modulus-1, in certain circumstances --
in particular, for a full-word wraparound type on a 1's complement
machine. This will allow the bit-wise operators to operate as
appropriate, and let the 1's complement hardware be used directly.
d) We recommend that nonbinary moduli continue to be allowed, with
the current syntax. We also recommend that the implementation
advice of Integer'Last as the Max_Nonbinary_Modulus be retained,
to ensure that implementation remains straightforward.
It is natural and useful to allow the generalization to nonbinary moduli,
and the programmer is the best one to make the decision whether
a binary or nonbinary modulus is appropriate to a given application.
e) The overall model for wraparound types should be simplified.
Wraparound should be defined as a final step of the various predefined
operators (with 'Succ and 'Pred still defined in terms of +1 and -1).
The "equivalence class" model (or principal/secondary value model)
should be dropped. The term "principal" value itself can be dropped
as well.