Accuracy of type conversions from real types to integers AI-00601/00 1
88-11-08 BI RE
!standard 04.06 (07) 88-11-08 AI-00601/00
!class binding interpretation 88-11-08
!status received 88-11-08
!topic Accuracy of type conversions from real types to integers
!summary 88-11-08
!question 88-11-08
!recommendation 88-11-08
!discussion 88-11-08
!appendix 88-10-26
*****************************************************************************
!section 04.06 (07) Daniel Stock/R.R. Software 88-10-26 83-01029
!version 1983
!topic Accuracy of type conversions from real types to integers
Section 4.6(7) of the LRM says that "for conversions involving real types,
the result is within the accuracy of the specified subtype (see 4.5.7). The
conversion of a real value to an integer type rounds to the nearest integer;
if the operand is halfway between two integers (within the accuracy of the
real subtype) rounding may be up or down."
It is not clear to me what "the accuracy of the real subtype" means when
converting from a real type to an integer type, especially when converting
from a fixed-point type that has a small value that is not a power of two.
Section 4.5.7 does not really help: paragraphs 1 and 2 just define terms;
paragraphs 3 to 9 are limited to when the result of an operation has a
real subtype; and paragraphs 10 to 11 (and the AI's thereon) are limited to
relations and membership tests.
In particular, when performing a type conversion to an integer type from a
value that is a model number of a fixed point type, must the conversion
yield the nearest integer? One might think so, by analogy to the way that
real types are treated. But consider the following example.
NASTY: constant := 0.316666...66667; -- add as many sixes as you like
MAX_MULT: constant := 2 ** SYSTEM.MAX_MANTISSA - 1;
type FIX_TYPE is delta NASTY range -MAX_MULT*NASTY..MAX_MULT*NASTY;
for FIX_TYPE'SMALL use NASTY;
SAMPLE: INTEGER;
function IDENT_FIX (ITEM : FIX_TYPE) return FIX_TYPE; -- unoptimizable
-- function that always returns its argument
...
SAMPLE := INTEGER (IDENT_FIX (30 * FIX_TYPE'(FIX_TYPE'SMALL)));
-- Must this be 10?
-- It is the conversion of a model number of a fixed point type
-- to an integer, where the exact multiplication yields
-- 9.50000...00001, with as many zeroes as there were sixes in
-- the declaration of NASTY.
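The arithmetic behind this example can be checked with exact rationals. The
following is only an illustrative sketch in Python, not part of the original
comment: the helper names (nasty, excess_over_half, round_exact) and the
particular counts of sixes are assumptions chosen for the demonstration. It
shows that 30 * NASTY exceeds 9.5 by exactly one unit in the last decimal
place of the product, and that a 64-bit floating point product loses that
excess once there are enough sixes.

```python
from fractions import Fraction
import math

def nasty(sixes: int) -> Fraction:
    """Exact value of the literal 0.31 66...6 7 with the given number of sixes."""
    return Fraction("0.31" + "6" * sixes + "7")

def excess_over_half(sixes: int) -> Fraction:
    """Exact amount by which 30 * NASTY exceeds 9.5."""
    return 30 * nasty(sixes) - Fraction(19, 2)

def round_exact(x: Fraction) -> int:
    """Round to the nearest integer using exact rational arithmetic."""
    return math.floor(x + Fraction(1, 2))

# With four sixes the product is 9.500001, so exact rounding gives 10.
print(excess_over_half(4))           # 1/1000000
print(round_exact(30 * nasty(4)))    # 10

# With forty sixes the excess is 10**-42; the 64-bit float product is exactly 9.5,
# so finite-precision arithmetic sees a value halfway between 9 and 10.
print(30 * float(nasty(40)) == 9.5)  # True
```

Each additional six shrinks the excess by a factor of ten, so no fixed
working precision can distinguish the product from 9.5 for every legal
declaration of NASTY.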
If SAMPLE must be assigned the value ten, then it would appear that
arbitrary precision is needed at run time to perform such a type conversion,
which is absurd. (There are other equally absurd alternatives, such as
rejecting vast numbers of length clauses to avoid the problem or performing
an analysis of the continued fractions of any small values to attempt to
anticipate the problem.) To me, this means that the ARG should specify that
type conversion of model numbers of fixed point types need not yield the
nearest integer. A reasonable interpretation might be that for a conversion
of a value within a model interval I (possibly a single model number) to an
integer type, the result must be within
(0.5 + 0.5 * T'Small)
of some value in I. A more stringent rule might apply to the less
interesting case of conversions of floating point numbers to integers, since
it is generally easy for most machines to do the right thing in that case.
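As a rough illustration of the suggested rule, the sketch below lists the
integers that would be acceptable results for a given model interval. It is
written in Python with exact rationals; the function name acceptable_results
is hypothetical, "within" is assumed to include the boundary, and the small
value 0.3166667 is just one concrete instance of NASTY.

```python
from fractions import Fraction
import math

def acceptable_results(lo: Fraction, hi: Fraction, small: Fraction) -> list:
    """Integers lying within 0.5 + 0.5 * small of some value in [lo, hi]."""
    tol = Fraction(1, 2) + small / 2
    # An integer r qualifies exactly when lo - tol <= r <= hi + tol.
    first = math.ceil(lo - tol)
    last = math.floor(hi + tol)
    return list(range(first, last + 1))

small = Fraction("0.3166667")   # assumed instance of NASTY (four sixes)
value = 30 * small              # the model number 9.500001

# The model interval is the single model number, so lo == hi.
print(acceptable_results(value, value, small))  # [9, 10]
```

Under this reading, either 9 or 10 would be an acceptable value for SAMPLE,
so the conversion would not need to resolve the tiny excess over 9.5.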
*****************************************************************************
!section 04.06 (07) Daniel Stock/R.R. Software 88-10-26 83-01030
!version 1983
!topic Accuracy of conversions from fixed point types to floating point types
The required accuracy of type conversions from fixed point types to floating
point types can lead to surprising results. Consider the following example:
DIVISOR : constant := 7;
STEP : constant := 1.0 / DIVISOR;
MAX_MULT : constant := 2 ** SYSTEM.MAX_MANTISSA - 1;
type FIXED_TYPE is delta STEP range -MAX_MULT*STEP..MAX_MULT*STEP;
for FIXED_TYPE'SMALL use STEP;
EXAMPLE: FIXED_TYPE;
function IDENT_FIXED (ITEM : FIXED_TYPE) return FIXED_TYPE;
-- unoptimizable function that always returns its argument
...
EXAMPLE := IDENT_FIXED (FIXED_TYPE'(FIXED_TYPE'SMALL));
if 0.0 /= FLOAT'SAFE_LARGE * (1.0 - FLOAT (DIVISOR * EXAMPLE)) then
FAIL; -- Can this procedure be called? Apparently not.
end if;
In this example, DIVISOR * EXAMPLE is a model number of type FIXED_TYPE,
with the value exactly 1.0. Hence, upon conversion to type FLOAT, it
must retain the exact value 1.0, so that the procedure FAIL cannot be called.
(The multiplication by FLOAT'SAFE_LARGE is just to prevent an implementation
from passing this test by having a "fuzzy" version of equality, which
appears to be illegal by AI-00174/05 anyway.)
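The reasoning can be sketched with exact rationals and ordinary 64-bit
floats. This Python fragment is only an illustration: LARGE is a hypothetical
stand-in for FLOAT'SAFE_LARGE, and residue is a made-up name for the
left-hand side of the test. It shows that the fixed point product is exactly
1, and that a conversion result off by even one unit in the last place would
be amplified into a large nonzero value.

```python
from fractions import Fraction
import sys

DIVISOR = 7
STEP = Fraction(1, DIVISOR)   # FIXED_TYPE'SMALL as an exact rational

# EXAMPLE holds one multiple of STEP; DIVISOR * EXAMPLE is exactly 1.
product = DIVISOR * (1 * STEP)

LARGE = 2.0 ** 100            # hypothetical stand-in for FLOAT'SAFE_LARGE

def residue(converted: float) -> float:
    """Left-hand side of the test: LARGE * (1.0 - converted)."""
    return LARGE * (1.0 - converted)

# The largest 64-bit float below 1.0, standing in for an inexact conversion.
just_below_one = 1.0 - sys.float_info.epsilon / 2

print(product == 1)                          # True
print(residue(float(product)) == 0.0)        # True: exact conversion passes
print(residue(just_below_one) == 2.0 ** 47)  # True: a one-ulp error is amplified
```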
At first blush, it might appear that this example also requires arbitrary
precision at run time, since one must essentially get 1.0 from multiplying
1.0/7.0 by 7. Fortunately, arbitrary precision is not needed: one needs
only a few extra bits of precision (which many machines have) when doing the
conversion from a fixed point type to a floating point type, together with a
routine that "fuzzes" the result to the nearest safe number of the floating
point type. The in-house version of the JANUS/Ada compiler does this, and
passes a battery of tests similar to this one. This seemed like a
reasonable approach when we were implementing small values that are not
powers of two. But is this the intent of the LRM? If so, I would like to
see the ARG confirm it (I also wonder how many validated compilers would pass
a test like this). It seems rather strange, in that potentially significant
bits of accuracy in the machine must be explicitly thrown away to satisfy
the numeric requirements of the language.
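One possible reading of the approach described above is sketched below in
Python. This is not the JANUS/Ada implementation; it is a simplified model
under stated assumptions: the small value is held with a few guard bits
beyond the floating point mantissa, the result is then rounded ("fuzzed") to
the target mantissa length, and exponent range is ignored. The names
round_to_bits and convert, the 21-bit mantissa, and the 4 guard bits are all
assumptions made for the illustration.

```python
from fractions import Fraction

def round_to_bits(x: Fraction, bits: int) -> Fraction:
    """Round x to the nearest value with `bits` significant binary digits."""
    if x == 0:
        return Fraction(0)
    sign = -1 if x < 0 else 1
    scaled = abs(x)
    exponent = 0
    lo, hi = Fraction(2) ** (bits - 1), Fraction(2) ** bits
    # Scale so that 2**(bits-1) <= scaled < 2**bits.
    while scaled >= hi:
        scaled /= 2
        exponent += 1
    while scaled < lo:
        scaled *= 2
        exponent -= 1
    mantissa = round(scaled)  # nearest integer, ties to even
    return sign * mantissa * Fraction(2) ** exponent

def convert(multiple: int, small: Fraction, bits: int = 21, guard: int = 4):
    """Return (raw, fuzzed) results of converting multiple * small."""
    small_approx = round_to_bits(small, bits + guard)  # small with guard bits
    raw = multiple * small_approx                      # extra-precision product
    return raw, round_to_bits(raw, bits)               # fuzz to target mantissa

raw, fuzzed = convert(7, Fraction(1, 7))
print(raw == 1)     # False: the raw product is 1 - 2**-27
print(fuzzed == 1)  # True: rounding to 21 bits recovers exactly 1
```

In this model the extra-precision product is slightly less than 1, and it is
the final rounding that discards the low-order bits and yields the exact
value 1.0.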