!topic LSN-1055 on Evolution of the Random Number Generator
!key LSN-1055 on Evolution of the Random Number Generator
!reference RM9X-C.5.2;4.0
!from Ken Dritz $Date: 94/01/31 12:14:43 $ $Revision: 1.4 $
!discussion
I try in this LSN to reflect the current state of the discussions on the
random number generator. I summarize in section 1 the current
specification, accounting for the evolutionary improvements that seem to
have been agreed to (or at least suggested and then not refuted by
anyone, including me). In section 2, I show two example
implementations. Then I discuss in section 3 the possible alternatives
to the specification in section 1, regarding the range of Random and the
presence or absence of an integer generator. Finally, I present, in
section 4, a test program for the random number generator, embodying the
craps tests.
1. Evolution of the Packaging of RNG Functionality
========= == === ========= == === =============
In RM9X;4.0, the declaration of Ada.Numerics.Random_Numbers was given as
follows:
package Ada.Numerics.Random_Numbers is
type Generator is limited private;
Number_Of_Seed_Components : constant := ;
type Seed is array (1 .. Number_Of_Seed_Components) of Integer;
Seed_Error : exception;
subtype Uniformly_Distributed is Float range 0.0 .. 1.0;
function Random (Gen : Generator) return Uniformly_Distributed;
procedure Get_Seed (Gen : in Generator;
Value : out Seed);
procedure Set_Seed (Gen : in Generator;
Value : in Seed);
procedure Reset (Gen : in Generator;
Initiator : in Integer);
procedure Reset (Gen : in Generator);
private
-- implementation defined
end Ada.Numerics.Random_Numbers;
This package declaration was, of course, supplemented by various semantic
rules, one of which was that Random "never delivers a result of 1.0."
The following evolutionary improvements in the packaging of this package's
functionality have been discussed, and I plan to make them:
1. The Seed type has become a private type and is now named State; the
declaration of Number_Of_Seed_Components is, of course, omitted
(93-3331.a). Furthermore, the State type is defined in a child package,
called Ada.Numerics.Random_Numbers.Extensions (93-3331.a, 93-3343.a),
which contains features relevant to advanced applications of the RNG
(i.e., those with requirements that go beyond what Fortran supports).
The child package is defined in the Numerics Annex (93-3334.a).
2. The names of the procedures Get_Seed and Set_Seed have been changed to
Save and (another overloading of) Reset (93-3331.a, 93-3332.a,
93-3334.a), and the procedures have been moved to the child package
(93-3343.a). Reset is the inverse of Save (93-3310,a). These
procedures serve only to save (i.e., checkpoint) the current generator
state and to reset a generator to a previously saved state, or to a
state constructed by some other means (and which is checked for validity
when it is constructed). Thus, the new overloading of Reset does not
have to validate its state parameter. The declaration of Seed_Error is,
of course, also omitted.
3. In addition to State, Save, and the new overloading of Reset, the child
package must define at least the following entities:
a. An Image function that maps each value of type State to a unique
(but implementation-defined) value of type String (93-3331.a).
b. A named number, called Max_Image_Width, that bounds the length of
the string returned by any invocation of Image.
c. A Value function such that Value(Image(A_State)) = A_State and Value
of any string that is not the image of some generator state (after
removing leading and trailing blanks) raises Constraint_Error.
(93-3331.a suggested that a unique exception defined in the child
package, analogous to Seed_Error, be raised. 93-3343.a gave that
exception the name State_Error, but it also suggested that raising
Numerics.Argument_Error might be appropriate. I now think that
raising Constraint_Error is the most appropriate.)
The child package may define other (implementation-defined) entities,
such as a procedure to change the alterable parameters (if any) of a
generator or an alternative to Value that might be more convenient to
use for the purpose of constructing an arbitrary, validated,
user-specified state (93-3331.a). (The primary purpose of Value is to
allow one to enter data denoting a state, perhaps interactively, after
having made a written note of such data, for example during interactive
debugging, when it might have been obtained using the Image function.)
I present below a rewrite of subclause C.5.2 that incorporates the foregoing
changes and that takes a default position with respect to the range of Random
and the presence or absence of an integer generator. (Alternatives that might
be selected instead are discussed in section 3.) The default position--i.e.
my current recommendation--is that an integer generator SHOULD be included, but
that the range of Random (the floating point generator) should NOT be reduced
to less than the full range of its return subtype. Thus, Random can
conceivably return 0.0 (this does not represent a change) or 1.0 (this does),
as well as a value different from 1.0 but sufficiently close to it to behave
like 1.0 in some contexts. The integer generator is called Random_Integer; it
is not generic, but takes two integer parameters, Low and High (with High >=
Low), and returns a value of type Integer in the range Low .. High. It takes a
parameter of the same generator type as Random. The reasons for these
recommendations are given in section 3.
C.5.2 Random Number Generation
Basic facilities for the generation of pseudo-random numbers are provided by
the package Numerics.Random_Numbers. These facilities include a limited
private type each of whose objects serves as the generator of a (possibly
distinct) sequence of random numbers; a pair of functions to obtain the
``next'' random number from a given sequence of random numbers (that is,
from its generator)--one of them returning a floating point number in the
range 0.0 .. 1.0 and the other returning an integer in a range specified by
the user; and procedures to initialize or reinitialize a given generator.
Additional facilities for advanced applications are provided, in
implementations supporting the Numerics Annex, by a child package
(see K.?.?).
Static Semantics
The library package Numerics.Random_Numbers has the following declaration:
package Ada.Numerics.Random_Numbers is
type Generator is limited private;
subtype Uniformly_Distributed is Float range 0.0 .. 1.0;
function Random (Gen : Generator) return Uniformly_Distributed;
function Random_Integer (Gen : Generator;
Low, High : Integer) return Integer;
procedure Reset (Gen : in Generator;
Initiator : in Integer);
procedure Reset (Gen : in Generator);
private
-- implementation defined
end Ada.Numerics.Random_Numbers;
An object of the limited private type Generator is associated with a
sequence of random numbers. Each generator has a hidden state, which it
uses to determine the current position in its associated sequence. All
generators are implicitly initialized to the same implementation-defined
state; they may also be explicitly initialized, or reinitialized, to a
time-dependent state or to a state uniquely denoted by an integer value.
The operations on generators affect the state and therefore the future
values of the associated sequence. The semantics of these operations are
defined below.
function Random (Gen : Generator) return Uniformly_Distributed;
function Random_Integer (Gen : Generator;
Low, High : Integer) return Integer;
Obtains the ``next'' random number from the sequence associated with the
given generator, relative to its current state, according to an
implementation-defined algorithm. The result of Random is delivered as a
value of the subtype Uniformly_Distributed, which is a subtype of the
predefined type Float having a range of 0.0 .. 1.0. The result of
Random_Integer is delivered as a value of the predefined type Integer in the
range Low .. High; Constraint_Error is raised if this is a null range. A
sufficiently long sequence of random numbers obtained by successive calls to
Random or Random_Integer is approximately uniformly distributed over its
range.
procedure Reset (Gen : in Generator;
Initiator : in Integer);
procedure Reset (Gen : in Generator);
Sets the state of the specified generator to a state which is an
implementation-dependent function of the value of the parameter Initiator
(or to a time-dependent state, if the parameter Initiator is not specified).
Implementation Requirements
Performance requirements for the random number generator, which apply only
in implementations conforming to the Numerics Annex, and then only in the
``strict'' mode defined there (see K.2), are given in K.2.5.
Documentation Requirements
No one algorithm for random number generation is best for all applications.
To enable the user to determine the suitability of the random number
generator for the intended application, the implementation has to describe
the algorithm used and has to give its period, if known exactly. or a lower
bound on the period, if the exact period is unknown.
The implementation also has to document the minimum time interval between
calls to Reset (without an Initiator parameter) which are guaranteed to
initiate different sequences.
Implementation Permissions
Examples of permissible algorithms for random number generation are:
o instances of classical linear congruential generators;
o combination generators (such as those of Wichmann and Hill or
L'Ecuyer);
o instances of the phenomenally long-period ``subtract-with-borrow'' or
``add-with-carry'' Fibonacci generators of Marsaglia and Zaman.
Implementation Advice
Any storage associated with an object of type Generator should be reclaimed
on exit from the scope of the object.
If the generator period is sufficiently long in relation to the number of
distinct Initiator values, then each possible value of Initiator passed to
Reset should initiate a sequence of random numbers that does not, in a
practical sense, overlap the sequence initiated by any other value. If this
is not possible, then the mapping between Initiator values and generator
states should be a rapidly varying function of the Initiator value.
NOTES
If two or more tasks are to share the same generator, then the tasks have
to synchronize their access to the generator as for any shared variable
(see 9.10).
Within a given implementation, a repeatable random number sequence can be
obtained by relying on the implicit initialization of generators or by
explicitly initializing a generator with a repeatable initiator value.
Different sequences of random numbers can be obtained from a given
generator in different program executions by explicitly initializing the
generator without an initiator value.
Whether Random can actually deliver any particular value in its range is
implementation dependent. Applications should assume that the bounds of
the range, or values sufficiently close to them to behave
indistinguishably from them, can occur. Exponentially distributed
(floating point) random numbers with mean and standard deviation 1.0 can
be obtained by evaluating either the expression -Log(Random(G)) or the
expression -Log(1.0 - Random(G)). Note that Log raises
Numerics.Argument_Error when its parameter is zero, so the expression
should be computed as the return expression of a function that has a
handler for Argument_Error; the handler should return Float'Last, or some
other large value appropriate to the application. The value 1.0
potentially poses problems for the conversion of the output of Random
into some integer range; when random integers are required,
Random_Integer should be used directly, because it avoids those potential
problems.
Supplementing the above material will be new material, in the Numerics Annex,
defining the child package. It would be appropriate to include that material
as clause K.2 and to renumber the current clause K.2 and those following as K.3
and higher. The new clause might read as follows.
K.2 Advanced Features of the Random Number Generator
Advanced facilities for the generation of pseudo-random numbers are provided
by the package Numerics.Random_Numbers.Extensions. These facilities are
concerned with saving and restoring the internal states of generators,
either in a direct form for checkpointing and restarting a generator or in a
string form (which may be convenient for interactive debugging of an
application involving random numbers).
Static Semantics
The library package Numerics.Random_Numbers.Extensions has the following
declaration:
package Ada.Numerics.Random_Numbers.Extensions is
type State is private;
procedure Save (Gen : in Generator;
To_State : out State);
procedure Reset (Gen : in Generator;
From_State : in State);
Max_Image_Width : constant := ;
function Image (Of_State : State) return String;
function Value (Coded_State : String) return State;
-- Other operations on generators and states, at the implementor's
-- discretion
private
-- implementation defined
end Ada.Numerics.Random_Numbers.Extensions;
A value of the private type State represents the internal state of a
generator. Operations on the State type can be used to obtain, restore, or
examine the state of a generator. The semantics of these operations are
defined below.
procedure Save (Gen : in Generator;
To_State : out State);
procedure Reset (Gen : in Generator;
From_State : in State);
Save obtains the current state of a generator. Reset gives a generator the
specified state. A generator that is reset to a state previously obtained
by invoking Save is restored to the state it had when Save was invoked.
function Image (Of_State : State) return String;
function Value (Coded_State : String) return State;
Image provides a representation of a state coded (in an
implementation-defined way) as a string whose length is bounded by the value
of Max_Image_Width. Value is the inverse of Image: Value(Image(S)) = S,
while taking the Value of a string that is not the image of any state raises
Constraint_Error.
Implementation Requirements
The mapping of states to strings performed by Image shall be one-to-one.
Documentation Requirements
Implementations shall document the mapping of states to strings and the
mapping of strings to states.
Implementation Permissions
Implementations may add to Numerics.Random_Numbers.Extensions other
facilities for manipulating states. A candidate for such a facility is an
alternative to Value that constructs a state from data presented in a more
convenient form than a string (for example, in the form of an array of
integers or floating point numbers).
By including parameters of the random number generation algorithm in its
state information, an implementation can provide its users with a
rudimentary way of altering the algorithm's parameters (e.g., for purposes
of experimentation), given only the subprograms defined above.
Nevertheless, implementations might wish to provide facilities (in the form
of additional subprograms) for more conveniently altering those parameters.
On the other hand, implementations may prefer not to offer any facility for
altering the parameters of the generation algorithm, since it is not
generally possible to guarantee properties of the algorithm given arbitrary
parameters.
I do not show in this LSN the updates for what is currently clause K.2.5, which
discusses the performance requirements for the random number generator. These
are straightforward.
2. Example Implementations
======= ===============
I promised quite some time ago to post example implementations that meet the
performance requirements of the Numerics Annex. Two are included in this
section, one for a linear congruential generator and one for a
subtract-with-borrow Fibonacci generator.
Please note that I developed and tested my original implementations in Ada 83.
The Ada 9X versions presented below have not been tested in Ada 9X and could
well contain errors related to the shift of environments (as well as to the
changes in packaging, which I've tried to accommodate but have not tested).
The underlying algorithms are the same, however. Also, Image and Value were
added to the implementation only recently and have not been tested at all.
The implementation notes below refer only to the behavior of Random. In all
cases, I have implemented Random_Integer in terms of Random. This was a late
addition, and in fact it is not at the moment fully conforming, since it can
raise Constraint_Error if the range Low .. High includes more than Integer'Last
values (i.e., if High - Low + 1 exceeds Integer'Last). Though such wide result
ranges probably do not occur in practice, it seems undesirable to legislate
against them. My implementation uses the clever expression provided by Robert
Eachus, but it does have the problem cited here with excessively wide ranges,
and I haven't had time to search for a circumvention of that problem.
2.1. Linear Congruential Generator
====== ============ =========
The venerable multiplicative linear congruential generator with a multiplier
of 16_807 (7**5), no additive constant, and a modulus of 2_147_483_647
(2**31-1) is presented below. This generator has a period of 2**31-2. It can
NOT generate the value 0.0 (an LCG with an additive constant COULD generate
0.0), but (as written) it CAN generate the value 1.0 if
2_147_483_646.0 / 2_147_483_647.0 rounds up to 1.0 when the result is coerced
to Float. Additional notes on modifying it to avoid generating 1.0 can be
found later in this section.
It is customary to perform integer arithmetic in this algorithm, using 32-bit
integers, and convert to floating point just before the end. In the customary
implementation, the state is an integer between 1 and 2**31-2. A new state is
obtained from the current state by integer arithmetic operations (that in
general require multiple precision arithmetic). The result is obtained by
converting the new state to floating point and dividing by a constant (e.g.,
one less than the modulus) to normalize the range so that its maximum value is
1.0 (or by the modulus itself, or an even larger value, to give a maximum of
less than 1.0).
My implementation, like the one in the Boeing Computer Services library (I
believe), does not use integer arithmetic and does not represent the state as a
32-bit integer. Instead, the state is a floating point number with a precision
of at least 31 bits, so that the integers from 1.0 to 2_147_483_646.0 can all
be represented exactly. The arithmetic that is performed to obtain the new
state creates a product that is the floating point representation of an integer
value requiring at most 46 bits of precision. Thus, customary 64-bit double
precision hardware suffices.
Resetting the state from an initiator has to allow all Integer values and has
to map the initiator value into a state value in the range 1 .. 2**31-2. This
is accomplished with an integer mod and an addition. I have chosen to generate
and throw away five random numbers after setting the state so that the first
random number called for by the user after setting the state with an initiator
will not appear highly correlated with the initiator value.
Resetting the state without an initiator computes a state using the current
value of the system clock. The values obtained by Splitting the value returned
by Clock are used in an appropriate mixed-radix computation to yield a state
value between 26_000_700 and 1_873_119_999. This version of Reset yields
different states for two calls spaced at least a second and not more than fifty
years apart.
The realization offered here ASSUMES that the type Float corresponds to double
precision, as it does in Verdix implementations of Ada 83 (at least the ones
I've used). This realization is the simplest. It uses the type Integer
(ASSUMED to be at least 32 bits, again OK in Verdix implementations) for an
intermediate calculation. The conversion to and from Integer is only
incidental, not required; notes provided later discuss how it can be removed.
One can imagine how the realization shown below can be made more portable, if
desired, by removing the assumptions on Float and Integer, by declaring types
with the required ranges and precisions, and by including the appropriate
explicit conversions among these types. Such a realization would only require
that the implementation HAVE predefined types with the required precision and
ranges, and would not require that they be named Float and Integer. I won't
amplify further on this here.
Here is a realization of Numerics.Random_Numbers incorporating a linear
congruential generator:
with Ada.Finalization; use Ada.Finalization;
package Ada.Numerics.Random_Numbers is
type Generator is limited private;
subtype Uniformly_Distributed is Float range 0.0 .. 1.0;
function Random (Gen : Generator) return Uniformly_Distributed;
function Random_Integer (Gen : Generator;
Low, High : Integer) return Integer;
procedure Reset (Gen : in Generator;
Initiator : in Integer);
procedure Reset (Gen : in Generator);
private
type Access_State is access Float;
Initial_State : Float := 30_000_000.0; -- Fixed but arbitrary.
type Generator is new Limited_Controlled with
record
State : Access_State := new Float'(Initial_State);
end record;
procedure Finalize (Gen : in out Generator);
Modulus : constant := 2_147_483_647.0; -- 2**31-1
Modulus_M1 : constant := Modulus - 1.0; -- 2**31-2
-- These two named numbers are in the private part only because of the
-- desire to use Modulus_M1 in the child package
-- Ada.Numerics.Random_Numbers.Extensions, rather than repeat its value
-- there.
end Ada.Numerics.Random_Numbers;
with Calendar; use Calendar;
with Unchecked_Deallocation;
package body Ada.Numerics.Random_Numbers is
Multiplier : constant := 16_807.0: -- 7**5
Int_Mod_M1 : constant Integer := Integer(Modulus_M1);
procedure Destroy_State is
new Unchecked_Deallocation (Float, Access_State);
function Random (Gen : Generator) return Uniformly_Distributed is
T, U : Float;
I : Integer;
begin
T := Gen.State.all * Multiplier; -- an integer of at most 46 bits
I := Integer(T / Modulus); -- integer part of T / Modulus; may be
-- too big by 1 (i.e., if rounded up)
U := T - Float(I) * Modulus; -- T mod Modulus; may be too small by
-- Modulus (i.e., if I is too big by 1),
-- in which case it will be negative
if U < 0.0 then
U := U + Modulus; -- now T mod Modulus
end if;
Gen.State.all := U;
return U / Modulus;
end Random;
function Random_Integer (Gen : Generator;
Low, High : Integer) return Integer is
Spread : Positive := High - Low + 1;
-- Above propagates Constraint_Error if it overflows or if
-- Low > High.
begin
return Low + Integer(Float(Spread) * Random(G)) mod Spread;
end Random_Integer;
procedure Reset (Gen : in Generator; Initiator : in Integer) is
T : Float;
begin
Gen.State.all := Float(Initiator mod Int_Mod_M1 + 1);
for J in 1 .. 5 loop
T := Random(Gen);
end loop;
end Reset;
procedure Reset (Gen : in Generator) is
Yr : Year_Number;
Mo : Month_Number;
Dy : Day_Number;
Se : Day_Duration;
S : Natural range 0 .. 86_400;
Sec : Natural range 0 .. 59;
Min : Natural range 0 .. 59;
Hr : Natural range 0 .. 23;
T : Natural;
begin
Split(Clock, Yr, Mo, Dy, Se);
S := Natural(Se);
Sec := S mod 60;
S := S / 60;
Min := S mod 60;
Hr := S / 60;
T := ((((Sec*60 + Min)*24 + Hr)*32 + Dy)*13 + Mo)*50 +
(Yr mod 50) + 26_000_000;
Gen.State.all := Float(T);
end Reset;
procedure Finalize (Gen : Generator) is
begin
Destroy_State (Gen.State);
end Finalize;
end Ada.Numerics.Random_Numbers;
The integer calculations can be omitted from Random in several ways. One can
write its body as follows:
T := Gen.State.all * Multiplier;
U := T - Float'Floor(T / Modulus) * Modulus;
if U < 0.0 then
U := U + Modulus;
end if;
Gen.State.all := U;
return U / Modulus;
Unfortunately, the if-test on U is still required because T / Modulus can
conceivably round up to an integer before its Floor is taken. (Some heavy
numerical analysis might prove that this can't happen, however.)
Alternatively, one can write, of course,
U := Float'Remainder(Gen.State.all * Multiplier, Modulus);
if U < 0.0 then
U := U + Modulus;
end if;
Gen.State.all := U;
return U / Modulus;
but unless Float'Remainder is computed in hardware, this is unlikely to be
as efficient as the previous alternatives. (The if-test in this case converts
the remainder, which may be negative, to a mod. Note that the remainder cannot
be zero.)
One can guarantee that the value 1.0 will never be returned, and that the
maximum return value will be any desired value less than 1.0, by appropriately
increasing the divisor in the return statement of Random. In a portable
implementation, that divisor can even be determined empirically with an
appropriate computation in the statement sequence of Numerics.Random_Numbers.
One could, if one chose, substitute Modulus_M1 for Modulus as the divisor in
the return statement, which would GUARANTEE that 1.0 would be returned if
sufficiently many calls are made on the Random. None of these perturbations of
the code introduce any significant bias in the results, at least as measured by
the craps tests.
Here is a realization of the part of Numerics.Random_Numbers.Extensions that is
required of all implementations supporting the Numerics Annex. The State type
is just a copy of Float, so states can be saved and restored by simple
assignments (with no-op explicit conversions). Since state values are
floating point representations of integers, I have chosen to represent the
image of a state as an integer image. This isn't necessary, of course, but it
may be more convenient for the user, and it is certainly easy to implement.
package Ada.Numerics.Random_Numbers.Extensions is
type State is private;
procedure Save (Gen : in Generator;
To_State : out State);
procedure Reset (Gen : in Generator;
From_State : in State);
Max_Image_Width : constant := Integer'Width;
function Image (Of_State : State) return String;
function Value (Coded_State : String) return State;
private
type State is new Float;
end Ada.Numerics.Random_Numbers.Extensions;
package body Ada.Numerics.Random_Numbers.Extensions is
procedure Save (Gen : in Generator; To_State : out State) is
begin
To_State := State(Gen.State.all);
end Save;
procedure Reset (Gen : in Generator; From_State : in State) is
begin
Gen.State.all := Float(From_State);
end Reset;
function Image (Of_State : State) return String is
begin
return Integer'Image(Integer(Of_State));
end Image;
function Value (Coded_State : String) return State is
T : State range 1.0 .. Modulus_M1;
begin
T := State(Integer'Value(Coded_State));
-- Propagate Constraint_Error if raised by Integer'Value or by
-- the constraint check in the assignment to T. The latter occurs
-- when the integer in the Coded_State is outside the allowed
-- range (min = 1; max = 2**31-2).
return T;
end Value;
end Ada.Numerics.Random_Numbers.Extensions;
2.2. Subtract-With-Borrow Fibonacci Generator
==================== ========= =========
A subtract-with-borrow Fibonacci generator is realized below. The particular
instance of this type of generator that I have chosen is one whose state is a
vector of 25 floating point numbers (in fact, the 25 previous outputs) and a
"borrow" value, which is also a floating point number. Each of the 25 numbers
in the vector is a scaled integer, an integer in the range 0.0 .. 2.0**24-1.0
scaled down by (i.e., divided by) 2.0**24, as is the borrow value. The maximum
output value is therefore just under 1.0; it is 1.0-2.0**(-24). Zero is a
possible output value. The borrow value is always either 0.0 or 2.0**(-24).
These numbers are all exactly representable in a floating point type with at
least 24 bits of precision.
The "next" output value is obtained by subtracting the 11th previous output
value and the borrow value from the 25th previous output value, adding 1.0 if
the result is negative. (Thus, this generator has "lags" of 25 and 11.) The
result is again in the range 0.0 .. 1.0-2.0**(-24). The vector of 25 previous
outputs is managed as a cyclical list, with a pair of indices identifying the
11th and 25th previous outputs. The new output value replaces the 25th (i.e.,
the oldest) previous output value in the list, becoming the youngest, and the
two indices are adjusted (they are decremented, wrapping around when they reach
the beginning of the list; modular arithmetic is used). A new borrow value is
also computed; it is 2.0**(-24) if 1.0 was added in obtaining the new output
value, and 0.0 otherwise.
Different instances of this kind of generator have different lags and/or
vectors of different sizes. There are also analogous add-with-carry Fibonacci
generators.
The realization shown here ASSUMES, once again, that Float has adequate
precision. In this case, even single precision suffices (on IEEE machines).
Note that neither the output value nor any intermediate result leading to it is
ever held in Integer form. There is no conversion between Float and Integer at
any point, and there are no multiplications or divisions. This generator is
significantly faster than the linear congruential generator. Provided that the
generator is not started from a state in which all the components of the
vector, as well as the borrow value, are zero, or all the components of the
vector are 1.0-2.0**(-24), and the borrow value is 2.0**(-24), both of which
reproduce the same output indefinitely, the generator has a period of about
10**178. (Compare this to the period of the linear congruential generator
given previously.)
It is permissible to initialize all 25 vector components to the same value, and
the borrow value arbitrarily, but that is unwise (the first generated output
will be either 0.0 or 1.0-2.0**(-24), depending on the initial borrow value;
this is not very random). Note that the initial values have to be among the
set of permissible floating point values (the set of scaled integers described
previously). One could write out the initial value as a suitably "random
looking" static aggregate, but what I do is to compute an initial value (in the
statement sequence of the package body), which requires that the body of
Numerics.Random_Numbers be elaborated before any unit that uses the random
number generator; this is achieved by including an Elaborate_Body pragma in the
spec. The 25 vector components are built up bit by bit, randomly, using an
internal linear congruential generator (based on the same algorithm as the one
shown previously); the initial borrow is just set arbitrarily to zero.
Here is a realization of Ada.Numerics.Random_Numbers incorporating the
Fibonacci generator described above.
with Ada.Finalization; use Ada.Finalization;
package Ada.Numerics.Random_Numbers is
pragma Elaborate_Body;
type Generator is limited private;
subtype Uniformly_Distributed is Float range 0.0 .. 1.0;
function Random (Gen : Generator) return Uniformly_Distributed;
function Random_Integer (Gen : Generator;
Low, High : Integer) return Integer;
procedure Reset (Gen : in Generator;
Initiator : in Integer);
procedure Reset (Gen : in Generator);
private
Larger_Lag : constant := 25;
Smaller_Lag : constant := 11;
type Lag_Range is mod Larger_Lag;
type State_Vector is array (Lag_Range) of Float;
type Internal_State is
record
Lagged_Outputs : State_Vector;
Borrow : Float;
R, S : Lag_Range;
end record;
type Access_State is access Internal_State;
Initial_State : Internal_State;
type Generator is new Limited_Controlled with
record
State : Access_State := new Internal_State'(Initial_State);
end record;
procedure Finalize (Gen : in out Generator);
end Ada.Numerics.Random_Numbers;
with Calendar; use Calendar;
with Unchecked_Deallocation;
package body Ada.Numerics.Random_Numbers is
procedure Destroy_State is
new Unchecked_Deallocation (Internal_State, Access_State);
-- The following function is used in this implementation to produce
-- a valid internal state for the Fibonacci generator based on an
-- integer that is a valid internal state for a linear congruential
-- generator. It uses the latter to generate random bits with which
-- to initialize the state vector.
function Make_Internal_State (Starter : Integer) return Internal_State is
Bit_Value : Float;
T : State_Vector;
LCG_State : Float;
LCG_Multiplier : constant := 16_807.0;
LCG_Modulus : constant := 2_147_483_647.0;
function LCG_Random return Uniformly_Distributed is
T : Float;
I : Integer;
begin
T := LCG_State * LCG_Multiplier;
I := Integer(T / LCG_Modulus);
LCG_State := T - Float(I) * LCG_Modulus;
if LCG_State < 0.0 then
LCG_State := LCG_State + LCG_Modulus;
end if;
return LCG_State / LCG_Modulus;
end LCG_Random;
begin
LCG_State := Starter;
for I in Lag_Range loop
State_Vector(I) := 0.0;
Bit_Value := 1.0;
for J in 1 .. 24 loop
Bit_Value := Bit_Value * 0.5;
if LCG_Random >= 0.5 then
State_Vector(I) := State_Vector(I) + Bit_Value;
end if;
end loop;
end loop;
return (Lagged_Outputs => State_Vector,
Borrow => 0.0, -- arbitrary
R => Larger_Lag - 1,
S => Smaller_Lag - 1);
end Make_Internal_State;
function Random (Gen : Generator) return Uniformly_Distributed is
U : Float;
begin
U := Gen.State.Lagged_Outputs(Gen.State.R) -
Gen.State.Lagged_Outputs(Gen.State.S) -
Gen.State.Borrow;
if U < 0.0 then
U := U + 1.0;
Gen.State.Borrow := 2#1.0#e-24;
else
Gen.State.Borrow := 0.0;
end if;
Gen.State.Lagged_Outputs(Gen.State.R) := U;
Gen.State.R := Gen.State.R - 1;
Gen.State.S := Gen.State.S - 1;
return U;
end Random;
function Random_Integer (Gen : Generator;
Low, High : Integer) return Integer is
Spread : Positive := High - Low + 1;
-- Above propagates Constraint_Error if it overflows or if
-- Low > High.
begin
return Low + Integer(Float(Spread) * Random(G)) mod Spread;
end Random_Integer;
procedure Reset (Gen : in Generator; Initiator : in Integer) is
begin
Gen.State.all := Make_Internal_State(Initiator mod Int_Mod_M1 + 1);
end Reset;
procedure Reset (Gen : in Generator) is
Yr : Year_Number;
Mo : Month_Number;
Dy : Day_Number;
Se : Day_Duration;
S : Natural range 0 .. 86_400;
Sec : Natural range 0 .. 59;
Min : Natural range 0 .. 59;
Hr : Natural range 0 .. 23;
T : Natural;
begin
Split(Clock, Yr, Mo, Dy, Se);
S := Natural(Se);
Sec := S mod 60;
S := S / 60;
Min := S mod 60;
Hr := S / 60;
T := ((((Sec*60 + Min)*24 + Hr)*32 + Dy)*13 + Mo)*50 +
(Yr mod 50) + 26_000_000;
Gen.State.all := Make_Internal_State(T);
end Reset;
procedure Finalize (Gen : in out Generator) is
begin
Destroy_State (Gen.State);
end Finalize;
begin
Initial_State := Make_Internal_State(30_000_000);
end Ada.Numerics.Random_Numbers;
Here is a realization of the part of Numerics.Random_Numbers.Extensions that is
required of all implementations supporting the Numerics Annex. The State type
is just a copy of Internal_State, so states can be saved and restored by simple
assignments (with no-op explicit conversions). The image of a state is the
concatenation of several components separated by commas. In order from left to
right, the components represent the oldest to the youngest components of the
state vector, followed by the borrow value. Since the components of a state
vector are scaled integers, as is a borrow value, I have chosen to represent
each component as an integer image. Because of the ordering of state
components in the image of a state, the R and S components of a state do not
need to be recorded in the image of the state. Note that the oldest component
of the state vector is the one whose index is R; younger components have
"smaller" indices, which wrap around after reaching zero, so that the youngest
component has index R+1. Ada.Strings.Bounded is used in the manipulation of
state images.
package Ada.Numerics.Random_Numbers.Extensions is
type State is private;
procedure Save (Gen : in Generator;
To_State : out State);
procedure Reset (Gen : in Generator;
From_State : in State);
Max_Image_Width : constant := (Larger_Lag + 1)*(Integer'Width + 1) - 1;
function Image (Of_State : State) return String;
function Value (Coded_State : String) return State;
private
type State is new Internal_State;
end Ada.Numerics.Random_Numbers.Extensions;
with Ada.Strings.Bounded; use Ada.Strings.Bounded;
package body Ada.Numerics.Random_Numbers.Extensions is
package Bounded_Length_Strings is
new Generic_Bounded_Length (Max_Image_Width + 1);
use Bounded_Length_Strings;
procedure Save (Gen : in Generator; To_State : out State) is
begin
To_State := State(Gen.State.all);
end Save;
procedure Reset (Gen : in Generator; From_State : in State) is
begin
Gen.State.all := Internal_State(From_State);
end Reset;
function Image (Of_State : State) return String is
Result : Bounded_String;
function Encode (Value : Float) return String is
begin
return Integer'Image(Integer(2#1.0#e24 * Value));
end Encode;
begin
Result := Null_Bounded_String;
for I in Lag_Range loop
Result :=
Result & Encode(Of_State.Lagged_Outputs(Of_State.R - I)) & ',';
end loop;
Result := Result & Encode(Of_State.Borrow);
return To_String(Result);
end Image;
function Value (Coded_State : String) return State is
use Ada.Strings;
Result : State;
T : Bounded_String;
procedure Decode_Component (Max : in Natural; Component : out Float) is
J : Natural;
K : Natural range 0 .. Max;
begin
J := Index(T, ',');
if J = 0 then -- Coded_State has too few commas
raise Constraint_Error; -- to be a valid state.
end if;
K := Integer'Value(Slice(T, 1, J-1));
-- Propagate Constraint_Error if raised by Integer'Value or by the
-- constraint check in the assignment to K. The latter occurs
-- when the integer in a component of the Coded_State is outside
-- the allowed range (min = 0; max = 2e24-1 for a component of the
-- state vector; max = 1 for the borrow component).
Component := Float(K) * 2#1.0#e-24;
Delete(T, 1, J);
end Decode_Component;
begin
begin
T := To_Bounded_String(Trim(Coded_State) & ',');
exception
when Length_Error => -- The trimmed Coded_State is too long
raise Constraint_Error; -- to be a valid state.
end;
for I in reverse Lag_Range loop
Decode_Component(2e24-1, Result.Lagged_Outputs(I));
end loop;
-- If Coded_State is well formed, exactly one comma remains in T
-- (the one tacked on at the end when T was initialized).
Decode_Component(1, Result.Borrow);
-- If Coded_State is well formed, the above consumes the rest of T.
if T /= Null_Bounded_String then -- Coded_State has too many commas
raise Constraint_Error; -- to be a valid state.
end if;
Result.R := Larger_Lag - 1;
Result.S := Smaller_Lag - 1;
return Result;
end Value;
end Ada.Numerics.Random_Numbers.Extensions;
3. Discussion of Major Alternatives
========== == ===== ============
I begin this section by discussing the rationale for the default position.
It is widely accepted that the majority of applications of random numbers will,
in fact, want random integers. An integer generator (Random_Integer) is
provided not because it is so difficult to provide this functionality otherwise
(i.e., if one only has the floating point generator, Random) but because of the
desire to make it extremely convenient for the user to access this
functionality. The significant majority of the comments on ada9x-mrt about
this issue have favored the inclusion of an integer generator.
The issue of the range of the floating point generator, Random--in particular,
whether it should exclude 1.0 (and, if so, by how much)--is mostly a separate
issue; however, the presence or absence of an integer generator does have a
bearing on the importance of the issue. I had originally sought uniformity
with Fortran's informal statement that the random number generator never
delivers 1.0, but there have been repeated suggestions that that alone is not
good enough. Some numbers close to 1.0 might behave in some contexts just like
1.0; indeed, a number slightly under 1.0 might be delivered as a value in an
extended register, which subsequently rounds up to 1.0 if it has to be
shortened to the storage format. I corresponded with Stuart Anderson of
Boeing, author of survey articles on random number generation, as early as
November 1992 about the conflict between the desire to avoid 1.0 and the
compromises that are required to do so.
I have discussed on ada9x-mrt the techniques that can be used to guarantee that
the result remains less than 1.0 (and behaves like a value less than 1.0), but
they involve a varying degree of compromise. In the case of a linear
congruential generator, there is no cost in time, because one just replaces the
divisor in the final step with a slightly larger divisor; however, the
Fibonacci generator normally performs no such division, so including one (if it
is required) represents a cost overhead. There have also been concerns that
avoiding the vicinity of 1.0 might bias the distribution somewhat, but I
believe that to be unobservable in practice. Then there were discussions about
how the vicinity of 1.0 should be avoided (if it is agreed that it should be):
by reducing the upper bound of Uniformly_Distributed, or by English phrases in
the description of Random? The most recent suggestion prior to this LSN was to
state that Random stayed sufficiently below 1.0 for certain expressions to
"work" reliably; one of these was involved with converting between floating
point random numbers and random integers, while the other was involved with
obtaining exponentially distributed random numbers.
Indeed, the desire to stay away from 1.0 was motivated largely by the problems
it posed for converting to random integers. Some of the techniques that one
can dream up do indeed fail if Random returns 1.0 or a number "sufficiently
close" to 1.0. But now that the user does not have to do that conversion, the
problem of returning 1.0 seems less important than before, in relation to the
compromises inherent in solving the problem. The semantic description of
Random therefore makes no statement about whether Random can return 1.0, and,
since 1.0 is in the range of the return subtype, the reader must conclude that
1.0 is a possible result (same for 0.0). This fact is underscored in the Notes
at the end of subclause C.5.2, which goes a step further to alert the user to
the problems that one or the other of the extreme values poses for transforming
the uniform distribution into an exponential distribution. The user is given
the necessary information to anticipate problems, with hints for their
solution, while at the same time the frequency of the potential problems is
greatly reduced by the provision of Random_Integer.
This position regarding the range of Random allows a wider range of natural
implementations.
I now discuss possible alternatives to the default position.
As I said above, the issue of the range of Random can be treated separately
from that of an integer generator. While retaining Random_Integer, one can
still adopt an alternative with regard to the range of Random. Among the
viable alternatives that we might choose are the following (I do not repeat all
their advantages and disadvantes here, since they have been cited often enough;
I merely list a menu of choices, with an occasional comment):
o Keep the range as it is now (the endpoints, 0.0 and 1.0, are possible
return values).
o Reduce the upper bound of Uniformly_Distributed to
1.0 - Float'Model_Epsilon. (This "looks awkward.")
o Leave Uniformly_Distributed alone, and go back to the Fortran rule
("never returns 1.0"). (Is this really any help?)
Note that the provision of Random_Integer eliminates the motivation for my
previous position (that is, that Random returns a value sufficiently less than
1.0 for certain conversion strategies to work correctly). However, even if
Random_Integer should be taken back out, there is no need to reconsider the
alternative represented by my previous position, because we now have a simple,
portable, reliable conversion technique (the one I actually used in the
implementations of Random_Integer in sections 2.1 and 2.2, credited to Robert
Eachus) that does not depend on Random's avoidance of 1.0.
The provision of Random_Integer is certainly not without its compromises and
warts. We might still want to consider whether it is worth having, or, if it
is agreed that it is, whether it should have a different form and/or different
semantics.
I do not yet know how to make an implementation of Random_Integer general
enough to accommodate any integer range, no matter how wide (for example, the
range Integer'First .. Integer'Last). There is no difficulty with its
implementation when the range is more modest, as I discussed and demonstrated
above. This problem hints at a broader issue, that of the weakness inherent in
assuming that Random_Integer should be implemented in terms of Random. I
return to this subject later.
One can argue that providing Random_Integer as a nongeneric function is not
very Ada-like. It would certainly be more in the spirit of Ada to make
Random_Integer generic with respect to the integer subtype of its result range.
This would make it unnecessary to pass the bounds on every call of the
function. One could go further and allow any discrete subtype (the function
should be named Random_Discrete in that case), so that one can conveniently
obtain random characters, random Boolean values, etc. I have avoided doing
this in the default recommendation only because of a strong desire to keep the
entire random number facility simple enough for the Fortran convert to
understand how to use it without having to learn new language concepts (like
generic instantiation). We might still decide that holding on to that goal
imposes too heavy a compromise on the "Ada-ness" of this part of the language.
When there was only Random, one could view objects of type Generator in terms
of a high-quality abstraction: they were associated with a sequence of random
numbers and had a state, which effectively captured the current position in
that sequence. Random was the operation that advanced the state and delivered
the next number from the sequence; other operations saved and restored states,
etc. This notion is weakened somewhat by the inclusion of Random_Integer, at
least if it shares the Generator type with Random. Is the sequence of random
numbers a floating point sequence or an integer sequence? One has to say that
it is both at the same time, and the two sequences correspond; one can ask
either for the floating point form of the next value in the sequence, or the
corresponding integer. But what integer corresponds to a given floating point
member of the sequence? That in turn depends on the range specified at the
time it is requested. It gets even muddier if one generalizes a generic
Random_Integer into Random_Discrete. I believe that the quality of the
abstraction can be upheld if the generator type is made a product of the
generic instantiation, along with all the operations on the type. This clearly
escalates the complexity, however, which is why I did not venture this far.
The form in which Random_Integer has been conceived has been recommended by
several people, including some on the MRT.
The problem with the weakening of the abstraction is even evident in the
confusion between viewing the type Generator as embodying the actual generator
(in an OO-like fashion) and then referring to Random_Integer as an "integer
generator," as I have done repeatedly in this LSN (but fortunately not in the
citations from the RM).
It is not too hard to imagine that an integer generator should be implemented
very differently from a floating point generator, suggesting that they should
indeed have distinct generator types (and distinct state types in the child
package). One might not want to use a floating point internal representation
for the former. Distinctly different implementation strategies would probably
ameliorate the problem with excessively wide integer ranges that I discussed
above.
Summarizing, here is a fairly complete menu of alternatives with respect to the
integer generator:
o Keep it just as is.
o Add a restriction on the width of the range that eases the burden on the
implementor and justifies the implementation I showed. (This would be a
wart on the language, however.)
o Make Random_Integer generic on an integer subtype.
o Make it generic on a discrete subtype and call it Random_Discrete.
(Though I haven't said so previously, it would probably be a good idea
to change Random to Random_Real or Random_Float in that case.)
o If made generic, make the generic unit a generic package that defines
the generator type and all the operations of the type.
o Abandon Random_Integer and just have Random. In view of Robert's clever
expression, this need not be a reason to go back to a position in which
Random avoids returning 1.0. However, in this case, I would expand the
last Note at the end of subclause C.5.2 so that it reads as follows:
Whether Random can actually deliver any particular value in its range
is implementation dependent. Applications should assume that the
bounds of the range, or values sufficiently close to them to behave
indistinguishably from them, can occur. Exponentially distributed
(floating point) random numbers with mean and standard deviation 1.0
can be obtained by evaluating either the expression -Log(Random(G)) or
the expression -Log(1.0 - Random(G)). Note that Log raises
Numerics.Argument_Error when its parameter is zero, so the expression
should be computed as the return expression of a function that has a
handler for Argument_Error; the handler should return Float'Last, or
some other large value appropriate to the application. The potential
problems that the value 1.0 poses for the conversion of the output of
Random into the integer range 0 .. N-1, for N > 0, can be avoided by
using the expression Integer(Float(N) * Random(G)) mod N to perform
this transformation; this expression works reliably even when Random
delivers 1.0.
However, I would say no more about "wide" integer ranges. The user who
tries to express a very wide range by subtracting a term from that
expression to shift both bounds left, and then returning the upper bound
to its former position by increasing N by the same value, would risk
producing a subexpression that overflows, but at least the cause of the
overflow would be patently obvious in the user's program (rather than
hidden away in a subprogram in a library). In any case, I should stress
again that I do not think such wide ranges are common in practice, so
leaving the problem unaddressed is probably realistic.
4. A Test Program that Implements the Craps Tests
= ==== ======= ==== ========== === ===== =====
In this section I include a complete, working program (promised some time ago)
to implement the craps tests described in RM9X-K.2.5;4.0. I wrote it initially
several months ago (for Ada 83), but I have revised it to be a little more
sophisticated, and of course to include the correct context clause for Ada 9X.
This program plays a million craps games, using random numbers to simulate the
rolls of a pair of dice, and counts and reports various occurrences of things.
Chi-square statistics are computed for the occurrences and compared to
tabulated values. Statistics are reported for the distribution of wins and
losses, distribution of outcomes of the roll of a single die (equidistribution
test), distribution of game lengths, and distribution of pass lengths (number
of consecutive games won before a loss).
The program generates a different sequence of random numbers each time it is
run. Therefore, its behavior is not repeatable. To obtain repeatable
behavior, comment out the call to Reset.
On a Sequent Symmetry (Intel 80386/387), the program takes about 15 minutes to
run with a linear congruential generator and about 11 1/2 minutes to run with a
subtract-with-borrow Fibonacci generator. It prints a progress message after
every 50,000 games. For experimentation, one can reduce the number of games
to, say, 100,000. The tail of the distributions of game lengths and pass
lengths become too small to be statistically meaningful with so "few" (!) games
played, but the program has been adapted to lump observations in the tail of
the distribution to maintain statistical significance.
The program expects no input (the number of games to be played really should be
an input parameter) and writes its output on standard output. An example of
the output for a subtract-with-borrow Fibonacci generator follows:
Length Probability of a game of the given length
1 0.3333333
2 0.1882716
3 0.1347737
4 0.0965673
5 0.0692571
6 0.0497177
7 0.0357251
8 0.0256954
9 0.0184993
10 0.0133315
11 0.0096166
12 0.0069437
13 0.0050186
14 0.0036307
15 0.0026292
16 0.0019058
17 0.0013827
18 0.0010041
19 0.0007299
20 0.0005311
> 20 0.0014356
Length Probability of a pass of the given length (# of wins)
0 0.50707071
1 0.24995001
2 0.12320768
3 0.06073267
4 0.02993691
5 0.01475678
6 0.00727405
7 0.00358559
8 0.00176744
9 0.00087122
10 0.00042945
11 0.00021169
12 0.00010435
13 0.00005144
14 0.00002535
> 14 0.00002465
Starting to play 1000000 craps games
Played 50000 games
Played 100000 games
Played 150000 games
Played 200000 games
Played 250000 games
Played 300000 games
Played 350000 games
Played 400000 games
Played 450000 games
Played 500000 games
Played 550000 games
Played 600000 games
Played 650000 games
Played 700000 games
Played 750000 games
Played 800000 games
Played 850000 games
Played 900000 games
Played 950000 games
Played 1000000 games
Number of rolls of a single die: 6763638
Number of craps games played: 1000000
Number of passes counted: 506923
Distribution of wins and losses
Outcome Times expected Times observed
Win 492929.3 493077
Loss 507070.7 506923
Calculated chi-square for distribution of wins and losses: 0.087
Tabulated value for 1 degree of freedom (alpha=0.05): 3.841
The generator PASSES this test
Distribution of dice rolls
Roll Times expected Times observed
1 1127273.0 1127746
2 1127273.0 1128403
3 1127273.0 1128030
4 1127273.0 1125209
5 1127273.0 1126770
6 1127273.0 1127480
Calculated chi-square for distribution of dice rolls: 5.881
Tabulated value for 5 degrees of freedom (alpha=0.05): 11.070
The generator PASSES this test
Distribution of game lengths
Length Times expected Times observed
1 333333.3 332785
2 188271.6 188264
3 134773.7 134725
4 96567.3 95946
5 69257.1 69560
6 49717.7 50193
7 35725.1 35825
8 25695.4 25792
9 18499.3 18505
10 13331.5 13366
11 9616.6 9583
12 6943.7 7068
13 5018.6 5072
14 3630.7 3675
15 2629.2 2546
16 1905.8 1991
17 1382.7 1400
18 1004.1 1022
19 730.9 713
20 531.1 486
> 20 1435.6 1483
Calculated chi-square for distribution of game lengths: 27.735
Tabulated value for 20 degrees of freedom (alpha=0.05): 31.410
The generator PASSES this test
Distribution of pass lengths
Length Times expected Times observed
0 257045.8 256843
1 126705.4 126665
2 62456.8 62748
3 30786.8 30847
4 15175.7 15111
5 7480.6 7438
6 3687.4 3660
7 1817.6 1808
8 896.0 922
9 441.6 465
10 217.7 205
11 107.3 119
12 52.9 42
13 26.1 25
14 12.9 14
> 14 12.5 11
Calculated chi-square for distribution of pass lengths: 8.997
Tabulated value for 15 degrees of freedom (alpha=0.05): 24.996
The generator PASSES this test
The generator PASSED all the tests
In other runs, the four calculated chi-square values were as follows (the
preceding run is relisted as run 1 below, and the tabulated values are listed
in the last row):
Run Wins & losses Dice rolls Game lengths Pass lengths
1 0.087 5.881 27.735 8.997
2 0.000 3.982 14.858 13.997
3 0.000 5.745 27.386 19.930
4 0.227 7.285 20.860 10.251
5 1.895 2.814 25.102 27.773
table 3.841 11.070 31.410 24.996
It is evident that run 5 did not pass the pass lengths test. It is in the
nature of this kind of statistic that the tabulated chi-square value will be
exceeded occasionally.
The game-length and pass-length tests are extremely demanding and are capable
of revealing subtle long-term correlations in the random numbers (apparently).
Here are the results of five runs made with a subtract-with-borrow Fibonacci
generator employing a state vector of 24 components and lags of 24 and 10
(another combination reported in the literature to be "good"). This generator
fails more often, including once with respect to wins & losses.
Run Wins & losses Dice rolls Game lengths Pass lengths
1 1.198 3.494 37.722 18.782
2 0.331 2.101 24.552 31.410
3 0.034 2.028 41.908 20.283
4 0.877 3.097 31.231 22.611
5 4.536 2.677 27.444 20.360
table 3.841 11.070 31.410 24.996
Here are the results of five runs made with a subtract-with-borrow Fibonacci
generator employing a state vector of 28 components and lags of 28 and 8
(another combination reported in the literature to be "good"). This generator
also has occasional failures.
Run Wins & losses Dice rolls Game lengths Pass lengths
1 0.017 3.124 41.214 19.604
2 1.181 4.587 23.858 15.410
3 0.115 3.368 16.370 7.436
4 1.718 1.698 39.358 16.336
6 0.117 3.725 15.407 15.600
table 3.841 11.070 31.410 24.996
The final subtract-with-borrow Fibonacci generator from the literature that I
tested has a state vector of 39 components and lags of 39 and 25. Here are its
results, which also exhibit occasional failures.
Run Wins & losses Dice rolls Game lengths Pass lengths
1 0.031 5.952 21.127 8.658
2 0.744 4.329 34.678 6.455
3 0.026 3.916 20.979 10.745
4 0.004 11.401 19.599 21.728
5 0.214 5.610 20.530 18.542
table 3.841 11.070 31.410 24.996
Finally, here are the results of five runs of the linear congruential generator
whose implementation was shown above. No failures occurred for this series of
runs.
Run Wins & losses Dice rolls Game lengths Pass lengths
1 0.638 0.816 28.499 9.378
2 0.528 4.066 27.787 11.224
3 0.005 6.278 17.311 20.863
4 0.125 2.108 10.165 12.300
5 0.245 4.098 15.073 12.231
table 3.841 11.070 31.410 24.996
The complete code for Craps_Tests follows:
with Ada.Numerics.Random_Numbers; use Ada.Numerics.Random_Numbers;
with Text_IO; use Text_IO;
procedure Craps_Tests is
-- This test program written by Ken Dritz, Argonne National Laboratory.
-- Adapted January 30, 1994 from earlier versions.
-- Except for the first context clause above, this program is written
-- entirely in Ada 83. It does not use the Max and Min attributes where
-- it could, and it does not use the Random_Integer function to represent
-- dice rolls (but rather converts the output of Random to an integer value
-- in the range 1 .. 6). The only subprograms used from Random_Numbers are
-- Random and Reset (without an initiator).
-- The floating point type used throughout this program is Float. It has
-- been verified that the program is not sensitive to whether Float
-- corresponds to double or single precision.
-- Rules of craps:
--
-- If the player rolls 7 or 11 on the first roll (of a pair of dice), the
-- player wins.
--
-- If the player rolls 2, 3, or 12 on the first roll, the player loses.
--
-- Otherwise, call the number showing on the first roll of the dice the
-- "point." The player continues to roll. If the player rolls the point
-- again before rolling a 7, the player wins (otherwise the player loses).
subtype Roll_Outcome is Integer range 1 .. 6;
subtype Pair_Outcome is Integer range 2 .. 12;
Games_Played : Natural := 1000000;
-- The number of games to be played is at least the initial value of
-- Games_Played. After they are played, if the last game was not a loss,
-- a few more games are played (until a loss occurs), thus completing the
-- final "pass" (series of wins ending in a loss).
Passes_Counted : Natural := 0;
G : Generator;
-- The following are the probabilities of winning and losing a single game
-- of craps.
Prob_Of_Win : constant := 244.0 / 495.0;
Prob_Of_Loss : constant := 251.0 / 495.0;
-- The following are the probabilities of the outcome of each roll of a
-- pair of dice.
Prob_Of_Pair : constant array (Pair_Outcome) of Float :=
(1.0/36.0, 2.0/36.0, 3.0/36.0, 4.0/36.0, 5.0/36.0, 6.0/36.0,
5.0/36.0, 4.0/36.0, 3.0/36.0, 2.0/36.0, 1.0/36.0);
-- The following are the probabilities that a game has length i, for each
-- i from 1 to 20, or a length greater than or equal to 21. (This array
-- is initialized by code later.) Thus, the expected number of games of
-- length i in N games played is N * Prob_Of_Game_Of_Length(i).
Prob_Of_Game_Of_Length : array (1 .. 21) of Float;
-- The following are the probabilities that a pass has length i (i.e., that
-- it consists of i wins followed by a loss), for each i from 0 to 14, or a
-- length greater than or equal to 15. (This array is initialized by code
-- later.) Thus, the expected number of passes of length i in N passes
-- counted is N * Prob_Of_Pass_Of_Length(i).
Prob_Of_Pass_Of_Length : array (0 .. 15) of Float;
Number_Of_Rolls : Natural := 0;
Pass_Length : Natural := 0;
type Expected_Counts is array (Integer range <>) of Float;
type Observed_Counts is array (Integer range <>) of Natural;
Expected_Wins_And_Losses : Expected_Counts (1 .. 2);
Observed_Wins_And_Losses : Observed_Counts (1 .. 2) := (others => 0);
Expected_Rolls_Of_Kind : Expected_Counts (Roll_Outcome);
Observed_Rolls_Of_Kind : Observed_Counts (Roll_Outcome) := (others => 0);
Expected_Games_Of_Length : Expected_Counts (1 .. 21);
Observed_Games_Of_Length : Observed_Counts (1 .. 21) := (others => 0);
Expected_Passes_Of_Length : Expected_Counts (0 .. 15);
Observed_Passes_Of_Length : Observed_Counts (0 .. 15) := (others => 0);
Win_Or_Loss : array (1 .. 2) of String (1 .. 5) := (" Win", " Loss");
Float_Games : Float;
Float_Passes : Float;
Passes_All_Tests : Boolean := True;
package My_Int_IO is new Integer_IO (Integer);
use My_Int_IO;
package My_Float_IO is new Float_IO (Float);
use My_Float_IO;
function Roll return Roll_Outcome is
Outcome : Roll_Outcome;
begin
Number_Of_Rolls := Number_Of_Rolls + 1;
Outcome := 1 + Integer(6.0*Random(G)) mod 6;
Observed_Rolls_Of_Kind(Outcome) := Observed_Rolls_Of_Kind(Outcome) + 1;
return Outcome;
end;
procedure Record_Pass is
begin
if Pass_Length > 15 then
Pass_Length := 15;
end if;
Observed_Passes_Of_Length(Pass_Length) :=
Observed_Passes_Of_Length(Pass_Length) + 1;
Passes_Counted := Passes_Counted + 1;
Pass_Length := 0;
end;
procedure Record_Win is
begin
Observed_Wins_And_Losses(1) := Observed_Wins_And_Losses(1) + 1;
Pass_Length := Pass_Length + 1;
end;
procedure Record_Loss is
begin
Observed_Wins_And_Losses(2) := Observed_Wins_And_Losses(2) + 1;
Record_Pass;
end;
procedure Play_Crap_Game is
P1, P2 : Pair_Outcome;
Game_Length : Natural := 1;
begin
P1 := Roll + Roll;
if P1 = 7 or P1 = 11 then
Record_Win;
elsif P1 < 4 or P1 = 12 then
Record_Loss;
else
loop
P2 := Roll + Roll;
Game_Length := Game_Length + 1;
if P2 = P1 then
Record_Win;
exit;
elsif P2 = 7 then
Record_Loss;
exit;
end if;
end loop;
end if;
if Game_Length > 21
then Game_Length := 21;
end if;
Observed_Games_Of_Length(Game_Length) :=
Observed_Games_Of_Length(Game_Length) + 1;
end;
procedure Print_Chi_Square (Expected : in out Expected_Counts;
Observed : in out Observed_Counts;
Title : in String;
Col_Head : in String) is
Chi_Square : Float := 0.0;
Last_Row : Natural range Expected'Range;
Degrees : Natural;
Chi_Table : constant array (1 .. 30) of Float :=
( 3.841, 5.991, 7.815, 9.488, 11.070, 12.592, 14.067, 15.507,
16.919, 18.307, 19.675, 21.026, 22.362, 23.685, 24.996, 26.296,
27.587, 28.869, 30.144, 31.410, 32.671, 33.924, 35.172, 36.415,
37.652, 38.885, 40.113, 41.337, 42.557, 43.773);
begin
-- The expected counts are either all about the same (and sufficiently
-- large), or the ones near the end of the array approach zero. Make
-- sure that the expected counts near the end of the array are at least
-- 5.0 (for meaningful statistics). Do this by iteratively combining
-- rows at the end of the arrays, effectively shortening the array,
-- until the condition is achieved.
Last_Row := Expected'Last;
while Expected(Last_Row) < 5.0 loop
Expected(Last_Row - 1) := Expected(Last_Row - 1) + Expected(Last_Row);
Observed(Last_Row - 1) := Observed(Last_Row - 1) + Observed(Last_Row);
Last_Row := Last_Row - 1;
end loop;
Degrees := Last_Row - Expected'First;
-- Now print the expected vs. observed counts, simultaneously calculating
-- the chi square value.
Put ("Distribution of ");
Put_Line (Title);
Put (Col_Head);
Put_Line (" Times expected Times observed");
for I in Expected'First .. Last_Row loop
if Col_Head = "Outcome" then
Put (Win_Or_Loss(I));
else
if I < Last_Row or Col_Head = "Roll" then
Put (I, Width => 1 + Col_Head'Length/2);
else
Put ('>');
Put (I-1, Width => Col_Head'Length/2);
end if;
end if;
Set_Col (Col_Head'Length + 4);
Put (Expected(I), Fore => 9, Aft => 1, Exp => 0);
Set_Col (Col_Head'Length + 21);
Put (Observed(I), Width => 10);
New_Line;
Chi_Square :=
Chi_Square +
(Expected(I) - Float(Observed(I)))**2 / Expected(I);
end loop;
Put ("Calculated chi-square for distribution of ");
Put (Title);
Put (": ");
Set_Col (62);
Put (Chi_Square, Fore => 3, Aft => 3, Exp => 0);
New_Line;
Put ("Tabulated value for ");
Put (Degrees, Width => 1);
Put (" degree");
if Degrees > 1 then
Put ('s');
end if;
Put (" of freedom (alpha=0.05): ");
Set_Col (62);
Put (Chi_Table(Degrees), Fore => 3, Aft => 3, Exp => 0);
New_Line;
Put ("The generator ");
if Chi_Square <= Chi_Table(Degrees) then
Put ("PASSES");
else
Put ("FAILS");
Passes_All_Tests := False;
end if;
Put_Line (" this test");
New_Line;
end;
begin
-- Initialize and print the Prob_Of_Game_Of_Length array.
-- The probability that a game will have a length of 1 is the probability
-- of rolling a 7 (win), 11 (win), 2 (loss), 3 (loss), or 12 (loss) on the
-- first roll.
Prob_Of_Game_Of_Length(1) := Prob_Of_Pair(7) + Prob_Of_Pair(11) +
Prob_Of_Pair(2) + Prob_Of_Pair(3) +
Prob_Of_Pair(12);
Prob_Of_Game_Of_Length(21) := 1.0 - Prob_Of_Game_Of_Length(1);
-- Prob_Of_Game_Of_Length(21) will be reduced by the probability of a game
-- of each other specific length computed.
-- For I > 1, the probability that a game will have a length of I is the
-- probability of neither a win nor a loss on the first I-1 rolls times the
-- probability of either a win or a loss on the Ith roll.
for I in 2 .. 20 loop
Prob_Of_Game_Of_Length(I) := 0.0;
for Point in 4 .. 10 loop
if Point /= 7 then
Prob_Of_Game_Of_Length(I) := Prob_Of_Game_Of_Length(I) +
Prob_Of_Pair(Point) *
(1.0 - (Prob_Of_Pair(Point) + Prob_Of_Pair(7))) ** (I - 2) *
(Prob_Of_Pair(Point) + Prob_Of_Pair(7));
end if;
end loop;
Prob_Of_Game_Of_Length(21) :=
Prob_Of_Game_Of_Length(21) - Prob_Of_Game_Of_Length(I);
end loop;
Put_Line ("Length Probability of a game of the given length");
for I in 1 .. 21 loop
if I < 21 then
Put (I, Width => 4);
else
Put ("> 20");
end if;
Put (Prob_Of_Game_Of_Length(I), Fore => 22, Aft => 7, Exp => 0);
New_Line;
end loop;
New_Line;
-- Initialize and print the Prob_Of_Pass_Of_Length array.
-- The probability that a pass will have a length of I is the probability
-- of winning I games in a row times the probability of losing (in the
-- I+1st game).
Prob_Of_Pass_Of_Length(15) := 1.0;
-- Prob_Of_Pass_Of_Length(15) will be reduced by the probability of a pass
-- of each other specific length computed.
for I in 0 .. 14 loop
Prob_Of_Pass_Of_Length(I) := Prob_Of_Win**I * Prob_Of_Loss;
Prob_Of_Pass_Of_Length(15) :=
Prob_Of_Pass_Of_Length(15) - Prob_Of_Pass_Of_Length(I);
end loop;
Put_Line ("Length Probability of a pass of the given length (# of wins)");
for I in 0 .. 15 loop
if I < 15 then
Put (I, Width => 4);
else
Put ("> 14");
end if;
Put (Prob_Of_Pass_Of_Length(I), Fore => 28, Aft => 8, Exp => 0);
New_Line;
end loop;
New_Line;
-- Reset the generator to a time-dependent state, providing a unique
-- sequence of random numbers in each run.
Reset (G);
-- Play the minimum number of games.
Put ("Starting to play ");
Put (Games_Played, Width => 1);
Put_Line (" craps games");
New_Line;
for I in 1 .. Games_Played loop
Play_Crap_Game;
if I mod 50000 = 0 then
Put ("Played ");
Put (I, Width => 7);
Put_Line (" games");
end if;
end loop;
New_Line;
-- Play a few more, if necessary, to complete the current "pass."
while Pass_Length > 0 loop
Play_Crap_Game;
Games_Played := Games_Played + 1;
end loop;
Float_Games := Float(Games_Played);
Float_Passes := Float(Passes_Counted);
-- Initialize expected values needed for the statistics.
Expected_Wins_And_Losses(1) := Prob_Of_Win * Float_Games;
Expected_Wins_And_Losses(2) := Prob_Of_Loss * Float_Games;
for I in Roll_Outcome loop
Expected_Rolls_Of_Kind(I) := Float(Number_Of_Rolls) / 6.0;
end loop;
for I in 1 .. 21 loop
Expected_Games_Of_Length(I) := Prob_Of_Game_Of_Length(I) * Float_Games;
end loop;
for I in 0 .. 15 loop
Expected_Passes_Of_Length(I) := Prob_Of_Pass_Of_Length(I) * Float_Passes;
end loop;
-- Now compute and print the statistics.
Put ("Number of rolls of a single die: ");
Put (Number_Of_Rolls, Width => 1);
New_Line;
Put ("Number of craps games played: ");
Put (Games_Played, Width => 1);
New_Line;
Put ("Number of passes counted: ");
Put (Passes_Counted, Width => 1);
New_Line (2);
Print_Chi_Square(Expected_Wins_And_Losses,
Observed_Wins_And_Losses,
"wins and losses",
"Outcome");
Print_Chi_Square(Expected_Rolls_Of_Kind,
Observed_Rolls_Of_Kind,
"dice rolls",
"Roll");
Print_Chi_Square(Expected_Games_Of_Length,
Observed_Games_Of_Length,
"game lengths",
"Length");
Print_Chi_Square(Expected_Passes_Of_Length,
Observed_Passes_Of_Length,
"pass lengths",
"Length");
if Passes_All_Tests then
Put_Line ("The generator PASSED all the tests");
else
Put_Line ("The generator FAILED one or more tests");
end if;
end Craps_Tests;
One possible improvement worth making in the code is this: it should print the
image of the (random) initial state, so that the run can be repeated (perhaps
after modifying the generator) if it turns up anomalies. Thus, the program
should also have the capability of resetting the generator initially from a
user-supplied image of a state.