!topic LSN-1055 on Evolution of the Random Number Generator !key LSN-1055 on Evolution of the Random Number Generator !reference RM9X-C.5.2;4.0 !from Ken Dritz $Date: 94/01/31 12:14:43 $ $Revision: 1.4 $ !discussion I try in this LSN to reflect the current state of the discussions on the random number generator. I summarize in section 1 the current specification, accounting for the evolutionary improvements that seem to have been agreed to (or at least suggested and then not refuted by anyone, including me). In section 2, I show two example implementations. Then I discuss in section 3 the possible alternatives to the specification in section 1, regarding the range of Random and the presence or absence of an integer generator. Finally, I present, in section 4, a test program for the random number generator, embodying the craps tests. 1. Evolution of the Packaging of RNG Functionality ========= == === ========= == === ============= In RM9X;4.0, the declaration of Ada.Numerics.Random_Numbers was given as follows: package Ada.Numerics.Random_Numbers is type Generator is limited private; Number_Of_Seed_Components : constant := ; type Seed is array (1 .. Number_Of_Seed_Components) of Integer; Seed_Error : exception; subtype Uniformly_Distributed is Float range 0.0 .. 1.0; function Random (Gen : Generator) return Uniformly_Distributed; procedure Get_Seed (Gen : in Generator; Value : out Seed); procedure Set_Seed (Gen : in Generator; Value : in Seed); procedure Reset (Gen : in Generator; Initiator : in Integer); procedure Reset (Gen : in Generator); private -- implementation defined end Ada.Numerics.Random_Numbers; This package declaration was, of course, supplemented by various semantic rules, one of which was that Random "never delivers a result of 1.0." The following evolutionary improvements in the packaging of this package's functionality have been discussed, and I plan to make them: 1. The Seed type has become a private type and is now named State; the declaration of Number_Of_Seed_Components is, of course, omitted (93-3331.a). Furthermore, the State type is defined in a child package, called Ada.Numerics.Random_Numbers.Extensions (93-3331.a, 93-3343.a), which contains features relevant to advanced applications of the RNG (i.e., those with requirements that go beyond what Fortran supports). The child package is defined in the Numerics Annex (93-3334.a). 2. The names of the procedures Get_Seed and Set_Seed have been changed to Save and (another overloading of) Reset (93-3331.a, 93-3332.a, 93-3334.a), and the procedures have been moved to the child package (93-3343.a). Reset is the inverse of Save (93-3310,a). These procedures serve only to save (i.e., checkpoint) the current generator state and to reset a generator to a previously saved state, or to a state constructed by some other means (and which is checked for validity when it is constructed). Thus, the new overloading of Reset does not have to validate its state parameter. The declaration of Seed_Error is, of course, also omitted. 3. In addition to State, Save, and the new overloading of Reset, the child package must define at least the following entities: a. An Image function that maps each value of type State to a unique (but implementation-defined) value of type String (93-3331.a). b. A named number, called Max_Image_Width, that bounds the length of the string returned by any invocation of Image. c. A Value function such that Value(Image(A_State)) = A_State and Value of any string that is not the image of some generator state (after removing leading and trailing blanks) raises Constraint_Error. (93-3331.a suggested that a unique exception defined in the child package, analogous to Seed_Error, be raised. 93-3343.a gave that exception the name State_Error, but it also suggested that raising Numerics.Argument_Error might be appropriate. I now think that raising Constraint_Error is the most appropriate.) The child package may define other (implementation-defined) entities, such as a procedure to change the alterable parameters (if any) of a generator or an alternative to Value that might be more convenient to use for the purpose of constructing an arbitrary, validated, user-specified state (93-3331.a). (The primary purpose of Value is to allow one to enter data denoting a state, perhaps interactively, after having made a written note of such data, for example during interactive debugging, when it might have been obtained using the Image function.) I present below a rewrite of subclause C.5.2 that incorporates the foregoing changes and that takes a default position with respect to the range of Random and the presence or absence of an integer generator. (Alternatives that might be selected instead are discussed in section 3.) The default position--i.e. my current recommendation--is that an integer generator SHOULD be included, but that the range of Random (the floating point generator) should NOT be reduced to less than the full range of its return subtype. Thus, Random can conceivably return 0.0 (this does not represent a change) or 1.0 (this does), as well as a value different from 1.0 but sufficiently close to it to behave like 1.0 in some contexts. The integer generator is called Random_Integer; it is not generic, but takes two integer parameters, Low and High (with High >= Low), and returns a value of type Integer in the range Low .. High. It takes a parameter of the same generator type as Random. The reasons for these recommendations are given in section 3. C.5.2 Random Number Generation Basic facilities for the generation of pseudo-random numbers are provided by the package Numerics.Random_Numbers. These facilities include a limited private type each of whose objects serves as the generator of a (possibly distinct) sequence of random numbers; a pair of functions to obtain the ``next'' random number from a given sequence of random numbers (that is, from its generator)--one of them returning a floating point number in the range 0.0 .. 1.0 and the other returning an integer in a range specified by the user; and procedures to initialize or reinitialize a given generator. Additional facilities for advanced applications are provided, in implementations supporting the Numerics Annex, by a child package (see K.?.?). Static Semantics The library package Numerics.Random_Numbers has the following declaration: package Ada.Numerics.Random_Numbers is type Generator is limited private; subtype Uniformly_Distributed is Float range 0.0 .. 1.0; function Random (Gen : Generator) return Uniformly_Distributed; function Random_Integer (Gen : Generator; Low, High : Integer) return Integer; procedure Reset (Gen : in Generator; Initiator : in Integer); procedure Reset (Gen : in Generator); private -- implementation defined end Ada.Numerics.Random_Numbers; An object of the limited private type Generator is associated with a sequence of random numbers. Each generator has a hidden state, which it uses to determine the current position in its associated sequence. All generators are implicitly initialized to the same implementation-defined state; they may also be explicitly initialized, or reinitialized, to a time-dependent state or to a state uniquely denoted by an integer value. The operations on generators affect the state and therefore the future values of the associated sequence. The semantics of these operations are defined below. function Random (Gen : Generator) return Uniformly_Distributed; function Random_Integer (Gen : Generator; Low, High : Integer) return Integer; Obtains the ``next'' random number from the sequence associated with the given generator, relative to its current state, according to an implementation-defined algorithm. The result of Random is delivered as a value of the subtype Uniformly_Distributed, which is a subtype of the predefined type Float having a range of 0.0 .. 1.0. The result of Random_Integer is delivered as a value of the predefined type Integer in the range Low .. High; Constraint_Error is raised if this is a null range. A sufficiently long sequence of random numbers obtained by successive calls to Random or Random_Integer is approximately uniformly distributed over its range. procedure Reset (Gen : in Generator; Initiator : in Integer); procedure Reset (Gen : in Generator); Sets the state of the specified generator to a state which is an implementation-dependent function of the value of the parameter Initiator (or to a time-dependent state, if the parameter Initiator is not specified). Implementation Requirements Performance requirements for the random number generator, which apply only in implementations conforming to the Numerics Annex, and then only in the ``strict'' mode defined there (see K.2), are given in K.2.5. Documentation Requirements No one algorithm for random number generation is best for all applications. To enable the user to determine the suitability of the random number generator for the intended application, the implementation has to describe the algorithm used and has to give its period, if known exactly. or a lower bound on the period, if the exact period is unknown. The implementation also has to document the minimum time interval between calls to Reset (without an Initiator parameter) which are guaranteed to initiate different sequences. Implementation Permissions Examples of permissible algorithms for random number generation are: o instances of classical linear congruential generators; o combination generators (such as those of Wichmann and Hill or L'Ecuyer); o instances of the phenomenally long-period ``subtract-with-borrow'' or ``add-with-carry'' Fibonacci generators of Marsaglia and Zaman. Implementation Advice Any storage associated with an object of type Generator should be reclaimed on exit from the scope of the object. If the generator period is sufficiently long in relation to the number of distinct Initiator values, then each possible value of Initiator passed to Reset should initiate a sequence of random numbers that does not, in a practical sense, overlap the sequence initiated by any other value. If this is not possible, then the mapping between Initiator values and generator states should be a rapidly varying function of the Initiator value. NOTES If two or more tasks are to share the same generator, then the tasks have to synchronize their access to the generator as for any shared variable (see 9.10). Within a given implementation, a repeatable random number sequence can be obtained by relying on the implicit initialization of generators or by explicitly initializing a generator with a repeatable initiator value. Different sequences of random numbers can be obtained from a given generator in different program executions by explicitly initializing the generator without an initiator value. Whether Random can actually deliver any particular value in its range is implementation dependent. Applications should assume that the bounds of the range, or values sufficiently close to them to behave indistinguishably from them, can occur. Exponentially distributed (floating point) random numbers with mean and standard deviation 1.0 can be obtained by evaluating either the expression -Log(Random(G)) or the expression -Log(1.0 - Random(G)). Note that Log raises Numerics.Argument_Error when its parameter is zero, so the expression should be computed as the return expression of a function that has a handler for Argument_Error; the handler should return Float'Last, or some other large value appropriate to the application. The value 1.0 potentially poses problems for the conversion of the output of Random into some integer range; when random integers are required, Random_Integer should be used directly, because it avoids those potential problems. Supplementing the above material will be new material, in the Numerics Annex, defining the child package. It would be appropriate to include that material as clause K.2 and to renumber the current clause K.2 and those following as K.3 and higher. The new clause might read as follows. K.2 Advanced Features of the Random Number Generator Advanced facilities for the generation of pseudo-random numbers are provided by the package Numerics.Random_Numbers.Extensions. These facilities are concerned with saving and restoring the internal states of generators, either in a direct form for checkpointing and restarting a generator or in a string form (which may be convenient for interactive debugging of an application involving random numbers). Static Semantics The library package Numerics.Random_Numbers.Extensions has the following declaration: package Ada.Numerics.Random_Numbers.Extensions is type State is private; procedure Save (Gen : in Generator; To_State : out State); procedure Reset (Gen : in Generator; From_State : in State); Max_Image_Width : constant := ; function Image (Of_State : State) return String; function Value (Coded_State : String) return State; -- Other operations on generators and states, at the implementor's -- discretion private -- implementation defined end Ada.Numerics.Random_Numbers.Extensions; A value of the private type State represents the internal state of a generator. Operations on the State type can be used to obtain, restore, or examine the state of a generator. The semantics of these operations are defined below. procedure Save (Gen : in Generator; To_State : out State); procedure Reset (Gen : in Generator; From_State : in State); Save obtains the current state of a generator. Reset gives a generator the specified state. A generator that is reset to a state previously obtained by invoking Save is restored to the state it had when Save was invoked. function Image (Of_State : State) return String; function Value (Coded_State : String) return State; Image provides a representation of a state coded (in an implementation-defined way) as a string whose length is bounded by the value of Max_Image_Width. Value is the inverse of Image: Value(Image(S)) = S, while taking the Value of a string that is not the image of any state raises Constraint_Error. Implementation Requirements The mapping of states to strings performed by Image shall be one-to-one. Documentation Requirements Implementations shall document the mapping of states to strings and the mapping of strings to states. Implementation Permissions Implementations may add to Numerics.Random_Numbers.Extensions other facilities for manipulating states. A candidate for such a facility is an alternative to Value that constructs a state from data presented in a more convenient form than a string (for example, in the form of an array of integers or floating point numbers). By including parameters of the random number generation algorithm in its state information, an implementation can provide its users with a rudimentary way of altering the algorithm's parameters (e.g., for purposes of experimentation), given only the subprograms defined above. Nevertheless, implementations might wish to provide facilities (in the form of additional subprograms) for more conveniently altering those parameters. On the other hand, implementations may prefer not to offer any facility for altering the parameters of the generation algorithm, since it is not generally possible to guarantee properties of the algorithm given arbitrary parameters. I do not show in this LSN the updates for what is currently clause K.2.5, which discusses the performance requirements for the random number generator. These are straightforward. 2. Example Implementations ======= =============== I promised quite some time ago to post example implementations that meet the performance requirements of the Numerics Annex. Two are included in this section, one for a linear congruential generator and one for a subtract-with-borrow Fibonacci generator. Please note that I developed and tested my original implementations in Ada 83. The Ada 9X versions presented below have not been tested in Ada 9X and could well contain errors related to the shift of environments (as well as to the changes in packaging, which I've tried to accommodate but have not tested). The underlying algorithms are the same, however. Also, Image and Value were added to the implementation only recently and have not been tested at all. The implementation notes below refer only to the behavior of Random. In all cases, I have implemented Random_Integer in terms of Random. This was a late addition, and in fact it is not at the moment fully conforming, since it can raise Constraint_Error if the range Low .. High includes more than Integer'Last values (i.e., if High - Low + 1 exceeds Integer'Last). Though such wide result ranges probably do not occur in practice, it seems undesirable to legislate against them. My implementation uses the clever expression provided by Robert Eachus, but it does have the problem cited here with excessively wide ranges, and I haven't had time to search for a circumvention of that problem. 2.1. Linear Congruential Generator ====== ============ ========= The venerable multiplicative linear congruential generator with a multiplier of 16_807 (7**5), no additive constant, and a modulus of 2_147_483_647 (2**31-1) is presented below. This generator has a period of 2**31-2. It can NOT generate the value 0.0 (an LCG with an additive constant COULD generate 0.0), but (as written) it CAN generate the value 1.0 if 2_147_483_646.0 / 2_147_483_647.0 rounds up to 1.0 when the result is coerced to Float. Additional notes on modifying it to avoid generating 1.0 can be found later in this section. It is customary to perform integer arithmetic in this algorithm, using 32-bit integers, and convert to floating point just before the end. In the customary implementation, the state is an integer between 1 and 2**31-2. A new state is obtained from the current state by integer arithmetic operations (that in general require multiple precision arithmetic). The result is obtained by converting the new state to floating point and dividing by a constant (e.g., one less than the modulus) to normalize the range so that its maximum value is 1.0 (or by the modulus itself, or an even larger value, to give a maximum of less than 1.0). My implementation, like the one in the Boeing Computer Services library (I believe), does not use integer arithmetic and does not represent the state as a 32-bit integer. Instead, the state is a floating point number with a precision of at least 31 bits, so that the integers from 1.0 to 2_147_483_646.0 can all be represented exactly. The arithmetic that is performed to obtain the new state creates a product that is the floating point representation of an integer value requiring at most 46 bits of precision. Thus, customary 64-bit double precision hardware suffices. Resetting the state from an initiator has to allow all Integer values and has to map the initiator value into a state value in the range 1 .. 2**31-2. This is accomplished with an integer mod and an addition. I have chosen to generate and throw away five random numbers after setting the state so that the first random number called for by the user after setting the state with an initiator will not appear highly correlated with the initiator value. Resetting the state without an initiator computes a state using the current value of the system clock. The values obtained by Splitting the value returned by Clock are used in an appropriate mixed-radix computation to yield a state value between 26_000_700 and 1_873_119_999. This version of Reset yields different states for two calls spaced at least a second and not more than fifty years apart. The realization offered here ASSUMES that the type Float corresponds to double precision, as it does in Verdix implementations of Ada 83 (at least the ones I've used). This realization is the simplest. It uses the type Integer (ASSUMED to be at least 32 bits, again OK in Verdix implementations) for an intermediate calculation. The conversion to and from Integer is only incidental, not required; notes provided later discuss how it can be removed. One can imagine how the realization shown below can be made more portable, if desired, by removing the assumptions on Float and Integer, by declaring types with the required ranges and precisions, and by including the appropriate explicit conversions among these types. Such a realization would only require that the implementation HAVE predefined types with the required precision and ranges, and would not require that they be named Float and Integer. I won't amplify further on this here. Here is a realization of Numerics.Random_Numbers incorporating a linear congruential generator: with Ada.Finalization; use Ada.Finalization; package Ada.Numerics.Random_Numbers is type Generator is limited private; subtype Uniformly_Distributed is Float range 0.0 .. 1.0; function Random (Gen : Generator) return Uniformly_Distributed; function Random_Integer (Gen : Generator; Low, High : Integer) return Integer; procedure Reset (Gen : in Generator; Initiator : in Integer); procedure Reset (Gen : in Generator); private type Access_State is access Float; Initial_State : Float := 30_000_000.0; -- Fixed but arbitrary. type Generator is new Limited_Controlled with record State : Access_State := new Float'(Initial_State); end record; procedure Finalize (Gen : in out Generator); Modulus : constant := 2_147_483_647.0; -- 2**31-1 Modulus_M1 : constant := Modulus - 1.0; -- 2**31-2 -- These two named numbers are in the private part only because of the -- desire to use Modulus_M1 in the child package -- Ada.Numerics.Random_Numbers.Extensions, rather than repeat its value -- there. end Ada.Numerics.Random_Numbers; with Calendar; use Calendar; with Unchecked_Deallocation; package body Ada.Numerics.Random_Numbers is Multiplier : constant := 16_807.0: -- 7**5 Int_Mod_M1 : constant Integer := Integer(Modulus_M1); procedure Destroy_State is new Unchecked_Deallocation (Float, Access_State); function Random (Gen : Generator) return Uniformly_Distributed is T, U : Float; I : Integer; begin T := Gen.State.all * Multiplier; -- an integer of at most 46 bits I := Integer(T / Modulus); -- integer part of T / Modulus; may be -- too big by 1 (i.e., if rounded up) U := T - Float(I) * Modulus; -- T mod Modulus; may be too small by -- Modulus (i.e., if I is too big by 1), -- in which case it will be negative if U < 0.0 then U := U + Modulus; -- now T mod Modulus end if; Gen.State.all := U; return U / Modulus; end Random; function Random_Integer (Gen : Generator; Low, High : Integer) return Integer is Spread : Positive := High - Low + 1; -- Above propagates Constraint_Error if it overflows or if -- Low > High. begin return Low + Integer(Float(Spread) * Random(G)) mod Spread; end Random_Integer; procedure Reset (Gen : in Generator; Initiator : in Integer) is T : Float; begin Gen.State.all := Float(Initiator mod Int_Mod_M1 + 1); for J in 1 .. 5 loop T := Random(Gen); end loop; end Reset; procedure Reset (Gen : in Generator) is Yr : Year_Number; Mo : Month_Number; Dy : Day_Number; Se : Day_Duration; S : Natural range 0 .. 86_400; Sec : Natural range 0 .. 59; Min : Natural range 0 .. 59; Hr : Natural range 0 .. 23; T : Natural; begin Split(Clock, Yr, Mo, Dy, Se); S := Natural(Se); Sec := S mod 60; S := S / 60; Min := S mod 60; Hr := S / 60; T := ((((Sec*60 + Min)*24 + Hr)*32 + Dy)*13 + Mo)*50 + (Yr mod 50) + 26_000_000; Gen.State.all := Float(T); end Reset; procedure Finalize (Gen : Generator) is begin Destroy_State (Gen.State); end Finalize; end Ada.Numerics.Random_Numbers; The integer calculations can be omitted from Random in several ways. One can write its body as follows: T := Gen.State.all * Multiplier; U := T - Float'Floor(T / Modulus) * Modulus; if U < 0.0 then U := U + Modulus; end if; Gen.State.all := U; return U / Modulus; Unfortunately, the if-test on U is still required because T / Modulus can conceivably round up to an integer before its Floor is taken. (Some heavy numerical analysis might prove that this can't happen, however.) Alternatively, one can write, of course, U := Float'Remainder(Gen.State.all * Multiplier, Modulus); if U < 0.0 then U := U + Modulus; end if; Gen.State.all := U; return U / Modulus; but unless Float'Remainder is computed in hardware, this is unlikely to be as efficient as the previous alternatives. (The if-test in this case converts the remainder, which may be negative, to a mod. Note that the remainder cannot be zero.) One can guarantee that the value 1.0 will never be returned, and that the maximum return value will be any desired value less than 1.0, by appropriately increasing the divisor in the return statement of Random. In a portable implementation, that divisor can even be determined empirically with an appropriate computation in the statement sequence of Numerics.Random_Numbers. One could, if one chose, substitute Modulus_M1 for Modulus as the divisor in the return statement, which would GUARANTEE that 1.0 would be returned if sufficiently many calls are made on the Random. None of these perturbations of the code introduce any significant bias in the results, at least as measured by the craps tests. Here is a realization of the part of Numerics.Random_Numbers.Extensions that is required of all implementations supporting the Numerics Annex. The State type is just a copy of Float, so states can be saved and restored by simple assignments (with no-op explicit conversions). Since state values are floating point representations of integers, I have chosen to represent the image of a state as an integer image. This isn't necessary, of course, but it may be more convenient for the user, and it is certainly easy to implement. package Ada.Numerics.Random_Numbers.Extensions is type State is private; procedure Save (Gen : in Generator; To_State : out State); procedure Reset (Gen : in Generator; From_State : in State); Max_Image_Width : constant := Integer'Width; function Image (Of_State : State) return String; function Value (Coded_State : String) return State; private type State is new Float; end Ada.Numerics.Random_Numbers.Extensions; package body Ada.Numerics.Random_Numbers.Extensions is procedure Save (Gen : in Generator; To_State : out State) is begin To_State := State(Gen.State.all); end Save; procedure Reset (Gen : in Generator; From_State : in State) is begin Gen.State.all := Float(From_State); end Reset; function Image (Of_State : State) return String is begin return Integer'Image(Integer(Of_State)); end Image; function Value (Coded_State : String) return State is T : State range 1.0 .. Modulus_M1; begin T := State(Integer'Value(Coded_State)); -- Propagate Constraint_Error if raised by Integer'Value or by -- the constraint check in the assignment to T. The latter occurs -- when the integer in the Coded_State is outside the allowed -- range (min = 1; max = 2**31-2). return T; end Value; end Ada.Numerics.Random_Numbers.Extensions; 2.2. Subtract-With-Borrow Fibonacci Generator ==================== ========= ========= A subtract-with-borrow Fibonacci generator is realized below. The particular instance of this type of generator that I have chosen is one whose state is a vector of 25 floating point numbers (in fact, the 25 previous outputs) and a "borrow" value, which is also a floating point number. Each of the 25 numbers in the vector is a scaled integer, an integer in the range 0.0 .. 2.0**24-1.0 scaled down by (i.e., divided by) 2.0**24, as is the borrow value. The maximum output value is therefore just under 1.0; it is 1.0-2.0**(-24). Zero is a possible output value. The borrow value is always either 0.0 or 2.0**(-24). These numbers are all exactly representable in a floating point type with at least 24 bits of precision. The "next" output value is obtained by subtracting the 11th previous output value and the borrow value from the 25th previous output value, adding 1.0 if the result is negative. (Thus, this generator has "lags" of 25 and 11.) The result is again in the range 0.0 .. 1.0-2.0**(-24). The vector of 25 previous outputs is managed as a cyclical list, with a pair of indices identifying the 11th and 25th previous outputs. The new output value replaces the 25th (i.e., the oldest) previous output value in the list, becoming the youngest, and the two indices are adjusted (they are decremented, wrapping around when they reach the beginning of the list; modular arithmetic is used). A new borrow value is also computed; it is 2.0**(-24) if 1.0 was added in obtaining the new output value, and 0.0 otherwise. Different instances of this kind of generator have different lags and/or vectors of different sizes. There are also analogous add-with-carry Fibonacci generators. The realization shown here ASSUMES, once again, that Float has adequate precision. In this case, even single precision suffices (on IEEE machines). Note that neither the output value nor any intermediate result leading to it is ever held in Integer form. There is no conversion between Float and Integer at any point, and there are no multiplications or divisions. This generator is significantly faster than the linear congruential generator. Provided that the generator is not started from a state in which all the components of the vector, as well as the borrow value, are zero, or all the components of the vector are 1.0-2.0**(-24), and the borrow value is 2.0**(-24), both of which reproduce the same output indefinitely, the generator has a period of about 10**178. (Compare this to the period of the linear congruential generator given previously.) It is permissible to initialize all 25 vector components to the same value, and the borrow value arbitrarily, but that is unwise (the first generated output will be either 0.0 or 1.0-2.0**(-24), depending on the initial borrow value; this is not very random). Note that the initial values have to be among the set of permissible floating point values (the set of scaled integers described previously). One could write out the initial value as a suitably "random looking" static aggregate, but what I do is to compute an initial value (in the statement sequence of the package body), which requires that the body of Numerics.Random_Numbers be elaborated before any unit that uses the random number generator; this is achieved by including an Elaborate_Body pragma in the spec. The 25 vector components are built up bit by bit, randomly, using an internal linear congruential generator (based on the same algorithm as the one shown previously); the initial borrow is just set arbitrarily to zero. Here is a realization of Ada.Numerics.Random_Numbers incorporating the Fibonacci generator described above. with Ada.Finalization; use Ada.Finalization; package Ada.Numerics.Random_Numbers is pragma Elaborate_Body; type Generator is limited private; subtype Uniformly_Distributed is Float range 0.0 .. 1.0; function Random (Gen : Generator) return Uniformly_Distributed; function Random_Integer (Gen : Generator; Low, High : Integer) return Integer; procedure Reset (Gen : in Generator; Initiator : in Integer); procedure Reset (Gen : in Generator); private Larger_Lag : constant := 25; Smaller_Lag : constant := 11; type Lag_Range is mod Larger_Lag; type State_Vector is array (Lag_Range) of Float; type Internal_State is record Lagged_Outputs : State_Vector; Borrow : Float; R, S : Lag_Range; end record; type Access_State is access Internal_State; Initial_State : Internal_State; type Generator is new Limited_Controlled with record State : Access_State := new Internal_State'(Initial_State); end record; procedure Finalize (Gen : in out Generator); end Ada.Numerics.Random_Numbers; with Calendar; use Calendar; with Unchecked_Deallocation; package body Ada.Numerics.Random_Numbers is procedure Destroy_State is new Unchecked_Deallocation (Internal_State, Access_State); -- The following function is used in this implementation to produce -- a valid internal state for the Fibonacci generator based on an -- integer that is a valid internal state for a linear congruential -- generator. It uses the latter to generate random bits with which -- to initialize the state vector. function Make_Internal_State (Starter : Integer) return Internal_State is Bit_Value : Float; T : State_Vector; LCG_State : Float; LCG_Multiplier : constant := 16_807.0; LCG_Modulus : constant := 2_147_483_647.0; function LCG_Random return Uniformly_Distributed is T : Float; I : Integer; begin T := LCG_State * LCG_Multiplier; I := Integer(T / LCG_Modulus); LCG_State := T - Float(I) * LCG_Modulus; if LCG_State < 0.0 then LCG_State := LCG_State + LCG_Modulus; end if; return LCG_State / LCG_Modulus; end LCG_Random; begin LCG_State := Starter; for I in Lag_Range loop State_Vector(I) := 0.0; Bit_Value := 1.0; for J in 1 .. 24 loop Bit_Value := Bit_Value * 0.5; if LCG_Random >= 0.5 then State_Vector(I) := State_Vector(I) + Bit_Value; end if; end loop; end loop; return (Lagged_Outputs => State_Vector, Borrow => 0.0, -- arbitrary R => Larger_Lag - 1, S => Smaller_Lag - 1); end Make_Internal_State; function Random (Gen : Generator) return Uniformly_Distributed is U : Float; begin U := Gen.State.Lagged_Outputs(Gen.State.R) - Gen.State.Lagged_Outputs(Gen.State.S) - Gen.State.Borrow; if U < 0.0 then U := U + 1.0; Gen.State.Borrow := 2#1.0#e-24; else Gen.State.Borrow := 0.0; end if; Gen.State.Lagged_Outputs(Gen.State.R) := U; Gen.State.R := Gen.State.R - 1; Gen.State.S := Gen.State.S - 1; return U; end Random; function Random_Integer (Gen : Generator; Low, High : Integer) return Integer is Spread : Positive := High - Low + 1; -- Above propagates Constraint_Error if it overflows or if -- Low > High. begin return Low + Integer(Float(Spread) * Random(G)) mod Spread; end Random_Integer; procedure Reset (Gen : in Generator; Initiator : in Integer) is begin Gen.State.all := Make_Internal_State(Initiator mod Int_Mod_M1 + 1); end Reset; procedure Reset (Gen : in Generator) is Yr : Year_Number; Mo : Month_Number; Dy : Day_Number; Se : Day_Duration; S : Natural range 0 .. 86_400; Sec : Natural range 0 .. 59; Min : Natural range 0 .. 59; Hr : Natural range 0 .. 23; T : Natural; begin Split(Clock, Yr, Mo, Dy, Se); S := Natural(Se); Sec := S mod 60; S := S / 60; Min := S mod 60; Hr := S / 60; T := ((((Sec*60 + Min)*24 + Hr)*32 + Dy)*13 + Mo)*50 + (Yr mod 50) + 26_000_000; Gen.State.all := Make_Internal_State(T); end Reset; procedure Finalize (Gen : in out Generator) is begin Destroy_State (Gen.State); end Finalize; begin Initial_State := Make_Internal_State(30_000_000); end Ada.Numerics.Random_Numbers; Here is a realization of the part of Numerics.Random_Numbers.Extensions that is required of all implementations supporting the Numerics Annex. The State type is just a copy of Internal_State, so states can be saved and restored by simple assignments (with no-op explicit conversions). The image of a state is the concatenation of several components separated by commas. In order from left to right, the components represent the oldest to the youngest components of the state vector, followed by the borrow value. Since the components of a state vector are scaled integers, as is a borrow value, I have chosen to represent each component as an integer image. Because of the ordering of state components in the image of a state, the R and S components of a state do not need to be recorded in the image of the state. Note that the oldest component of the state vector is the one whose index is R; younger components have "smaller" indices, which wrap around after reaching zero, so that the youngest component has index R+1. Ada.Strings.Bounded is used in the manipulation of state images. package Ada.Numerics.Random_Numbers.Extensions is type State is private; procedure Save (Gen : in Generator; To_State : out State); procedure Reset (Gen : in Generator; From_State : in State); Max_Image_Width : constant := (Larger_Lag + 1)*(Integer'Width + 1) - 1; function Image (Of_State : State) return String; function Value (Coded_State : String) return State; private type State is new Internal_State; end Ada.Numerics.Random_Numbers.Extensions; with Ada.Strings.Bounded; use Ada.Strings.Bounded; package body Ada.Numerics.Random_Numbers.Extensions is package Bounded_Length_Strings is new Generic_Bounded_Length (Max_Image_Width + 1); use Bounded_Length_Strings; procedure Save (Gen : in Generator; To_State : out State) is begin To_State := State(Gen.State.all); end Save; procedure Reset (Gen : in Generator; From_State : in State) is begin Gen.State.all := Internal_State(From_State); end Reset; function Image (Of_State : State) return String is Result : Bounded_String; function Encode (Value : Float) return String is begin return Integer'Image(Integer(2#1.0#e24 * Value)); end Encode; begin Result := Null_Bounded_String; for I in Lag_Range loop Result := Result & Encode(Of_State.Lagged_Outputs(Of_State.R - I)) & ','; end loop; Result := Result & Encode(Of_State.Borrow); return To_String(Result); end Image; function Value (Coded_State : String) return State is use Ada.Strings; Result : State; T : Bounded_String; procedure Decode_Component (Max : in Natural; Component : out Float) is J : Natural; K : Natural range 0 .. Max; begin J := Index(T, ','); if J = 0 then -- Coded_State has too few commas raise Constraint_Error; -- to be a valid state. end if; K := Integer'Value(Slice(T, 1, J-1)); -- Propagate Constraint_Error if raised by Integer'Value or by the -- constraint check in the assignment to K. The latter occurs -- when the integer in a component of the Coded_State is outside -- the allowed range (min = 0; max = 2e24-1 for a component of the -- state vector; max = 1 for the borrow component). Component := Float(K) * 2#1.0#e-24; Delete(T, 1, J); end Decode_Component; begin begin T := To_Bounded_String(Trim(Coded_State) & ','); exception when Length_Error => -- The trimmed Coded_State is too long raise Constraint_Error; -- to be a valid state. end; for I in reverse Lag_Range loop Decode_Component(2e24-1, Result.Lagged_Outputs(I)); end loop; -- If Coded_State is well formed, exactly one comma remains in T -- (the one tacked on at the end when T was initialized). Decode_Component(1, Result.Borrow); -- If Coded_State is well formed, the above consumes the rest of T. if T /= Null_Bounded_String then -- Coded_State has too many commas raise Constraint_Error; -- to be a valid state. end if; Result.R := Larger_Lag - 1; Result.S := Smaller_Lag - 1; return Result; end Value; end Ada.Numerics.Random_Numbers.Extensions; 3. Discussion of Major Alternatives ========== == ===== ============ I begin this section by discussing the rationale for the default position. It is widely accepted that the majority of applications of random numbers will, in fact, want random integers. An integer generator (Random_Integer) is provided not because it is so difficult to provide this functionality otherwise (i.e., if one only has the floating point generator, Random) but because of the desire to make it extremely convenient for the user to access this functionality. The significant majority of the comments on ada9x-mrt about this issue have favored the inclusion of an integer generator. The issue of the range of the floating point generator, Random--in particular, whether it should exclude 1.0 (and, if so, by how much)--is mostly a separate issue; however, the presence or absence of an integer generator does have a bearing on the importance of the issue. I had originally sought uniformity with Fortran's informal statement that the random number generator never delivers 1.0, but there have been repeated suggestions that that alone is not good enough. Some numbers close to 1.0 might behave in some contexts just like 1.0; indeed, a number slightly under 1.0 might be delivered as a value in an extended register, which subsequently rounds up to 1.0 if it has to be shortened to the storage format. I corresponded with Stuart Anderson of Boeing, author of survey articles on random number generation, as early as November 1992 about the conflict between the desire to avoid 1.0 and the compromises that are required to do so. I have discussed on ada9x-mrt the techniques that can be used to guarantee that the result remains less than 1.0 (and behaves like a value less than 1.0), but they involve a varying degree of compromise. In the case of a linear congruential generator, there is no cost in time, because one just replaces the divisor in the final step with a slightly larger divisor; however, the Fibonacci generator normally performs no such division, so including one (if it is required) represents a cost overhead. There have also been concerns that avoiding the vicinity of 1.0 might bias the distribution somewhat, but I believe that to be unobservable in practice. Then there were discussions about how the vicinity of 1.0 should be avoided (if it is agreed that it should be): by reducing the upper bound of Uniformly_Distributed, or by English phrases in the description of Random? The most recent suggestion prior to this LSN was to state that Random stayed sufficiently below 1.0 for certain expressions to "work" reliably; one of these was involved with converting between floating point random numbers and random integers, while the other was involved with obtaining exponentially distributed random numbers. Indeed, the desire to stay away from 1.0 was motivated largely by the problems it posed for converting to random integers. Some of the techniques that one can dream up do indeed fail if Random returns 1.0 or a number "sufficiently close" to 1.0. But now that the user does not have to do that conversion, the problem of returning 1.0 seems less important than before, in relation to the compromises inherent in solving the problem. The semantic description of Random therefore makes no statement about whether Random can return 1.0, and, since 1.0 is in the range of the return subtype, the reader must conclude that 1.0 is a possible result (same for 0.0). This fact is underscored in the Notes at the end of subclause C.5.2, which goes a step further to alert the user to the problems that one or the other of the extreme values poses for transforming the uniform distribution into an exponential distribution. The user is given the necessary information to anticipate problems, with hints for their solution, while at the same time the frequency of the potential problems is greatly reduced by the provision of Random_Integer. This position regarding the range of Random allows a wider range of natural implementations. I now discuss possible alternatives to the default position. As I said above, the issue of the range of Random can be treated separately from that of an integer generator. While retaining Random_Integer, one can still adopt an alternative with regard to the range of Random. Among the viable alternatives that we might choose are the following (I do not repeat all their advantages and disadvantes here, since they have been cited often enough; I merely list a menu of choices, with an occasional comment): o Keep the range as it is now (the endpoints, 0.0 and 1.0, are possible return values). o Reduce the upper bound of Uniformly_Distributed to 1.0 - Float'Model_Epsilon. (This "looks awkward.") o Leave Uniformly_Distributed alone, and go back to the Fortran rule ("never returns 1.0"). (Is this really any help?) Note that the provision of Random_Integer eliminates the motivation for my previous position (that is, that Random returns a value sufficiently less than 1.0 for certain conversion strategies to work correctly). However, even if Random_Integer should be taken back out, there is no need to reconsider the alternative represented by my previous position, because we now have a simple, portable, reliable conversion technique (the one I actually used in the implementations of Random_Integer in sections 2.1 and 2.2, credited to Robert Eachus) that does not depend on Random's avoidance of 1.0. The provision of Random_Integer is certainly not without its compromises and warts. We might still want to consider whether it is worth having, or, if it is agreed that it is, whether it should have a different form and/or different semantics. I do not yet know how to make an implementation of Random_Integer general enough to accommodate any integer range, no matter how wide (for example, the range Integer'First .. Integer'Last). There is no difficulty with its implementation when the range is more modest, as I discussed and demonstrated above. This problem hints at a broader issue, that of the weakness inherent in assuming that Random_Integer should be implemented in terms of Random. I return to this subject later. One can argue that providing Random_Integer as a nongeneric function is not very Ada-like. It would certainly be more in the spirit of Ada to make Random_Integer generic with respect to the integer subtype of its result range. This would make it unnecessary to pass the bounds on every call of the function. One could go further and allow any discrete subtype (the function should be named Random_Discrete in that case), so that one can conveniently obtain random characters, random Boolean values, etc. I have avoided doing this in the default recommendation only because of a strong desire to keep the entire random number facility simple enough for the Fortran convert to understand how to use it without having to learn new language concepts (like generic instantiation). We might still decide that holding on to that goal imposes too heavy a compromise on the "Ada-ness" of this part of the language. When there was only Random, one could view objects of type Generator in terms of a high-quality abstraction: they were associated with a sequence of random numbers and had a state, which effectively captured the current position in that sequence. Random was the operation that advanced the state and delivered the next number from the sequence; other operations saved and restored states, etc. This notion is weakened somewhat by the inclusion of Random_Integer, at least if it shares the Generator type with Random. Is the sequence of random numbers a floating point sequence or an integer sequence? One has to say that it is both at the same time, and the two sequences correspond; one can ask either for the floating point form of the next value in the sequence, or the corresponding integer. But what integer corresponds to a given floating point member of the sequence? That in turn depends on the range specified at the time it is requested. It gets even muddier if one generalizes a generic Random_Integer into Random_Discrete. I believe that the quality of the abstraction can be upheld if the generator type is made a product of the generic instantiation, along with all the operations on the type. This clearly escalates the complexity, however, which is why I did not venture this far. The form in which Random_Integer has been conceived has been recommended by several people, including some on the MRT. The problem with the weakening of the abstraction is even evident in the confusion between viewing the type Generator as embodying the actual generator (in an OO-like fashion) and then referring to Random_Integer as an "integer generator," as I have done repeatedly in this LSN (but fortunately not in the citations from the RM). It is not too hard to imagine that an integer generator should be implemented very differently from a floating point generator, suggesting that they should indeed have distinct generator types (and distinct state types in the child package). One might not want to use a floating point internal representation for the former. Distinctly different implementation strategies would probably ameliorate the problem with excessively wide integer ranges that I discussed above. Summarizing, here is a fairly complete menu of alternatives with respect to the integer generator: o Keep it just as is. o Add a restriction on the width of the range that eases the burden on the implementor and justifies the implementation I showed. (This would be a wart on the language, however.) o Make Random_Integer generic on an integer subtype. o Make it generic on a discrete subtype and call it Random_Discrete. (Though I haven't said so previously, it would probably be a good idea to change Random to Random_Real or Random_Float in that case.) o If made generic, make the generic unit a generic package that defines the generator type and all the operations of the type. o Abandon Random_Integer and just have Random. In view of Robert's clever expression, this need not be a reason to go back to a position in which Random avoids returning 1.0. However, in this case, I would expand the last Note at the end of subclause C.5.2 so that it reads as follows: Whether Random can actually deliver any particular value in its range is implementation dependent. Applications should assume that the bounds of the range, or values sufficiently close to them to behave indistinguishably from them, can occur. Exponentially distributed (floating point) random numbers with mean and standard deviation 1.0 can be obtained by evaluating either the expression -Log(Random(G)) or the expression -Log(1.0 - Random(G)). Note that Log raises Numerics.Argument_Error when its parameter is zero, so the expression should be computed as the return expression of a function that has a handler for Argument_Error; the handler should return Float'Last, or some other large value appropriate to the application. The potential problems that the value 1.0 poses for the conversion of the output of Random into the integer range 0 .. N-1, for N > 0, can be avoided by using the expression Integer(Float(N) * Random(G)) mod N to perform this transformation; this expression works reliably even when Random delivers 1.0. However, I would say no more about "wide" integer ranges. The user who tries to express a very wide range by subtracting a term from that expression to shift both bounds left, and then returning the upper bound to its former position by increasing N by the same value, would risk producing a subexpression that overflows, but at least the cause of the overflow would be patently obvious in the user's program (rather than hidden away in a subprogram in a library). In any case, I should stress again that I do not think such wide ranges are common in practice, so leaving the problem unaddressed is probably realistic. 4. A Test Program that Implements the Craps Tests = ==== ======= ==== ========== === ===== ===== In this section I include a complete, working program (promised some time ago) to implement the craps tests described in RM9X-K.2.5;4.0. I wrote it initially several months ago (for Ada 83), but I have revised it to be a little more sophisticated, and of course to include the correct context clause for Ada 9X. This program plays a million craps games, using random numbers to simulate the rolls of a pair of dice, and counts and reports various occurrences of things. Chi-square statistics are computed for the occurrences and compared to tabulated values. Statistics are reported for the distribution of wins and losses, distribution of outcomes of the roll of a single die (equidistribution test), distribution of game lengths, and distribution of pass lengths (number of consecutive games won before a loss). The program generates a different sequence of random numbers each time it is run. Therefore, its behavior is not repeatable. To obtain repeatable behavior, comment out the call to Reset. On a Sequent Symmetry (Intel 80386/387), the program takes about 15 minutes to run with a linear congruential generator and about 11 1/2 minutes to run with a subtract-with-borrow Fibonacci generator. It prints a progress message after every 50,000 games. For experimentation, one can reduce the number of games to, say, 100,000. The tail of the distributions of game lengths and pass lengths become too small to be statistically meaningful with so "few" (!) games played, but the program has been adapted to lump observations in the tail of the distribution to maintain statistical significance. The program expects no input (the number of games to be played really should be an input parameter) and writes its output on standard output. An example of the output for a subtract-with-borrow Fibonacci generator follows: Length Probability of a game of the given length 1 0.3333333 2 0.1882716 3 0.1347737 4 0.0965673 5 0.0692571 6 0.0497177 7 0.0357251 8 0.0256954 9 0.0184993 10 0.0133315 11 0.0096166 12 0.0069437 13 0.0050186 14 0.0036307 15 0.0026292 16 0.0019058 17 0.0013827 18 0.0010041 19 0.0007299 20 0.0005311 > 20 0.0014356 Length Probability of a pass of the given length (# of wins) 0 0.50707071 1 0.24995001 2 0.12320768 3 0.06073267 4 0.02993691 5 0.01475678 6 0.00727405 7 0.00358559 8 0.00176744 9 0.00087122 10 0.00042945 11 0.00021169 12 0.00010435 13 0.00005144 14 0.00002535 > 14 0.00002465 Starting to play 1000000 craps games Played 50000 games Played 100000 games Played 150000 games Played 200000 games Played 250000 games Played 300000 games Played 350000 games Played 400000 games Played 450000 games Played 500000 games Played 550000 games Played 600000 games Played 650000 games Played 700000 games Played 750000 games Played 800000 games Played 850000 games Played 900000 games Played 950000 games Played 1000000 games Number of rolls of a single die: 6763638 Number of craps games played: 1000000 Number of passes counted: 506923 Distribution of wins and losses Outcome Times expected Times observed Win 492929.3 493077 Loss 507070.7 506923 Calculated chi-square for distribution of wins and losses: 0.087 Tabulated value for 1 degree of freedom (alpha=0.05): 3.841 The generator PASSES this test Distribution of dice rolls Roll Times expected Times observed 1 1127273.0 1127746 2 1127273.0 1128403 3 1127273.0 1128030 4 1127273.0 1125209 5 1127273.0 1126770 6 1127273.0 1127480 Calculated chi-square for distribution of dice rolls: 5.881 Tabulated value for 5 degrees of freedom (alpha=0.05): 11.070 The generator PASSES this test Distribution of game lengths Length Times expected Times observed 1 333333.3 332785 2 188271.6 188264 3 134773.7 134725 4 96567.3 95946 5 69257.1 69560 6 49717.7 50193 7 35725.1 35825 8 25695.4 25792 9 18499.3 18505 10 13331.5 13366 11 9616.6 9583 12 6943.7 7068 13 5018.6 5072 14 3630.7 3675 15 2629.2 2546 16 1905.8 1991 17 1382.7 1400 18 1004.1 1022 19 730.9 713 20 531.1 486 > 20 1435.6 1483 Calculated chi-square for distribution of game lengths: 27.735 Tabulated value for 20 degrees of freedom (alpha=0.05): 31.410 The generator PASSES this test Distribution of pass lengths Length Times expected Times observed 0 257045.8 256843 1 126705.4 126665 2 62456.8 62748 3 30786.8 30847 4 15175.7 15111 5 7480.6 7438 6 3687.4 3660 7 1817.6 1808 8 896.0 922 9 441.6 465 10 217.7 205 11 107.3 119 12 52.9 42 13 26.1 25 14 12.9 14 > 14 12.5 11 Calculated chi-square for distribution of pass lengths: 8.997 Tabulated value for 15 degrees of freedom (alpha=0.05): 24.996 The generator PASSES this test The generator PASSED all the tests In other runs, the four calculated chi-square values were as follows (the preceding run is relisted as run 1 below, and the tabulated values are listed in the last row): Run Wins & losses Dice rolls Game lengths Pass lengths 1 0.087 5.881 27.735 8.997 2 0.000 3.982 14.858 13.997 3 0.000 5.745 27.386 19.930 4 0.227 7.285 20.860 10.251 5 1.895 2.814 25.102 27.773 table 3.841 11.070 31.410 24.996 It is evident that run 5 did not pass the pass lengths test. It is in the nature of this kind of statistic that the tabulated chi-square value will be exceeded occasionally. The game-length and pass-length tests are extremely demanding and are capable of revealing subtle long-term correlations in the random numbers (apparently). Here are the results of five runs made with a subtract-with-borrow Fibonacci generator employing a state vector of 24 components and lags of 24 and 10 (another combination reported in the literature to be "good"). This generator fails more often, including once with respect to wins & losses. Run Wins & losses Dice rolls Game lengths Pass lengths 1 1.198 3.494 37.722 18.782 2 0.331 2.101 24.552 31.410 3 0.034 2.028 41.908 20.283 4 0.877 3.097 31.231 22.611 5 4.536 2.677 27.444 20.360 table 3.841 11.070 31.410 24.996 Here are the results of five runs made with a subtract-with-borrow Fibonacci generator employing a state vector of 28 components and lags of 28 and 8 (another combination reported in the literature to be "good"). This generator also has occasional failures. Run Wins & losses Dice rolls Game lengths Pass lengths 1 0.017 3.124 41.214 19.604 2 1.181 4.587 23.858 15.410 3 0.115 3.368 16.370 7.436 4 1.718 1.698 39.358 16.336 6 0.117 3.725 15.407 15.600 table 3.841 11.070 31.410 24.996 The final subtract-with-borrow Fibonacci generator from the literature that I tested has a state vector of 39 components and lags of 39 and 25. Here are its results, which also exhibit occasional failures. Run Wins & losses Dice rolls Game lengths Pass lengths 1 0.031 5.952 21.127 8.658 2 0.744 4.329 34.678 6.455 3 0.026 3.916 20.979 10.745 4 0.004 11.401 19.599 21.728 5 0.214 5.610 20.530 18.542 table 3.841 11.070 31.410 24.996 Finally, here are the results of five runs of the linear congruential generator whose implementation was shown above. No failures occurred for this series of runs. Run Wins & losses Dice rolls Game lengths Pass lengths 1 0.638 0.816 28.499 9.378 2 0.528 4.066 27.787 11.224 3 0.005 6.278 17.311 20.863 4 0.125 2.108 10.165 12.300 5 0.245 4.098 15.073 12.231 table 3.841 11.070 31.410 24.996 The complete code for Craps_Tests follows: with Ada.Numerics.Random_Numbers; use Ada.Numerics.Random_Numbers; with Text_IO; use Text_IO; procedure Craps_Tests is -- This test program written by Ken Dritz, Argonne National Laboratory. -- Adapted January 30, 1994 from earlier versions. -- Except for the first context clause above, this program is written -- entirely in Ada 83. It does not use the Max and Min attributes where -- it could, and it does not use the Random_Integer function to represent -- dice rolls (but rather converts the output of Random to an integer value -- in the range 1 .. 6). The only subprograms used from Random_Numbers are -- Random and Reset (without an initiator). -- The floating point type used throughout this program is Float. It has -- been verified that the program is not sensitive to whether Float -- corresponds to double or single precision. -- Rules of craps: -- -- If the player rolls 7 or 11 on the first roll (of a pair of dice), the -- player wins. -- -- If the player rolls 2, 3, or 12 on the first roll, the player loses. -- -- Otherwise, call the number showing on the first roll of the dice the -- "point." The player continues to roll. If the player rolls the point -- again before rolling a 7, the player wins (otherwise the player loses). subtype Roll_Outcome is Integer range 1 .. 6; subtype Pair_Outcome is Integer range 2 .. 12; Games_Played : Natural := 1000000; -- The number of games to be played is at least the initial value of -- Games_Played. After they are played, if the last game was not a loss, -- a few more games are played (until a loss occurs), thus completing the -- final "pass" (series of wins ending in a loss). Passes_Counted : Natural := 0; G : Generator; -- The following are the probabilities of winning and losing a single game -- of craps. Prob_Of_Win : constant := 244.0 / 495.0; Prob_Of_Loss : constant := 251.0 / 495.0; -- The following are the probabilities of the outcome of each roll of a -- pair of dice. Prob_Of_Pair : constant array (Pair_Outcome) of Float := (1.0/36.0, 2.0/36.0, 3.0/36.0, 4.0/36.0, 5.0/36.0, 6.0/36.0, 5.0/36.0, 4.0/36.0, 3.0/36.0, 2.0/36.0, 1.0/36.0); -- The following are the probabilities that a game has length i, for each -- i from 1 to 20, or a length greater than or equal to 21. (This array -- is initialized by code later.) Thus, the expected number of games of -- length i in N games played is N * Prob_Of_Game_Of_Length(i). Prob_Of_Game_Of_Length : array (1 .. 21) of Float; -- The following are the probabilities that a pass has length i (i.e., that -- it consists of i wins followed by a loss), for each i from 0 to 14, or a -- length greater than or equal to 15. (This array is initialized by code -- later.) Thus, the expected number of passes of length i in N passes -- counted is N * Prob_Of_Pass_Of_Length(i). Prob_Of_Pass_Of_Length : array (0 .. 15) of Float; Number_Of_Rolls : Natural := 0; Pass_Length : Natural := 0; type Expected_Counts is array (Integer range <>) of Float; type Observed_Counts is array (Integer range <>) of Natural; Expected_Wins_And_Losses : Expected_Counts (1 .. 2); Observed_Wins_And_Losses : Observed_Counts (1 .. 2) := (others => 0); Expected_Rolls_Of_Kind : Expected_Counts (Roll_Outcome); Observed_Rolls_Of_Kind : Observed_Counts (Roll_Outcome) := (others => 0); Expected_Games_Of_Length : Expected_Counts (1 .. 21); Observed_Games_Of_Length : Observed_Counts (1 .. 21) := (others => 0); Expected_Passes_Of_Length : Expected_Counts (0 .. 15); Observed_Passes_Of_Length : Observed_Counts (0 .. 15) := (others => 0); Win_Or_Loss : array (1 .. 2) of String (1 .. 5) := (" Win", " Loss"); Float_Games : Float; Float_Passes : Float; Passes_All_Tests : Boolean := True; package My_Int_IO is new Integer_IO (Integer); use My_Int_IO; package My_Float_IO is new Float_IO (Float); use My_Float_IO; function Roll return Roll_Outcome is Outcome : Roll_Outcome; begin Number_Of_Rolls := Number_Of_Rolls + 1; Outcome := 1 + Integer(6.0*Random(G)) mod 6; Observed_Rolls_Of_Kind(Outcome) := Observed_Rolls_Of_Kind(Outcome) + 1; return Outcome; end; procedure Record_Pass is begin if Pass_Length > 15 then Pass_Length := 15; end if; Observed_Passes_Of_Length(Pass_Length) := Observed_Passes_Of_Length(Pass_Length) + 1; Passes_Counted := Passes_Counted + 1; Pass_Length := 0; end; procedure Record_Win is begin Observed_Wins_And_Losses(1) := Observed_Wins_And_Losses(1) + 1; Pass_Length := Pass_Length + 1; end; procedure Record_Loss is begin Observed_Wins_And_Losses(2) := Observed_Wins_And_Losses(2) + 1; Record_Pass; end; procedure Play_Crap_Game is P1, P2 : Pair_Outcome; Game_Length : Natural := 1; begin P1 := Roll + Roll; if P1 = 7 or P1 = 11 then Record_Win; elsif P1 < 4 or P1 = 12 then Record_Loss; else loop P2 := Roll + Roll; Game_Length := Game_Length + 1; if P2 = P1 then Record_Win; exit; elsif P2 = 7 then Record_Loss; exit; end if; end loop; end if; if Game_Length > 21 then Game_Length := 21; end if; Observed_Games_Of_Length(Game_Length) := Observed_Games_Of_Length(Game_Length) + 1; end; procedure Print_Chi_Square (Expected : in out Expected_Counts; Observed : in out Observed_Counts; Title : in String; Col_Head : in String) is Chi_Square : Float := 0.0; Last_Row : Natural range Expected'Range; Degrees : Natural; Chi_Table : constant array (1 .. 30) of Float := ( 3.841, 5.991, 7.815, 9.488, 11.070, 12.592, 14.067, 15.507, 16.919, 18.307, 19.675, 21.026, 22.362, 23.685, 24.996, 26.296, 27.587, 28.869, 30.144, 31.410, 32.671, 33.924, 35.172, 36.415, 37.652, 38.885, 40.113, 41.337, 42.557, 43.773); begin -- The expected counts are either all about the same (and sufficiently -- large), or the ones near the end of the array approach zero. Make -- sure that the expected counts near the end of the array are at least -- 5.0 (for meaningful statistics). Do this by iteratively combining -- rows at the end of the arrays, effectively shortening the array, -- until the condition is achieved. Last_Row := Expected'Last; while Expected(Last_Row) < 5.0 loop Expected(Last_Row - 1) := Expected(Last_Row - 1) + Expected(Last_Row); Observed(Last_Row - 1) := Observed(Last_Row - 1) + Observed(Last_Row); Last_Row := Last_Row - 1; end loop; Degrees := Last_Row - Expected'First; -- Now print the expected vs. observed counts, simultaneously calculating -- the chi square value. Put ("Distribution of "); Put_Line (Title); Put (Col_Head); Put_Line (" Times expected Times observed"); for I in Expected'First .. Last_Row loop if Col_Head = "Outcome" then Put (Win_Or_Loss(I)); else if I < Last_Row or Col_Head = "Roll" then Put (I, Width => 1 + Col_Head'Length/2); else Put ('>'); Put (I-1, Width => Col_Head'Length/2); end if; end if; Set_Col (Col_Head'Length + 4); Put (Expected(I), Fore => 9, Aft => 1, Exp => 0); Set_Col (Col_Head'Length + 21); Put (Observed(I), Width => 10); New_Line; Chi_Square := Chi_Square + (Expected(I) - Float(Observed(I)))**2 / Expected(I); end loop; Put ("Calculated chi-square for distribution of "); Put (Title); Put (": "); Set_Col (62); Put (Chi_Square, Fore => 3, Aft => 3, Exp => 0); New_Line; Put ("Tabulated value for "); Put (Degrees, Width => 1); Put (" degree"); if Degrees > 1 then Put ('s'); end if; Put (" of freedom (alpha=0.05): "); Set_Col (62); Put (Chi_Table(Degrees), Fore => 3, Aft => 3, Exp => 0); New_Line; Put ("The generator "); if Chi_Square <= Chi_Table(Degrees) then Put ("PASSES"); else Put ("FAILS"); Passes_All_Tests := False; end if; Put_Line (" this test"); New_Line; end; begin -- Initialize and print the Prob_Of_Game_Of_Length array. -- The probability that a game will have a length of 1 is the probability -- of rolling a 7 (win), 11 (win), 2 (loss), 3 (loss), or 12 (loss) on the -- first roll. Prob_Of_Game_Of_Length(1) := Prob_Of_Pair(7) + Prob_Of_Pair(11) + Prob_Of_Pair(2) + Prob_Of_Pair(3) + Prob_Of_Pair(12); Prob_Of_Game_Of_Length(21) := 1.0 - Prob_Of_Game_Of_Length(1); -- Prob_Of_Game_Of_Length(21) will be reduced by the probability of a game -- of each other specific length computed. -- For I > 1, the probability that a game will have a length of I is the -- probability of neither a win nor a loss on the first I-1 rolls times the -- probability of either a win or a loss on the Ith roll. for I in 2 .. 20 loop Prob_Of_Game_Of_Length(I) := 0.0; for Point in 4 .. 10 loop if Point /= 7 then Prob_Of_Game_Of_Length(I) := Prob_Of_Game_Of_Length(I) + Prob_Of_Pair(Point) * (1.0 - (Prob_Of_Pair(Point) + Prob_Of_Pair(7))) ** (I - 2) * (Prob_Of_Pair(Point) + Prob_Of_Pair(7)); end if; end loop; Prob_Of_Game_Of_Length(21) := Prob_Of_Game_Of_Length(21) - Prob_Of_Game_Of_Length(I); end loop; Put_Line ("Length Probability of a game of the given length"); for I in 1 .. 21 loop if I < 21 then Put (I, Width => 4); else Put ("> 20"); end if; Put (Prob_Of_Game_Of_Length(I), Fore => 22, Aft => 7, Exp => 0); New_Line; end loop; New_Line; -- Initialize and print the Prob_Of_Pass_Of_Length array. -- The probability that a pass will have a length of I is the probability -- of winning I games in a row times the probability of losing (in the -- I+1st game). Prob_Of_Pass_Of_Length(15) := 1.0; -- Prob_Of_Pass_Of_Length(15) will be reduced by the probability of a pass -- of each other specific length computed. for I in 0 .. 14 loop Prob_Of_Pass_Of_Length(I) := Prob_Of_Win**I * Prob_Of_Loss; Prob_Of_Pass_Of_Length(15) := Prob_Of_Pass_Of_Length(15) - Prob_Of_Pass_Of_Length(I); end loop; Put_Line ("Length Probability of a pass of the given length (# of wins)"); for I in 0 .. 15 loop if I < 15 then Put (I, Width => 4); else Put ("> 14"); end if; Put (Prob_Of_Pass_Of_Length(I), Fore => 28, Aft => 8, Exp => 0); New_Line; end loop; New_Line; -- Reset the generator to a time-dependent state, providing a unique -- sequence of random numbers in each run. Reset (G); -- Play the minimum number of games. Put ("Starting to play "); Put (Games_Played, Width => 1); Put_Line (" craps games"); New_Line; for I in 1 .. Games_Played loop Play_Crap_Game; if I mod 50000 = 0 then Put ("Played "); Put (I, Width => 7); Put_Line (" games"); end if; end loop; New_Line; -- Play a few more, if necessary, to complete the current "pass." while Pass_Length > 0 loop Play_Crap_Game; Games_Played := Games_Played + 1; end loop; Float_Games := Float(Games_Played); Float_Passes := Float(Passes_Counted); -- Initialize expected values needed for the statistics. Expected_Wins_And_Losses(1) := Prob_Of_Win * Float_Games; Expected_Wins_And_Losses(2) := Prob_Of_Loss * Float_Games; for I in Roll_Outcome loop Expected_Rolls_Of_Kind(I) := Float(Number_Of_Rolls) / 6.0; end loop; for I in 1 .. 21 loop Expected_Games_Of_Length(I) := Prob_Of_Game_Of_Length(I) * Float_Games; end loop; for I in 0 .. 15 loop Expected_Passes_Of_Length(I) := Prob_Of_Pass_Of_Length(I) * Float_Passes; end loop; -- Now compute and print the statistics. Put ("Number of rolls of a single die: "); Put (Number_Of_Rolls, Width => 1); New_Line; Put ("Number of craps games played: "); Put (Games_Played, Width => 1); New_Line; Put ("Number of passes counted: "); Put (Passes_Counted, Width => 1); New_Line (2); Print_Chi_Square(Expected_Wins_And_Losses, Observed_Wins_And_Losses, "wins and losses", "Outcome"); Print_Chi_Square(Expected_Rolls_Of_Kind, Observed_Rolls_Of_Kind, "dice rolls", "Roll"); Print_Chi_Square(Expected_Games_Of_Length, Observed_Games_Of_Length, "game lengths", "Length"); Print_Chi_Square(Expected_Passes_Of_Length, Observed_Passes_Of_Length, "pass lengths", "Length"); if Passes_All_Tests then Put_Line ("The generator PASSED all the tests"); else Put_Line ("The generator FAILED one or more tests"); end if; end Craps_Tests; One possible improvement worth making in the code is this: it should print the image of the (random) initial state, so that the run can be repeated (perhaps after modifying the generator) if it turns up anomalies. Thus, the program should also have the capability of resetting the generator initially from a user-supplied image of a state.