GNU Free Documentation License, Version 1.2

- User documentation for RandomBoolSteam
- Maintainer documentation for RandomBoolSteam
- Bugs, Shortcomings and other ideas

The class `RandomBoolSteam`

is supposed to model a binary random variable
(with 50% probability of being "true" and 50% of being "false", and
with independent samples). See also `RandomLongStream`

for generating
random machine integers, and `RandomZZStream`

for generating large random integers.
An alternative way of generating random values is to use a `RandomSource`

.

There are three ways of creating a new `RandomBoolSteam`

object:

RandomBoolStream RBS1; // default ctor, seeded with 0 RandomBoolStream RBS2(n); // seed generator with abs(n)

If you create more than one `RandomBoolStream`

object with the same
seed, they will each produce exactly the same sequence of bits. In
particular, to obtain different results each time a program is run,
you can for instance seed the generator with the system time (*e.g.*
by supplying as argument `time(0)`

); this is likely desirable unless
you're trying to debug a randomized algorithm.

Once you have created a `RandomBoolStream`

you may perform the following
operations on it:

*RBS // get the current value of RBS (as a boolean). ++RBS // advance to next value of RBS. RBS++ // advance to next value of RBS **INEFFICIENTLY**. sample(RBS) // advance RBS and then return new value; same as ``*++RBS`` prob(P, RBS) // use bits from RBS to produce a boolean with prob P of being // true; on average uses less than 2 samples from RBS per call. out << RBS // print some information about RBS. RBS.myIndex() // number of times RBS has been advanced, // same as the number of random bits generated.

Note that a `RandomBoolStream`

supports input iterator syntax.

Note `prob`

should work correctly even with probabilities close to 0
(*e.g.* < 1/10^10).

You may assign or create copies of `RandomBoolStream`

objects; the copies
acquire the complete state of the original, so will go on to produce exactly
the same sequence of bits as the original will produce.

The idea is very simple: use the pseudo-random number generator of GMP to generate
a random integer, and then give out the bits of this integer one at at time;
when the last bit has been given out, get a new random integer from the GMP generator.
The random integer is kept in the data member `myBuffer`

, and `myBitIndex`

is
used to read the bits one at a time.

Among the data members, there are two mild surprises: `mySeed`

and
`myCounter`

. I put these in to help any poor blighter who has to debug a
randomized algorithm, and who may want to "fast forward" the `RandomBoolSteam`

to the right place.

The data member `myState`

holds all the state information used by the GMP
generator. Its presence makes the ctors, dtor and assignment messier than
they would have been otherwise.

The advancing and reading member functions (*i.e.* `operator++`

and `operator*`

)
are inline for efficiency, as is the `sample`

function. The condition for
refilling `myBuffer`

is when the index goes beyond the number of bits held in
`myBuffer`

.

`myFillBuffer`

also sets the data member `myBitIndex`

to zero; it seemed most
sensible to do this here.

The function `prob`

is nifty; if you think about it for a moment, it is
obviously right (and economical on random bits). It would be niftier still
if the probability were specified as an `unsigned long`

-- on a 64-bit
machine this ought to be sufficient for almost all purposes. If the requested
probability is of the form N/2^k for some odd integer N, then the average
number of bits consumed by `prob`

is 2-2^(1-k), which always lies between 1 and 2.
If the requested probability is 0 or 1, no bits are consumed.

Is run-time speed so important here? How many algorithms really consume millions of random bits? Surely the computations on the random bits/values should exceed the cost of generating them, shouldn't they?

Is it really worth having inline functions here? `operator++()`

is long
enough that it is hard to justify keeping it inline.

Should `sample`

advance **before** or **after** getting the value?

Is the information printed by `myOutputSelf`

adequate? Time will tell.

Discarded idea: have a ctor for RBS which take a ref to a `RandomSource`

,
and which uses that to obtain randomness. I discarded the idea because of
the risks of an "invisible external reference" (e.g. a dangling reference,
or problems in multithreaded code). Instead of passing a reference to a
RandomSource to the ctor, you can use the RandomSource to create an initial
seed which is handed to the ctor -- this gives better separation.