Re: random integer



bob@xxxxxxxxxxxxxx wrote:
I was just wondering what the best way to generate a uniformly
distributed random integer between 0 and x is.

Usually, you can't just do rand()%(x+1). For instance, if you want a
number between 0 and 31999, you can't just do rand()%32000 since
0-767 will be twice as likely.

You can always just throw out the few possible return values of rand()
that could throw off the distribution:

// find random number evenly distributed over [0 .. x)
int rand_from_interval (int x) {
int big_interval_size = RAND_MAX + 1;
int interval_count = big_interval_size / x;
int cutoff = interval_count * x;

int y;
do {
y = rand();
} while (y >= cutoff);

// now y is in [0 .. cutoff)
// and, cutoff is a multiple of x.

return y % x;
}

Unless rand() behaves funny, this should usually be pretty efficient.
The worst case is when x = RAND_MAX/2 + 1. Then you will have to
throw out half of the return values of rand(), and your loop could
end up doing several iterations. But the probability that you would
get lots of inconvenient return values from rand() in a row is small.
It has at worst a 50/50 chance of giving you an inconvenient value,
so the probability of needing N iterations is O(log(n)).

If calling rand() is really expensive, you can in theory keep the
unused random bits and use them in the future. There are two areas
where you get unused random numbers generated as a side effect of
running the above algorithm:

1. Since you are returning y % x, you are throwing away bits of
randomness right there. This is because there are interval_count
sub-intervals of [0..big_interval_size), each of size x,
and y lies in one of those sub-intervals. y % x tells you
position in the sub-interval, but it throws away information
about which sub-interval. y / x is the number that tells which
sub-interval you landed in, and it just so happens to be a random
number evenly distributed over the interval [0 .. interval_count).
All of this only applies when y < cutoff.

2. When y >= cutoff, then y - cutoff is a random number evenly
distributed along the interval [0 .. big_interval_size - cutoff).

Unfortunately, I can't think of an easy way to take a bunch of
random numbers over intervals you can't control and turn them into
a single random number over a bigger interval, apart from stripping
out the high order bit and truncating the rest. Which works, but
it does throw away parts of bits of randomness here and there.

- Logan
.



Relevant Pages

  • Re: Random Number Generation Dialog Box
    ... > distribution. ... A help search tells me about what looks like a wonder dialog ... Be aware, however, that the ATP's pseudorandom number generator is not ... XL01's RAND() ...
    (microsoft.public.mac.office.excel)
  • Re: question about random generator
    ... The rand() function generates uniformly distributed ... distribution because doing so would require a lot of fairly ... matter -- certainly not a topic for the Standard to explicate. ... that it means "sort of uniform-ish in a sort of hand-waving way, ...
    (comp.lang.c)
  • Re: Explaining significance to laypersons: why the tails?
    ... >>do because it relies on an arbitrary significance level of 0.05, ... > The choice of cutoff needs to be more complicated than that. ... if the distribution is not symmetric then in an extreme ... under the upper tail, but reject huge amounts under the lower tail. ...
    (sci.stat.math)
  • Re: Random Numbers?
    ... It is likely to overflow, ... The distribution is uneven for large values of. ... To get the range x to y, inclusive, the following approximates even ... y, is, in general, r:, assuming that the distribution or rand() is ...
    (comp.lang.c)
  • Re: question about random generator
    ... >> random number that follow a normal distribution N? ... > rand() function which returns an int ranging from 0 to RAND_MAX. ... > no random number functions in ISO C which return a floating point value. ... If you care about the quality of your random numbers, my advice would ...
    (comp.lang.c)