# Re: K random numbers

• From: cri@xxxxxxxx (Richard Harter)
• Date: Wed, 18 Nov 2009 03:42:46 GMT

On Tue, 17 Nov 2009 23:45:32 +0000, Ben Bacarisse
<ben.usenet@xxxxxxxxx> wrote:

cri@xxxxxxxx (Richard Harter) writes:

On Tue, 17 Nov 2009 16:55:26 +0000, Ben Bacarisse
<ben.usenet@xxxxxxxxx> wrote:
cri@xxxxxxxx (Richard Harter) writes:
<snip>
Sigh. Marek Chrobak and I did a note on this in IFIPS in 1988.
Quite maddeningly, I mislaid the my copy of the preprint and
can't seem find the time to reconstruct the details. However the
key idea is that Fisher-Yates only alters at most min(2k,n)
locations so that is all that you need to keep track of. To do
this, keep two arrays, one that has the permutation so far, and
the second that has the contents of the permutation element.

Thus, if the first two exchanges are (1,382),(2,17) the arrays
begin
[382, 17,...]
[ 1, 2,...]

If 382 were to occur at step 3 we see that 1 is now in cell 382
so the two arrays are now

[382, 17, 1,...]
[ 3, 2,382,...]

So, yes, you can get a uniformly distributed permutation with
O(k) storage. The obvious way to get O(k*log(k)) time is to put
everything in a binary search tree. The details should be
obvious.

That's a nice way to do it conserving random numbers, but if you have
RNs to spare you can use Knuth's Algorithm S to put k numbers into an
array and permute that. That gives you O(k) space and time.

IIANM Knuth's algorithm S has average time O(n), not O(k). (I am
going by the first edition; I am assuming that he hasn't revised
it since then.) More than that, you need to specify k in
advance. The plus is that it does return an ordered sequence.

Ah, yes, I conflated the exercises with the algorithm in the text. In
my edition the exercises include the refinements of Jeffrey Scott
Vitter which gives O(k) time and O(1) space (O(k) is store the result
of course!). This is published in "Faster methods for random
sampling" CACM, vol 27, no 7, 1984. "Refinement" may be too mean a
term: it is a better algorithm -- you calculate the skip length rather
than choosing yes/no once per item.

Yes, that would be better. What does Vitter use for the
probability distribution of the skip distance?

Richard Harter, cri@xxxxxxxx
http://home.tiac.net/~cri, http://www.varinoma.com
Infinity is one of those things that keep philosophers busy when they
could be more profitably spending their time weeding their garden.
.

## Relevant Pages

• Re: K random numbers
... the second that has the contents of the permutation element. ... Thus, if the first two exchanges are,the arrays ... IIANM Knuth's algorithm S has average time O, ... Ah, yes, I conflated the exercises with the algorithm in the text. ...
(comp.programming)
• Re: K random numbers
... One of the appraoch is calling random function again and again. ... the second that has the contents of the permutation element. ... Thus, if the first two exchanges are,the arrays ... get it .I mean i couldn't understand what the algorithm says. ...
(comp.programming)
• Re: Reordering integers 1..n to appear random
... There would be one function which can perform the translation from the ... create a random string of the intended length ... there's no guarantee that this algorithm will ... So all you need is a random permutation. ...
(sci.math)
• Re: Math.random
... That's not the only possible algorithm; ... without much difficulty implement a long shift-register generator; ... doublevalue or null/undefined/false for auto reseeding, ... I think 0 to 364 would be better; it goes well with zero-based arrays, ...
(comp.lang.javascript)
• Some PERL code for detecting/displaying "toroidal" trees "invariant under a half-turn
... This algorithm will be reviewed by Dr. David Wagner ... Such trees can be termed "invariant" under a half-turn ... not display "the same" oDAG as the first, ... and detects whether this permutation will yield a tree ...
(sci.math.num-analysis)