Re: Is there a better way to simulate randomly choosing from a weighted set?
- From: "Googmeister" <googmeister@xxxxxxxxx>
- Date: 27 Feb 2006 03:50:19 -0800
yay_frogs@xxxxxxxxx wrote:
[I originally posted this to comp.lang.c++ since that is the language
I'm working in, but then I realized that my problem isn't language
specific and this would probably be a better group.]
Here is my problem: suppose there are, say, five events with these
probabilities:
event1 0.7
event2 0.1
event3 0.1
event4 0.05
event5 0.05
Note that sum of the probabilities is 1.0. I would like a function that
simulates these events and returns an int to indicate which event
occurred: the function should statistically return 1 about 70% of the
time, 2 about 10% of the time, and so on.
I have figured out a way to do this, but I suspect my way is
suboptimal.
I build a vector of five elements that looks like:
( 0.05, 0.05+0.05, 0.05+0.05+0.1, 0.05+0.05+0.1+0.1,
0.05+0.05+0.1+0.1+0.7 )
= ( 0.05, 0.1, 0.2, 0.3, 1.0 )
I then generate a random float in the interval 0.0 ... 1.0, and if the
random float is in the range 0 to 0.05, I return event 5, and if the
random float is in the range 0.05-0.1, I return event 4, and so on.
(Actually, I should test for event 1 first since it is most common, but
I'm too lazy to re-type my example vector above.)
If you will be sampling many times from the same discrete
distribution (and the number of elements is large), you can
find the element more efficiently by binary searching the
cumulative probabilities. (It's also possible to do in constant
time using a technique of Warren Smith.)
For my real problem, I have to deal with many different cases where the
number of events to consider constantly varies, and I suspect there has
to be a better way than building a vector to represent the different
ranges a random variable can fall in and then seeing which range it
falls in.
So is there a better way?
It depends on the number of elements and how often you
will be sampling from the same (or nearby) discrete
distribution.
If the discrete distribution changes slowly, you can maintain
the cumulative probabilities in a binary tree structure
(each node stores the total weight beneath it) and sample
and update in logarithmic time.
.
- References:
- Prev by Date: Is there a better way to simulate randomly choosing from a weighted set?
- Next by Date: Re: Is there a better way to simulate randomly choosing from a weighted set?
- Previous by thread: Is there a better way to simulate randomly choosing from a weighted set?
- Next by thread: Re: Is there a better way to simulate randomly choosing from a weighted set?
- Index(es):
Relevant Pages
|