Re: ISO help in probability



Helmut Giese wrote:

>I have a set of variables each of which has a certain weight or
>probability assigned. Say
>a -> 50
>b -> 30
>c -> 10
>d -> 10
>
>I want an algorithm which (in the above example) selects 'a' in every
>other run and 'c' in 10% of the runs (approximately of course).
>
>My attempt below fails miserably. Using the weights above, 'a' scores
>way too high, mostly at the expense of 'c' and 'd', which are far away
>from their expected 10% shares.
>Evidently my logic is flawed: Foreach variable I get the current
>probability by multiplying a random value with the associated weight,
>and then select the one with the highest result.
>
>Any advice on the 101 of probability applicable here will be greatly
>appreciated.

First, compute the cumulative sum (or "running total")
of the variables and their weights, and construct a new
list that pairs each variable with the

set csum 0
set clst [list]
foreach {k w} $lst {
set csum [expr {$csum + $w}]
lappend clst $k $csum
}

Next, generate a random number between 0 and the total weight:

set rn [expr {rand() * $sum}]

Then select the first element in the transformed list with a
cumulative sum greater than that number:

foreach {k c} $clst {
if {$c > $rn} { break }
}
# ==> $k is the answer.


With the sample data above, $clst is:

a 50 b 80 c 90 d 100

So a random number between 0 and $csum == 100 will
be <50 half the time, selecting 'a'; will lie between
50 and 80 30% of the time, yielding 'b'; will lie between
80 and 90 10% of the time, et cetera.

Tweak the algorithm above as needed depending on the required
performance characteristics. (If you need to do a lot of
selections from a fixed data set, a binary search through
the cumulative sum list will be profitable.)


--Joe English
.



Relevant Pages

  • Re: Looking for an Introduction to Statistics
    ... Statistics from the viewpoint of Set Theory? ... uses sets, except through probability. ... (a weight need not be an integer) ... discrete or limits of discrete, ...
    (sci.math)
  • Re: Context modelling
    ... context modelling in statistical analysis? ... I wrote a technical report describing PAQ6. ... where w is the mixing weight, x is the input probability from a model, ... p is the output probability, and y is the actual bit, and a is ...
    (comp.compression)
  • ISO help in probability
    ... Foreach variable I get the current ... probability by multiplying a random value with the associated weight, ... array set weight $lst ...
    (comp.lang.tcl)
  • i have problem with "probability of undetected error" equation
    ... on probability of undetected error equation, there was A (the weight ... distribution function of the CRC code depending of the generator).. ...
    (sci.math)
  • i have problem with probability of undetected error equation, help me please..
    ... on probability of undetected error equation, there was A (the weight ... distribution function of the CRC code depending of the generator).. ...
    (sci.electronics.basics)