Re: the "hat" container class [C++]

From: Peter Ammon (peter_ammon_at_rocketmail.com)
Date: 05/26/04


Date: Tue, 25 May 2004 19:04:25 -0700

Craig Nicol wrote:

> AngleWyrm wrote:
>
>> This container serves a special purpose: To be able to pull a random
>> sample
>> from a collection, where the chances of each item in the collection
>> are NOT
>> uniform. It can store any kind of user-defined or standard class
>> object, as
>> long as the object can be evaluated to an integer.
>>
>> The value of the objects in the container are the number of "chances"
>> that
>> object has, to be selected. It's probabability is
>> numChances/totalChances.
>> And of course totalChances is the sum of all objects.
>>
>> All critiques are welcome!
>>
> [snip code]
> Looks like a good idea. I notice however that your selection algorithm
> looks quite slow, and would take O(n) to pick an element out of a hat.
> Have you tried running it on a decent sized data set (~100,000 elements)?
>
> I've been looking at the same problem from the perspective of Genetic
> Algorithms, where this technique is used to pick chromosomes and is
> often referred to as roulette selection (i.e. imagine all the items
> in the collection are plotted on a pie chart, spin the chart and
> pick a random value).
>
> I've managed to create a O(log n) algorithm for this problem by
> creating a seperate vector of the same size as the data set. In
> this vector, I store the cumulative chances. I.e. for a vector
> v and a hat h, v[i] = h[0]+h[1]+...+h[i]
> All you need do then is a binary search over v to get the index
> of the randomly chosen element.
>
> What I've love to be able to do is create a constant-time algorithm
> for the random search. If your hat would do that, I'd be impressed.
>
> Regards,
> Craig Nicol.

A naive constant time algorithm would be to create an array whose length
is the sum of the weights, and populate it with the objects duplicated
according to their weight. E.g.

Object Weight
   a 2
   b 3
   c 5

[ a a b b b c c c c c ]

then pick a random element. An optimization would be to divide each
weight by the gcd of all the weights before generating the list.



Relevant Pages

  • Re: What I learned from Class Viewer
    ... displaced by such a trivially easy algorithm? ... by simply assuming that the weight is a distance between nodes ... while is not a valid circuit as a node is omitted. ...
    (comp.lang.java.programmer)
  • Re: Finding longest path in graph between two vertices
    ... a topological sort and than a relaxation to find the critical path ... Bellman-Ford Algorithm (due to the weight negation it should find not ... DAG shortest path algorithm (which for each topologically sorted vertex ...
    (sci.math)
  • Re: the "hat" container class [C++]
    ... where the chances of each item in the collection are NOT ... I notice however that your selection algorithm ... I've managed to create a Oalgorithm for this problem by ... All you need do then is a binary search over v to get the index ...
    (comp.programming)
  • Re: a chessboard type problem
    ... algorithm would be great. ... particular there are no directed cycles with a total negative weight. ... Consider each square on the chess board as a ... about from the start node to one of the last rank nodes. ...
    (comp.theory)
  • Re: a chessboard type problem
    ... algorithm would be great. ... particular there are no directed cycles with a total negative weight. ... Consider each square on the chess board as a ... about from the start node to one of the last rank nodes. ...
    (comp.theory)