Re: implementation note for scapegoat tree
 From: "user923005" <dcorbit@xxxxxxxxx>
 Date: 28 Mar 2007 14:58:18 0700
On Mar 28, 10:51 am, Ben Pfaff <b...@xxxxxxxxxxxxxxx> wrote:
I'm what you might call a balanced binary tree enthusiast, and so
when I rather belatedly encountered the idea of scapegoat trees a
few months ago I decided I'd take a stab at implementing them
when I got a chance. I finally got around to it this week and I
thought I'd pass along something that I'd noticed during
implementation.
In case you don't know what a scapegoat tree is, you can look at
http://en.wikipedia.org/wiki/Scapegoat_tree
which includes a link to the original Galperin and Rivest paper
on them.
One potentially tricky part of scapegoat tree design is the
choice of alpha, which is a real constant picked by the
implementor, that must be greater than 0.5 and less than 1. We
must be able to efficiently calculate h_alpha =
floor(log(n)/log(1/alpha)) for integer n > 0. One way to do so
would be to maintain a table relating h_alpha to the minimum
value of n that yields that h_alpha. Then, we can track the
value of h_alpha(n) in the number of nodes in the tree n as nodes
are inserted and deleted.
But I decided to use a different approach. I chose alpha =
sqrt(2)/2 = 1/sqrt(2) ~= .707. Then, computing h_alpha is a
matter of computing a logarithm in base sqrt(2). This is
actually easy: simply compute twice the base2 logarithm, then
adjust upward by 1 if necessary.
Here's the code I ended up with for this. I omit the
implementation of count_leading_zeros, which can be computed with
a single machine instruction on many architectures (including
x86):
/* Returns floor(log2(x)).
Undefined if X is zero. */
static inline int
floor_log2 (size_t x)
{
return sizeof (size_t) * CHAR_BIT  1  count_leading_zeros (x);
}
/* Returns ceil(pow(sqrt(2), x * 2 + 1)).
Defined for X from 0 up to the number of bits in size_t minus
1. */
static inline size_t
pow_sqrt2 (int x)
{
/* These constants are sqrt(2) multiplied by 2**63 or 2**31,
respectively, and then rounded to nearest. */
#if SIZE_MAX >> 31 >> 1
return (UINT64_C(0xb504f333f9de6484) >> (63  x)) + 1;
#else
return (0xb504f334 >> (31  x)) + 1;
#endif
}
/* Returns floor(log(n)/log(sqrt(2))).
Undefined if N is 0. */
static inline int
calculate_h_alpha (size_t n)
{
int log2 = floor_log2 (n);
/* The correct answer is either 2 * log2 or one more. So we
see if n >= pow(sqrt(2), 2 * log2 + 1) and if so, add 1. */
return (2 * log2) + (n >= pow_sqrt2 (log2));
}
With O3, GCC compiles calculate_h_alpha to a body of only ten or
so simple arithmetic instructions on x86.
How well does your tree perform?
.
 FollowUps:
 Re: implementation note for scapegoat tree
 From: Richard Heathfield
 Re: implementation note for scapegoat tree
 From: Ben Pfaff
 Re: implementation note for scapegoat tree
 References:
 implementation note for scapegoat tree
 From: Ben Pfaff
 implementation note for scapegoat tree
 Prev by Date: Re: Loop profiling
 Next by Date: Re: implementation note for scapegoat tree
 Previous by thread: implementation note for scapegoat tree
 Next by thread: Re: implementation note for scapegoat tree
 Index(es):
Relevant Pages
