# Re: implementation note for scapegoat tree

*From*: "user923005" <dcorbit@xxxxxxxxx>*Date*: 28 Mar 2007 14:58:18 -0700

On Mar 28, 10:51 am, Ben Pfaff <b...@xxxxxxxxxxxxxxx> wrote:

I'm what you might call a balanced binary tree enthusiast, and so

when I rather belatedly encountered the idea of scapegoat trees a

few months ago I decided I'd take a stab at implementing them

when I got a chance. I finally got around to it this week and I

thought I'd pass along something that I'd noticed during

implementation.

In case you don't know what a scapegoat tree is, you can look at

http://en.wikipedia.org/wiki/Scapegoat_tree

which includes a link to the original Galperin and Rivest paper

on them.

One potentially tricky part of scapegoat tree design is the

choice of alpha, which is a real constant picked by the

implementor, that must be greater than 0.5 and less than 1. We

must be able to efficiently calculate h_alpha =

floor(log(n)/log(1/alpha)) for integer n > 0. One way to do so

would be to maintain a table relating h_alpha to the minimum

value of n that yields that h_alpha. Then, we can track the

value of h_alpha(n) in the number of nodes in the tree n as nodes

are inserted and deleted.

But I decided to use a different approach. I chose alpha =

sqrt(2)/2 = 1/sqrt(2) ~= .707. Then, computing h_alpha is a

matter of computing a logarithm in base sqrt(2). This is

actually easy: simply compute twice the base-2 logarithm, then

adjust upward by 1 if necessary.

Here's the code I ended up with for this. I omit the

implementation of count_leading_zeros, which can be computed with

a single machine instruction on many architectures (including

x86):

/* Returns floor(log2(x)).

Undefined if X is zero. */

static inline int

floor_log2 (size_t x)

{

return sizeof (size_t) * CHAR_BIT - 1 - count_leading_zeros (x);

}

/* Returns ceil(pow(sqrt(2), x * 2 + 1)).

Defined for X from 0 up to the number of bits in size_t minus

1. */

static inline size_t

pow_sqrt2 (int x)

{

/* These constants are sqrt(2) multiplied by 2**63 or 2**31,

respectively, and then rounded to nearest. */

#if SIZE_MAX >> 31 >> 1

return (UINT64_C(0xb504f333f9de6484) >> (63 - x)) + 1;

#else

return (0xb504f334 >> (31 - x)) + 1;

#endif

}

/* Returns floor(log(n)/log(sqrt(2))).

Undefined if N is 0. */

static inline int

calculate_h_alpha (size_t n)

{

int log2 = floor_log2 (n);

/* The correct answer is either 2 * log2 or one more. So we

see if n >= pow(sqrt(2), 2 * log2 + 1) and if so, add 1. */

return (2 * log2) + (n >= pow_sqrt2 (log2));

}

With -O3, GCC compiles calculate_h_alpha to a body of only ten or

so simple arithmetic instructions on x86.

How well does your tree perform?

.

**Follow-Ups**:**Re: implementation note for scapegoat tree***From:*Richard Heathfield

**Re: implementation note for scapegoat tree***From:*Ben Pfaff

**References**:**implementation note for scapegoat tree***From:*Ben Pfaff

- Prev by Date:
**Re: Loop profiling** - Next by Date:
**Re: implementation note for scapegoat tree** - Previous by thread:
**implementation note for scapegoat tree** - Next by thread:
**Re: implementation note for scapegoat tree** - Index(es):