Re: more hand written integer pow() functions (LONG POST)
From: Sheldon Simms (sheldonsimms_at_yahoo.com)
Date: 11/03/03
- Next message: Sheldon Simms: "Re: double casts"
- Previous message: Joona I Palaste: "Re: "Mastering C Pointers"...."
- In reply to: James Hu: "Re: more hand written integer pow() functions (LONG POST)"
- Next in thread: James Hu: "Re: more hand written integer pow() functions (LONG POST)"
- Reply: James Hu: "Re: more hand written integer pow() functions (LONG POST)"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Mon, 03 Nov 2003 10:58:59 -0500
On Mon, 03 Nov 2003 02:40:05 -0800, James Hu wrote:
> Sheldon Simms <sheldonsimms@yahoo.com> wrote in message news:<pan.2003.11.02.22.10.44.251500@yahoo.com>...
>
> My unrolled implementation would probably need a computed
> goto for the array to improve the time. But realize that
> the version you used as the basis for you experiment was
> trying to avoid implementation defined behavior.
>
>> I came up with code pretty much exactly like your pow_b() a few
>> days ago by translating the OS/360 assembler above into C. One
>> small change then made it a bit faster. You'll be able to see the
>> change easily in the code below.
>
> It was unclear to me why your small change should make any
> difference. Do you have a theory?
I know why it's faster, but that comes from looking at the
assembly code generated. I actually only made the change to
avoid the infinite-loop-with-break construct that comes
out of the direct translation of the OS/360 code. Here's the
code again:
static int ibmpow (int x, int y)
{
int factor = 1;
if (y < 0) return 0;
if (y & 1) factor *= x;
y >>= 1;
while (y)
{
x *= x;
if (y & 1) factor *= x;
y >>= 1;
}
return factor;
}
One reason it is faster is that the generated code avoids the
first multiplication for odd y by replacing the if-statement
before the loop with this:
if (y & 1) factor = x;
y >>= 1;
But that's not everything. I tested by replacing the test loop
with
for (y = 1; y < 32; y <<= 1)
And the ibmpow() version is still faster than pow_b().
I can see that the generated code for the loop in ibmpow() is
a bit better than that for pow_b(). That's interesting since
the statements are the same, only rearranged.
> Please compare your routine with the routine below based on the
> refinement suggested by Tim Woodall and the implementation by
> Eric Sosman:
>
> int tim_eric_pow(int x, unsigned n)
> {
> int t = 1;
> if (n == 0) return 1;
> while (n ^ 1) {
> n >>= 1; x *= x;
> }
> t = x;
> while (n >>= 1) {
> x *= x;
> if (n & 1) t *= x;
> }
>
> return t;
> }
I must have missed that one in the thread. After small changes so
that it conforms to the same interface as the others (signed n, check
for n < 0), it's the fastest so far:
$ time ./ipows
real 0m2.032s
user 0m1.980s
sys 0m0.000s
> The tail recursive form for this would be:
...
> But, you would have to use -O3 to get gcc to remove the tail recursion.
I avoided O3 because it inlined the power function, sometimes.
-Sheldon
- Next message: Sheldon Simms: "Re: double casts"
- Previous message: Joona I Palaste: "Re: "Mastering C Pointers"...."
- In reply to: James Hu: "Re: more hand written integer pow() functions (LONG POST)"
- Next in thread: James Hu: "Re: more hand written integer pow() functions (LONG POST)"
- Reply: James Hu: "Re: more hand written integer pow() functions (LONG POST)"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|