Re: more hand written integer pow() functions (LONG POST)

From: Sheldon Simms (sheldonsimms_at_yahoo.com)
Date: 11/03/03


Date: Mon, 03 Nov 2003 10:58:59 -0500

On Mon, 03 Nov 2003 02:40:05 -0800, James Hu wrote:

> Sheldon Simms <sheldonsimms@yahoo.com> wrote in message news:<pan.2003.11.02.22.10.44.251500@yahoo.com>...
>
> My unrolled implementation would probably need a computed
> goto for the array to improve the time. But realize that
> the version you used as the basis for you experiment was
> trying to avoid implementation defined behavior.
>
>> I came up with code pretty much exactly like your pow_b() a few
>> days ago by translating the OS/360 assembler above into C. One
>> small change then made it a bit faster. You'll be able to see the
>> change easily in the code below.
>
> It was unclear to me why your small change should make any
> difference. Do you have a theory?

I know why it's faster, but that comes from looking at the
assembly code generated. I actually only made the change to
avoid the infinite-loop-with-break construct that comes
out of the direct translation of the OS/360 code. Here's the
code again:

static int ibmpow (int x, int y)
{
  int factor = 1;
  if (y < 0) return 0;

  if (y & 1) factor *= x;
  y >>= 1;

  while (y)
    {
      x *= x;
      if (y & 1) factor *= x;
      y >>= 1;
    }

  return factor;
}

One reason it is faster is that the generated code avoids the
first multiplication for odd y by replacing the if-statement
before the loop with this:

  if (y & 1) factor = x;
  y >>= 1;

But that's not everything. I tested by replacing the test loop
with

  for (y = 1; y < 32; y <<= 1)

And the ibmpow() version is still faster than pow_b().

I can see that the generated code for the loop in ibmpow() is
a bit better than that for pow_b(). That's interesting since
the statements are the same, only rearranged.

> Please compare your routine with the routine below based on the
> refinement suggested by Tim Woodall and the implementation by
> Eric Sosman:
>
> int tim_eric_pow(int x, unsigned n)
> {
> int t = 1;
> if (n == 0) return 1;
> while (n ^ 1) {
> n >>= 1; x *= x;
> }
> t = x;
> while (n >>= 1) {
> x *= x;
> if (n & 1) t *= x;
> }
>
> return t;
> }

I must have missed that one in the thread. After small changes so
that it conforms to the same interface as the others (signed n, check
for n < 0), it's the fastest so far:

$ time ./ipows
 
real 0m2.032s
user 0m1.980s
sys 0m0.000s

> The tail recursive form for this would be:
...
> But, you would have to use -O3 to get gcc to remove the tail recursion.

I avoided O3 because it inlined the power function, sometimes.

-Sheldon



Relevant Pages

  • Re: convert int[] to byte[]
    ... How the use of a BinaryWriter will avoid the need of a loop? ... >> convert each int to a byte and then write to the stream one by one. ...
    (microsoft.public.dotnet.languages.csharp)
  • Re: faster solution
    ... The following piece of code (a search algorithm) has to be evaluated ... int search ... xi and xppare independent of the while loop above. ... You can avoid the while loop completely for the cases where either of ...
    (comp.lang.c)
  • Re: Fridays the thirteenth. (And a little puzzle.)
    ... -- compiler) is the usual method ... int febdays ... -- We're going to go round a loop dealing with each year in turn. ... -- other languages call) ...
    (uk.people.silversurfers)
  • Re: C code is not generating required results.
    ... int main ... the array by the size of an element. ... prevent the user of your program from entering more characters than ... want to loop. ...
    (comp.lang.c)
  • Re: long double versions of functions in gcc under Cygwin
    ... rather than the nearest enclosing one) and a decent exception ... them it doesn't seem like goto usage would be affected ... int typfun() ... Why use a for loop when it is just a while loop in disguise? ...
    (comp.lang.c)