Re: Non-restoring binary square root and convergence

From: Clint Olsen (clint_at_0lsen.net)
Date: 03/11/05


Date: Fri, 11 Mar 2005 16:12:32 -0600

Sorry it took so long to followup on this. I have an implementation which
works but exhibits the behavior I described in the original post (non-zero
difference in even input).

Notes:

Compiling:

% cc -o sqrt -Wall -pedantic -g sqrt.c

If you don't want to see the debug messages, use -DNDEBUG as well.

About the code:

The code is written so it doesn't make any assumptions about the width of
an unsigned long. It estimates this by taking the width of character and
then multiplying by the sizeof the parameterized type: sqrt_t. I made no
attempt to calculate the value bits of an unsigned long (see the ANSI C
spec if you don't know what this means). It's not really necessary for
this experiment, in real life this would produce the correct result for
width. The odd initialization of shift was to account for some oddball
machine that had an odd number of bits in an unsigned long.

I took the algorithm from "Algorithms for Extracting Square Roots and Cube
Roots", Peng, 1981.

Running:

% sqrt 9 16

Usually you want to perform 16 iterations to produce a correct result for a
machine with 32-bit unsigned longs. However, for the purposes of
generating psuedo-random sequences on primes, I was interested in the
behavior if you specified something more arbitrary, like:

% sqrt 2 40

I'd appreciate any comments about correctness and efficiency if you have
any.

Thanks,

-Clint

#include <stdio.h>
#include <stdlib.h>
#include <limits.h>

typedef unsigned long sqrt_t;

int main(int argc, char *argv[])
{
    sqrt_t a; /* target of sqrt */
    sqrt_t root = 0; /* root */
    sqrt_t diff; /* running difference */
    sqrt_t a_i, b_i;
    unsigned width = CHAR_BIT * sizeof(sqrt_t);
    int shift = width & 1 ? width - 1 : width - 2;
    int i;

    if (argc == 3) {
        a = strtoul(argv[1], NULL, 10);
        i = strtoul(argv[2], NULL, 10);

        /* pass 1 */
        a_i = a >> shift & 3;
        
        diff = a_i - 1;
        
        if (diff < a_i) /* overflow check */
            root = 1;
        
        for (shift -= 2, --i; i > 0; i--, shift -= 2) {
        
            if (shift >= 0) {
                a_i = a >> shift & 3;
            } else {
                a_i = 0;
            }
        
            b_i = diff << 2 | a_i;
        
            if (root & 1)
                diff = b_i - (root << 2 | 1);
            else
                diff = b_i + (root << 2 | 3);
        
            root <<= 1;
        
            if (diff < b_i)
                root |= 1;
        
            #ifndef NDEBUG
            printf("root: %lu, d: %lu, shift: %d\n", root, diff, shift);
            #endif
        }
        printf("%lu\n", root);
    } else {
        fprintf(stderr, "usage: %s <target> <iterations>\n", argv[0]);
    }

    return EXIT_SUCCESS;
}



Relevant Pages

  • Re: Non-restoring binary square root and convergence
    ... I took the algorithm from "Algorithms for Extracting Square Roots and Cube ... % sqrt 9 16 ... int main ...
    (sci.math)
  • Re: C++ more efficient than C?
    ... If you want to do language comparisons, ... complicated algorithm, though). ... int compare ... not been in C's standard library. ...
    (comp.lang.c)
  • Re: Mersenne Twister -- A Revised C++ Implementation
    ... "That algorithm is a slight elaboration on the basic linear congruential ... takes an int, and an int on the target you seem to be developing on -- ... unless you qualify it by stressing that you ... compilation to pick the appropriate type in a more general manner. ...
    (comp.lang.cpp)
  • Re: dijkstra algorithm issue
    ... I am just trying to implement Dijkstra's algorithm but i am not sure ... int n_vertices; ... draw a graph on paper,consider its adjacency matrix ... then compute by hand the shortest path you want on paper, ...
    (comp.programming)
  • Re: Second TI-Nspire report
    ... without sqrt in the denominator of the matrix elements? ... extension of Q containing all the sqrt, ... should take around 10 minutes with the current Xcas algorithm. ... Or maybe mathematica has a better algorithm, ...
    (comp.sys.hp48)