Re: Extended-Precision Division



Matt wrote:

I am ultimately looking for a way to perform a 96-bit / 64-bit divide. This
is pretty simple to implement via a pair of 64-bit / 64-bit divides, so this
is the problem that I am focusing on at the moment. I know the basic
shifting implementation, but that is excruciatingly slow. I am estimating
that it would be upwards of 300 cycles for the 64-bit / 64-bit divide, so the full 96-bit / 64-bit divide would be around 600 cycles.


Isn't there a faster way?

'There is always a faster way'

(Probably written by Mike Abrash)

First, do you have a 64-bit cpu, like a recent AMD or Intel?

If so, the problem is trivial, since the x86 DIV opcode and register assignements are designed for exactly this sort of stuff:

(I don't have the 64-bit instruction manuals here, so I'll show the solution for a 48 / 32 divide using a 32-bit cpu)

; Divide a [48 bits] by b [32 bits], returning the result in EAX
; and the remainder in EDX. This will fail if the actual result
; doesn't fit in 32 bits!

  movzx eax, word ptr a[4]
  xor edx,edx
  mov ebx,[b]
  div ebx
; At this point in time EDX contains the remainder, and EAX
; should hopefully be zero:

;  mov [overflow],eax	; Keep this if you want an overflow indication
  mov eax, dword ptr a[0]
  div ebx

That's it!

Doing 64-bit divides on a 32-bit machine is a _lot_ harder, so several workarounds might be employed:

1) Solve it by integer DIVs using successive approximation:
  a) Shift the divisor up so that it has no leading zero bits.
     Remember the shift count
     Worst case this doesn't save any bits at all.
  b) Divide the top 64 (or 63 to avoid overflow) bits by this
     32-bit number, then back-multiply using the full 64-bit
     divisor, and subtract the result from the original
     96-bit number.

  c) Repeat (b) one more time, and you should be done.

2) Do the same as (1), but start by calculating a reciprocal divisor that you can use to multiply by instead.

3) Load both the divisor and dividend into fp registers, calculate an approximate answer, then convert this back to a 64-bit integer value.

Do a single back-multiply and subtract to decide if you need to repeat the process, or if you can get away with a single-bit adjustment.

Terje
--
- <Terje.Mathisen@xxxxxxxxxxxxx>
"almost all programming can be viewed as an exercise in caching"

.



Relevant Pages

  • Re: Wieferich primes and a wikipedia statement
    ... > one direction, that this implies p is Wieferich, in my prior ... And assumed that q is an integral divisor of p-1, ... p-1 and then follows q divides p-1 instead and thus p² ...
    (sci.math)
  • Re: random ints on symmetric interval
    ... divides the appropriate generator value, ... Later you subtract i_max, ... The PDF of the difference is tri ...
    (comp.lang.fortran)
  • Re: FLT : (2x+1)^p + 2^p y^p = (2y+1)^p
    ... quasi wrote: ... In any case, I can see for that any odd prime factor, ... Now, if 2^k exactly divides z - r for odd z,r, ... any divisor of y equals 1 mod p and y must ...
    (sci.math)
  • Re: Newbie question(s)...
    ... > contradiction; assume that for a prime number, ... numbers such that the greatest common divisor of n an m is 1. ... also be the case that p divides n (because the exponent for each prime ...
    (sci.crypt)
  • Re: FLT : (2x+1)^p + 2^p y^p = (2y+1)^p
    ... By Lagrange's Theorem any non-intrinsic prime factor of the RHS must ... In any case, I can see for that any odd prime factor, q say, of the ... Now, if 2^k exactly divides z - r for odd z,r, then no higher ... any divisor of y equals 1 mod p and y must equal 1 mod p. ...
    (sci.math)