Re: Assemblers are for hiding your work, not for faster code.



Hi John,

... which was both bad and customer-visible.

I have also found that this combination won't go away by itself
in instrument making; eventually we have to act on it :-).

I tested it in PowerBasic (using 64-bit integer variables) and it does
seem to work. 4 lines of real code, not bad.

I am not quite sure what your assembler does, but the second "divu"
line seems suspicious to me (it looks like you just overwrite the
remainder from the first div and divide d1 instead).

The CPU32 does indeed have nice arithmetic, no doubt. The 32-bit
PPC also has all of it except the 64/32 division mentioned above...
Well, the 32/32 takes an extra multiply and subtract to get the
remainder, but those are single-cycle operations... and IIRC the divide
was something like 17 cycles, and that at hundreds of MHz. This is how
it goes:

00000000 7D66 5B78 7D6B 4B96 divul.l- d1,d2:d3
00000000: 7D665B78 or r11,r11,r6 dividend extra copy
00000004: 7D6B4B96 divwu r9,r11,r11 actual divide - quotient in d3 (r11)
00000008: 7CE959D6 mullw r9,r11,r7 quotient*divisor in r7
0000000C: 7D473050 subf r7,r6,r10 remainder in d2 (or r10)
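
For readers who do not follow the listing, the same idea can be sketched in C: one unsigned divide for the quotient, then a multiply and a subtract to recover the remainder. This is only an illustration; the type and function names are made up for the example, and the caller is assumed to pass a nonzero divisor.

```c
#include <stdint.h>

/* Quotient and remainder of an unsigned 32/32 divide, obtained the way
   the listing does it: one divide, then multiply and subtract.
   divrem32_t and divrem32 are hypothetical names for this sketch. */
typedef struct {
    uint32_t quot;
    uint32_t rem;
} divrem32_t;

static divrem32_t divrem32(uint32_t dividend, uint32_t divisor)
{
    divrem32_t r;
    r.quot = dividend / divisor;           /* divide gives the quotient only */
    r.rem  = dividend - r.quot * divisor;  /* multiply + subtract: remainder */
    return r;
}
```

For example, divrem32(100, 7) yields quotient 14 and remainder 2.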

Dimiter

------------------------------------------------------
Dimiter Popoff Transgalactic Instruments

http://www.tgi-sci.com
------------------------------------------------------

John Larkin wrote:
On 10 Jan 2007 09:03:22 -0800, "Didi" <dp@xxxxxxxxxxx> wrote:


But wait to see the div 64/32 - the PPC has no such thing, so I had to
do it like this:

00000008 38C0 0020 7CC9 03A6 divu.l- d1,d2:d3 - source line, resulting in:
00000008: 38C00020 addi 32,r0,r6
0000000C: 7CC903A6 mtspr r6,288
00000010: 5526F802 rlwinm r9,31,0,1,r6
00000014: 5527F87E rlwinm r9,31,1,31,r7
00000018: 7C865810 subfc r6,r11,r4
0000001C: 7CA75110 subfe r7,r10,r5
00000020: 7D000400 mcrxr 2
00000024: 418A0014 brc 12,10,5
00000028: 554A083C rlwinm r10,1,0,30,r10
0000002C: 516A0FFE rlwimi r11,1,31,31,r10
00000030: 556B083C rlwinm r11,1,0,30,r11
00000034: 48000014 b 5
00000038: 54AA083C rlwinm r5,1,0,30,r10
0000003C: 508A0FFE rlwimi r4,1,31,31,r10
00000040: 548B083C rlwinm r4,1,0,30,r11
00000044: 396B0001 addi 1,r11,r11
00000048: 421FFFD0 brc 16,31,-12
.....
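
A 64/32 divide on a machine with only 32-bit registers is typically done with a 32-step shift-and-subtract loop. The C below is a generic sketch of that technique (restoring division), not a line-by-line translation of the listing above; the function name and the error convention are assumptions made for the example.

```c
#include <stdint.h>

/* Divide the 64-bit value hi:lo by a 32-bit divisor, one bit per pass.
   Returns 0 on success, -1 if the divisor is zero or the quotient would
   not fit in 32 bits (hi >= divisor). div64by32 is a hypothetical name. */
static int div64by32(uint32_t hi, uint32_t lo, uint32_t divisor,
                     uint32_t *quot, uint32_t *rem)
{
    if (divisor == 0 || hi >= divisor)
        return -1;

    for (int i = 0; i < 32; i++) {
        uint32_t carry = hi >> 31;        /* bit shifted out of hi */
        hi = (hi << 1) | (lo >> 31);      /* shift hi:lo left by one */
        lo <<= 1;
        if (carry || hi >= divisor) {
            hi -= divisor;                /* wraps correctly when carry is set */
            lo |= 1u;                     /* record a quotient bit */
        }
    }
    *quot = lo;   /* quotient bits accumulated in lo */
    *rem  = hi;   /* what is left in hi is the remainder */
    return 0;
}
```

With hi = 1, lo = 0 and divisor = 10 (that is, 2^32 / 10) this gives quotient 429496729 and remainder 6.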


I just did this today, a little thing to divide a 64-bit integer by
10. I had been using a partial-product multiply by 0.1, which worked
fine for most of my data (time delays) but rounded a teeny bit for DDS
frequencies, which was both bad and customer-visible. Setting f=10 MHz
gave f=10.00000001 MHz!
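
The underlying problem is that 0.1 has no exact binary representation, so any fixed-point "multiply by 0.1" is slightly off for some inputs. The C below is only an illustration of that effect with a truncated 32-bit reciprocal; it is not the partial-product routine described above, and the constant and function name are assumptions made for the example.

```c
#include <stdint.h>

/* floor(2^32 / 10): the nearest 32-bit fixed-point value below 0.1. */
#define TENTH_Q32 429496729u

/* Approximate x / 10 as (x * 0.1) in 32.32 fixed point, truncating.
   mul_tenth_truncated is a hypothetical name for this sketch. */
static uint32_t mul_tenth_truncated(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * TENTH_Q32) >> 32);
}
```

Here mul_tenth_truncated(100) returns 9 rather than 10, because the reciprocal is a hair too small; a true divide has no such error.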


So I think this will work:

; DIVIDE D0:D1 BY 10, EXACTLY

D10: MOVEM.L D3 D4, -(SP) ; SAVE SCRATCHPADS

MOVE.L # 10, D4 ; IS COMRADE CONSTANT!
CLR.L D3 ; TREAT HI AS 64-BIT INT
DIVU.Q D4, D3:D0 ; D0 = HI/10 D3 = REM
DIVU.Q D4, D3:D1 ; DIVIDE REM:LO BY 10, QUO IN D1

MOVEM.L (SP)+, D3 D4 ; RETURN SCRATCH REGS
RTS
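
The routine above reads as schoolbook long division with 32-bit "digits": divide the high word by 10, then divide remainder:low by 10. A C sketch of the same two steps follows, assuming that reading of the DIVU.Q lines is right; the function name is hypothetical, and a 64-bit C type stands in for the 64/32 hardware divide.

```c
#include <stdint.h>

/* Divide the 64-bit value hi:lo by 10 in place and return the remainder.
   div64_by10 is a hypothetical name for this sketch. */
static uint32_t div64_by10(uint32_t *hi, uint32_t *lo)
{
    uint32_t q_hi = *hi / 10u;                  /* first divide: HI / 10 */
    uint32_t r    = *hi % 10u;                  /* remainder, always < 10 */
    uint64_t part = ((uint64_t)r << 32) | *lo;  /* REM:LO as one 64-bit value */

    *hi = q_hi;
    *lo = (uint32_t)(part / 10u);   /* second divide; fits since r < 10 */
    return (uint32_t)(part % 10u);  /* final remainder */
}
```

Because the first remainder is less than 10, the second quotient always fits in 32 bits, so the result is exact with no rounding.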


I tested it in PowerBasic (using 64-bit integer variables) and it does
seem to work. 4 lines of real code, not bad.


John
