Re: Revised Index versus Subscript



"Robert" <no@xxxxxx> wrote in message
news:nvtlf3hflf9ispjjcjmha203ml6fa8s7h2@xxxxxxxxxx
On Wed, 26 Sep 2007 15:55:59 +0200, "Roger While" <simrw@xxxxxxxxxxxx>
wrote:

While the original is getting buried with OT.

I really, really tried to keep away from this subject but ...
One of the problems with the speed2 prog is the
attempt to deduce the perform cost.
Now OC produces exactly the C code that reflects
the statements, e.g.
/* speed2.cob:63: PERFORM */
{
  for (n0 = ((int)COB_BSWAP_32(*(int *)(b_18 + 30))); n0 > 0; n0--)
    {
      {
        /* speed2.cob:64: EXIT */
        {
          goto l_5;
        }
      }

      /* EXIT PERFORM CYCLE 5: */
      l_5:;
    }
}

(For those not informed, OpenCobol produces intermediate C code)

BUT gcc (in current versions) is far more
clever and deletes the whole thing :-)
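
To make the point concrete: gcc is allowed to drop a loop that has no
observable effect. What follows is only a minimal sketch at the C level,
not what cobc emits, and run_null_loop is a made-up name. One standard way
to force the iterations to happen is to make the counter volatile:

```c
/* Hypothetical helper, not part of OpenCOBOL or speed2.
   Runs an otherwise empty loop `count` times. Because `i` is volatile,
   every read and write of it must really be performed, so the compiler
   may not delete the loop. Returns the number of iterations executed. */
long run_null_loop(long count)
{
    volatile long i;
    long done = 0;

    for (i = count; i > 0; i--) {
        done++;
    }
    return done;
}
```

Note that a volatile counter is kept in memory rather than in a register,
so the time measured this way is probably an upper bound on loop control,
not an exact figure.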

Micro Focus did the same when the loop was empty or contained only CONTINUE.
I added exit perform cycle to stop it from optimizing out the null loop.
Since that didn't do the trick with OC, you should have found SOMEthing that
made it run the null loop repeat-count times. Failing to measure and then
subtract loop control time distorts the relative speeds reported.

Suppose loop control takes 5 time units, index takes 5 and subscript takes
10. The program should report that subscript takes twice as long (10/5).
Without discounting loop control, it will report 1.5 times as long (15/10).
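
The same arithmetic as a small C sketch. The function name and parameters
are mine, purely for illustration, and are not taken from speed2:

```c
/* Hypothetical helper: relative cost of test B versus test A after
   subtracting the time spent on loop control alone.
   t_a, t_b : total elapsed time of each test (loop control included)
   t_loop   : elapsed time of the null loop (pass 0 to skip the correction) */
double relative_cost(double t_a, double t_b, double t_loop)
{
    return (t_b - t_loop) / (t_a - t_loop);
}
```

With the numbers above the totals are 10 for index and 15 for subscript, so
relative_cost(10, 15, 5) gives 2.0, while relative_cost(10, 15, 0) gives the
distorted 1.5.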


Results from Linux boxen (in single-user mode)
(As all benchmarks should be done on 'nix systems)
(32 bit is P4 Prescott at 3.2 GHz)
(64 bit is P4 650 at 3.4 GHz)

MF SE 2.2 (Linux x86 32 bit)
cob -u -O -C notrunc -C sourceformat=free speed5.cob
cobrun speed5

Index start 2007092612363397+0200
Index end 2007092612363681+0200 (3.3 - .5) / .9 = 3.1
COMP start 2007092612363681+0200
COMP end 2007092612364047+0200 3.7 3.6
COMP-5 start 2007092612364047+0200
COMP-5 end 2007092612364361+0200 3.1 2.9

OC 0.33 current -
cobc -x -O2 -std=mf -free speed5.cob
./speed5

Index start 2007092612311407+0200
Index end 2007092612311690+0200 (2.8 - .5) / .9 = 2.6
COMP start 2007092612311690+0200
COMP end 2007092612312044+0200 3.5 3.3
COMP-5 start 2007092612312044+0200
COMP-5 end 2007092612312326+0200 2.1 1.8

You removed the null loop and computation of test time. The machines are
called COMPUTERS because they're good at COMPUTING. There's no reason to
make the human do it.

Above I subtracted an estimated .5 for loop control and adjusted for 9/10
difference in repeat factor. OC is 38% faster than MF for tests 1 and 3,
which use native integers, only slightly faster on test 2, which uses a
big-endian integer.
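
For anyone who wants the program to do that adjustment itself, a minimal C
sketch follows. adjusted_time and its parameter names are hypothetical; the
.5 and .9 are the estimates used in the figures above, not measured values:

```c
/* Hypothetical helper: remove an estimated loop-control time from a raw
   elapsed time, then scale for a different repeat count.
   raw           : measured elapsed seconds
   loop_overhead : estimated seconds spent on loop control (e.g. 0.5)
   repeat_ratio  : this run's repeat count / reference repeat count (e.g. 0.9) */
double adjusted_time(double raw, double loop_overhead, double repeat_ratio)
{
    return (raw - loop_overhead) / repeat_ratio;
}
```

adjusted_time(3.3, 0.5, 0.9) comes out at about 3.1, matching the MF Index
line.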

Now, as to what has been said in this thread, I have the
following comments -
COMP (aka BINARY) is stored as big-endian by all
compilers these days.
Therefore there is a penalty on little-endian machines
(or better, the OS/firmware, e.g. for bi-endian machines) to
byte-swap, operate and re-byteswap results.

The byte swap can be done with a single instruction: BSWAP. GCC has a very
smart optimizer, yet doesn't seem to be using that instruction. I would
inspect generated code to see what's going on.

My guess would be it's doing the byte swap in C rather than machine
language. Something like
(((LongNumber&0x000000FF)<<24)+((LongNumber&0x0000FF00)<<8)+
((LongNumber&0x00FF0000)>>8)+((LongNumber&0xFF000000)>>24))
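
For comparison, here are the two approaches side by side in C. This is a
sketch only: I have not checked how COB_BSWAP_32 is actually defined, the
function names are invented, and __builtin_bswap32 is a GCC/Clang builtin
that may not exist in older compilers (hence the fallback):

```c
#include <stdint.h>

/* Portable shift-and-mask byte swap, written out in plain C. */
uint32_t bswap32_portable(uint32_t x)
{
    return ((x & 0x000000FFu) << 24) |
           ((x & 0x0000FF00u) << 8)  |
           ((x & 0x00FF0000u) >> 8)  |
           ((x & 0xFF000000u) >> 24);
}

/* Same operation through the compiler builtin where available; on x86
   the compiler can implement this with the BSWAP instruction. */
uint32_t bswap32_builtin(uint32_t x)
{
#if defined(__GNUC__) || defined(__clang__)
    return __builtin_bswap32(x);
#else
    return bswap32_portable(x);  /* fallback when the builtin is missing */
#endif
}
```

Comparing the assembler output of the two (gcc -S) would show whether the
optimizer recognises the shift-and-mask form on a given version.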

This, of course, affects eg. x86(_64).
However, see below

Alignment -
There are in fact not that many alignment tolerant machines out there.
Intel x86(_64) and Power PC are known. (The Itanium is not)
This means that any reference to a COMP/COMP-5 item must
be moved to an intermediate area unless it can be proved at compile
time that it is appropriately aligned. (eg. at 01 level)
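
In C terms, "moved to an intermediate area" looks roughly like the sketch
below. The function names are hypothetical and this is not the code any
particular COBOL compiler generates; it only shows the technique:

```c
#include <stdint.h>
#include <string.h>

/* Read a 32-bit native-order (COMP-5 style) value from an address that
   may be misaligned, by copying it into an aligned temporary first. */
int32_t load_native32(const unsigned char *p)
{
    int32_t tmp;

    memcpy(&tmp, p, sizeof tmp);
    return tmp;
}

/* Read a 32-bit big-endian (COMP style) value from an address that may
   be misaligned, assembling it one byte at a time. */
int32_t load_be32(const unsigned char *p)
{
    uint32_t v = ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
                 ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];

    return (int32_t)v;
}
```

On alignment tolerant hardware a compiler can usually reduce the memcpy to
a plain load, so the extra cost mainly shows up on strict machines.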

That's why the original had SYNC on critical integers.

So we have to look at the intersection of the above two attributes.
Generally speaking, for performance, (other than INDEX)
one should use COMP-5 (aka BINARY-LONG SIGNED/UNSIGNED)
for subscripts/counters etc. and define them at the 01 level.

Not only that, a particular compiler implementation has its
own INDEX definition which is somewhat difficult to ascertain.
(And which is not necessarily a COMP-5 item)

It's EASY to ascertain. Set a data item of type index to the table-oriented
index, redefine as comp-5 and display.

Really ?
Ever heard about 64 bit index?
How do you redefine (compatibly) an index ?

Roger

