Re: Benchmarking under 16 bits.
- From: Herbert Kleebauer <klee@xxxxxxxxx>
- Date: Sun, 15 May 2005 00:06:32 +0200
randyhyde@xxxxxxxxxxxxx wrote:
> Herbert Kleebauer wrote:
> Even if it were the best way to learn assembly, it's not the best way
> to benchmark code. The execution environment of 16-bit segments is
> different than 32-bit segments and I would be very careful about
> measuring speed under one system and trying to apply those numbers to
> the other.
Why should I "try to apply those numbers to the other"? All I
said is that when I execute two inc.b instructions in a loop
10^9 times, the execution time for "inc.b b0, inc.b b1"
is much longer than for "inc.b b0, inc.b b4". The loop overhead
is different in 16 and 32 bit code because of the 32 bit operand-size
prefix necessary in 16 bit code, and because of the 32 bit address
the inc.b instruction in 32 bit code is longer, which could affect
the execution speed. But this doesn't change anything about the
statement: the execution time for "inc.b b0, inc.b b1"
is much longer than for "inc.b b0, inc.b b4".
> > It uses self modifying code to make sure, that the code is exactly
> > on the same position when executed with "inc b0 / inc b1" and
> > "inc b0 / inc b4" so the result is not affected by code alignment.
> > But that has nothing to do with 16 bit code, I do the same in the
> > 32 bit code below.
>
> Or, you could just run the program twice, with the opcodes changed.
Maybe, but you can't be sure that the program is loaded at exactly the
same memory location, which could affect execution time, so I prefer
self-modifying code.
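
For readers who want to try the comparison without an assembler, here is a minimal C sketch of the same measurement. It is not the original program: the names `counters` and `time_pair` are hypothetical, it uses one loop with runtime-selected offsets instead of self-modifying code so that both runs execute the same instructions at the same address, and the compiler may not emit a memory-operand inc, so the numbers need not match those quoted in this thread.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Sixteen byte counters. The union with uint32_t forces dword alignment,
   so bytes 0..3 share one dword and byte 4 starts the next one. */
static volatile union {
    uint32_t dwords[4];
    uint8_t  bytes[16];
} counters;

/* Increment counters.bytes[off_a] and counters.bytes[off_b] once per
   iteration and return the elapsed CPU time in seconds. The volatile
   qualifier stops the compiler from collapsing the loop into one add. */
static double time_pair(unsigned off_a, unsigned off_b, unsigned long iterations)
{
    assert(off_a < 16 && off_b < 16);

    volatile uint8_t *a = &counters.bytes[off_a];
    volatile uint8_t *b = &counters.bytes[off_b];
    *a = 0;
    *b = 0;

    clock_t start = clock();
    for (unsigned long i = 0; i < iterations; i++) {
        (*a)++;   /* first byte counter */
        (*b)++;   /* second byte counter, same or different dword */
    }
    clock_t end = clock();

    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_pair(0, 1, n)` and then `time_pair(0, 4, n)` with a large `n` (the thread uses 10^9) gives the two timings to compare. Whether they differ at all depends on the processor.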
> > It was executed in a DOS box in Win98 (but the obsolete program
> > will also run in XP).
>
> Using VM86? The environment is *different*. While I won't claim that
> this *will* produce different results, I certainly wouldn't claim that
> executing 16-bit code (that is, code running in a 16-bit segment model
> under VM) is going to produce identical results to native 32-bit code.
Nobody said that it produces "identical" results. But when in
a V86 program "inc.b b0, inc.b b1" is much slower than
"inc.b b0, inc.b b4", then this is also true for 32 bit code.
> > As 32 bit program I get about 8.8 s for "inc b0 / inc b1" and
> > about 5,5 s for "inc b0 / inc b4" (but the variance between
> > different runs is bigger than with the DOS version).
>
> Is that in a 16-bit segment, or a 32-bit segment?
If it used a 16 bit segment, then it would be a 16 bit
program (using 32 bit operand size and 32 bit addressing modes).
It is a normal 32 bit Windows PE console program.
> > The 32 bit source is nearly the same as the 16 bit source:
>
> But in a 16-bit segment, we know that the extra prefix bytes hurt you.
> If you simply claim "These are the results I get on processor XYZ when
> running in a 16-bit segment under virtual-86 mode" I wouldn't even
> *start* to question your numbers. I seriously doubt any modern
> processor attempts to optimize execution speed in virtual-86, 16-bit
You said: "If you can, you might try the same experiment under
Windows in 32-bit protected mode on the same system."
I posted the source code and the results for such a program, so
I don't understand your "But in a 16-bit segment, we know that
the extra prefix bytes hurt you".
As I already said, there may be a small difference in the
execution speed of an "inc.b [memory address]" in 16/32 bit code
because of the different instruction size in 16/32 bit code. But
this doesn't affect the statement: "inc.b b0, inc.b b1" is much
slower than "inc.b b0, inc.b b4" (as you can also see in the posted
result for the 32 bit Windows program).
As I also already said, this AMD processor is the only one where
I got such an effect. Nevertheless, this means that it can
be useful to align byte variables at dword addresses (wasting
three bytes of memory), not because it is faster to access
dword aligned byte variables, but because it is faster to access
two bytes which are not in the same dword.
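
The layout trade-off can be sketched in C as follows. This is only an illustration under assumptions: the struct names, the `pad` member and the `dword_index` helper are hypothetical, and the dword numbering holds only if each struct is placed at a dword-aligned address.

```c
#include <stddef.h>
#include <stdint.h>

/* Packed layout: both byte counters land in the same dword
   (assuming the struct starts at a dword-aligned address). */
struct packed_counters {
    uint8_t b0;      /* offset 0 */
    uint8_t b1;      /* offset 1, same dword as b0 */
};

/* Padded layout: three wasted bytes push the second counter
   into the next dword. */
struct padded_counters {
    uint8_t b0;      /* offset 0 */
    uint8_t pad[3];  /* offsets 1..3, unused */
    uint8_t b4;      /* offset 4, next dword */
};

/* Index of the 4-byte unit that contains a given byte offset. */
static size_t dword_index(size_t byte_offset)
{
    return byte_offset / 4;
}
```

Here `dword_index(offsetof(struct packed_counters, b1))` equals the index for `b0`, while in `struct padded_counters` the two counters fall into different dwords at the cost of three bytes.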