Re: Optimizing Assembler Code
- From: Frank <spamtrap@xxxxxxxxxx>
- Date: Mon, 28 Jul 2008 07:53:00 -0700 (PDT)
On 14 Jun., 15:02, Frank <spamt...@xxxxxxxxxx> wrote:
The problem is that when you have to access a variable you are stalling
the instruction fetch pipeline and the speculative execution of modern
CPUs.
If you have a case which is mapped to binary search jumps (and only
larger switches with dense labels are mapped into jump tables) then all
these modern features are working well.
What does "you are stalling the fetch pipeline" mean? And which modern
features apply when one does a binary search jump? Could you give me a
reference or elaborate on this?
Thanks a lot.
Frank
On 14 Jun., 20:02, scholz.lothar <spamt...@xxxxxxxxxx> wrote:
On 14 Jun., 15:02, Frank <spamt...@xxxxxxxxxx> wrote:
using 'computed goto' (goto *variable) is slower than routing via
a switch statement (switch(variable) { case 1: goto LABEL_1; ... })
I would need some support from someone who has extensive assembler
experience, in order to understand what is going on.
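For readers who have not seen the two forms side by side, here is a minimal
sketch of the same toy interpreter written once with a switch and once with
computed goto. The opcodes, the function names and the DISPATCH macro are
made up for illustration and are not taken from the original program.
Computed goto is the GCC "labels as values" extension (Clang accepts it too),
so this is not standard C. The sketch only shows the two shapes; it says
nothing about which one is faster on a given CPU.

```c
#include <assert.h>

/* Hypothetical opcodes for a toy accumulator machine (illustration only). */
enum { OP_INC = 0, OP_DEC = 1, OP_DOUBLE = 2, OP_HALT = 3 };

/* Variant 1: dispatch through a switch. The compiler decides whether this
   becomes a chain of compares or a jump table. */
int run_switch(const unsigned char *code)
{
    int acc = 0;
    for (;;) {
        switch (*code++) {
        case OP_INC:    acc += 1; break;
        case OP_DEC:    acc -= 1; break;
        case OP_DOUBLE: acc *= 2; break;
        case OP_HALT:   return acc;
        default:        return -1;   /* unknown opcode */
        }
    }
}

/* Variant 2: computed goto (GCC/Clang extension). The branch target is
   loaded from an array of label addresses and jumped to indirectly. */
int run_computed_goto(const unsigned char *code)
{
    static void *const targets[] = { &&do_inc, &&do_dec, &&do_double, &&do_halt };
    int acc = 0;

/* Fetch the next opcode, check its range, jump to its label. */
#define DISPATCH() \
    do { unsigned char op = *code++; assert(op <= OP_HALT); goto *targets[op]; } while (0)

    DISPATCH();
do_inc:    acc += 1; DISPATCH();
do_dec:    acc -= 1; DISPATCH();
do_double: acc *= 2; DISPATCH();
do_halt:   return acc;
#undef DISPATCH
}
```

Both functions compute the same result for the same opcode sequence, so
compiling them with "gcc -O2 -S" and comparing the generated assembler is one
way to see what the compiler made of each form.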
How many labels do you have in the switch statement?
The problem is that when you have to access a variable you are stalling
the instruction fetch pipeline and the speculative execution of modern
CPUs.
If you have a case which is mapped to binary search jumps (and only
larger switches with dense labels are mapped into jump tables) then all
these modern features are working well.
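To make the distinction above concrete, here is a rough sketch of the two
code shapes for a four-way dispatch. The handler functions and the names
dispatch_tree and dispatch_table are hypothetical. The first form uses only
compares and conditional branches, which is roughly what "binary search
jumps" refers to. The second loads the target address from memory and calls
through it, which is the jump-table shape. Which form a compiler actually
emits for a given switch depends on the compiler and its options.

```c
#include <assert.h>

/* Hypothetical handlers, one per key (illustration only). */
static int handle0(void) { return 10; }
static int handle1(void) { return 11; }
static int handle2(void) { return 12; }
static int handle3(void) { return 13; }

/* Binary-search dispatch: at most two compares, every branch target is
   fixed in the instruction stream, nothing is loaded from a table. */
int dispatch_tree(unsigned key)
{
    assert(key < 4);
    if (key < 2) {
        if (key == 0) return handle0();
        return handle1();
    }
    if (key == 2) return handle2();
    return handle3();
}

/* Table dispatch: the target address is data, read from memory before
   the indirect call can be resolved. */
int dispatch_table(unsigned key)
{
    static int (*const table[4])(void) = { handle0, handle1, handle2, handle3 };
    assert(key < 4);
    return table[key]();
}
```

Both functions return the same value for every key from 0 to 3; only the
way control reaches the handler differs.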
This is one of the reasons for SmartEiffel to be faster than C++. It's
using unfolded ifs instead of virtual method tables for virtual function
calls.
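As an illustration of that idea, and not of the code SmartEiffel actually
generates, the following sketch shows the same "area" operation done once
through a table of function pointers (the way a C++ virtual call is usually
implemented) and once through a type tag tested with plain ifs. All struct,
enum and function names are invented for this example.

```c
#include <assert.h>

/* Variant 1: virtual-table style. Each object points to a table of
   function pointers; the call goes through two loads and an indirect call. */
struct shape_v;
struct shape_vtable { int (*area)(const struct shape_v *self); };
struct shape_v { const struct shape_vtable *vt; int w, h; };

static int rect_area_v(const struct shape_v *s) { return s->w * s->h; }
static int tri_area_v(const struct shape_v *s)  { return s->w * s->h / 2; }

static const struct shape_vtable RECT_VT = { rect_area_v };
static const struct shape_vtable TRI_VT  = { tri_area_v };

int area_virtual(const struct shape_v *s)
{
    return s->vt->area(s);
}

/* Variant 2: "unfolded ifs". The object carries a type tag and the
   dispatch is a chain of compares with directly known targets. */
enum shape_kind { KIND_RECT, KIND_TRI };
struct shape_t { enum shape_kind kind; int w, h; };

int area_unfolded(const struct shape_t *s)
{
    if (s->kind == KIND_RECT)
        return s->w * s->h;
    assert(s->kind == KIND_TRI);   /* only two kinds exist in this sketch */
    return s->w * s->h / 2;
}
```

The if-chain form requires that all possible types are known when the
dispatch code is generated, which is an assumption this sketch makes by
listing exactly two kinds.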
Remember that during a second-level cache miss you have about 200 cycles
to wait, in which the CPU could otherwise do 400-600 operations. That's
a lot.
For everything else, use VTune to go into the details.