Influence of local code modifications
- From: Tim Frink <plfriko@xxxxxxxx>
- Date: Mon, 31 Mar 2008 07:08:12 +0000 (UTC)
Hi,
Maybe you have some ideas on how to cope with this problem:
I'm trying to optimize assembler code for a complex DSP, an
in-order-issue superscalar processor with an integer pipeline and a
load/store pipeline, so at best two instructions can be issued
simultaneously, one per pipeline. Before instructions (which may be
16 or 32 bits wide) are decoded, they are fetched into a 64-bit fetch
buffer. The fetch is therefore aligned to 8-byte addresses, and if an
instruction at a branch target crosses an 8-byte boundary
(misalignment), the processor stalls for additional cycles (an extra
transfer-of-control penalty). The DSP also has an instruction cache,
which complicates things further: multiple instructions are read
from the cache at once, and they may again span multiple cache lines,
costing extra cycles.
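To make the alignment rule concrete, here is a minimal Python sketch
of the check described above. It assumes only what the text states
(an 8-byte fetch window and 2- or 4-byte instructions); the function
names and the 2-byte padding unit are hypothetical, and the real
penalty model of the DSP is not captured.

```python
FETCH_BYTES = 8  # 64-bit fetch buffer, as described in the post


def crosses_fetch_boundary(addr, size, fetch_bytes=FETCH_BYTES):
    """Return True if an instruction of `size` bytes at `addr`
    spans two aligned fetch windows."""
    first_window = addr // fetch_bytes
    last_window = (addr + size - 1) // fetch_bytes
    return first_window != last_window


def padding_to_avoid_split(addr, size, unit=2, fetch_bytes=FETCH_BYTES):
    """Smallest padding (a multiple of `unit` bytes, an assumed
    minimum instruction size) that keeps the instruction inside a
    single fetch window."""
    for pad in range(0, fetch_bytes, unit):
        if not crosses_fetch_boundary(addr + pad, size, fetch_bytes):
            return pad
    raise ValueError("no padding below one fetch window fixes the split")
```

For example, a 32-bit instruction at address 0x1006 occupies bytes
0x1006 to 0x1009 and therefore straddles the boundary at 0x1008;
two bytes of padding move it to 0x1008, where it fits in one window.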
My optimization moves basic blocks (selected by some cost functions)
from the slow main memory to a small but fast memory, so that these
particular blocks can be accessed quickly. However, I have serious
problems with the "optimized" code. The moved blocks benefit from the
faster memory, but moving them changes the addresses of the
subsequent instructions. Sometimes adding a single instruction, which
shifts the addresses of the following code, is enough to cause
significant runtime changes. The reasons are newly misaligned jump
targets and differently loaded fetch buffers, which change how the
superscalar pipelines are filled; the effect on the total program
runtime may be positive or negative.
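The ripple effect can be illustrated with a small Python sketch. It
is a toy model under stated assumptions: instructions are laid out
back to back from address 0, the fetch window is 8 bytes, and the
instruction sizes and branch-target positions are made up for the
illustration. The helper names are hypothetical.

```python
def layout(sizes, base=0):
    """Assign consecutive start addresses to instructions of the
    given sizes (in bytes), starting at `base`."""
    addrs = []
    addr = base
    for size in sizes:
        addrs.append(addr)
        addr += size
    return addrs


def split_targets(sizes, targets, base=0, fetch_bytes=8):
    """Return the indices in `targets` whose instruction crosses an
    aligned fetch-window boundary."""
    addrs = layout(sizes, base)
    return [
        t for t in targets
        if addrs[t] // fetch_bytes != (addrs[t] + sizes[t] - 1) // fetch_bytes
    ]


# Made-up code sequence: instruction sizes in bytes, branch targets by index.
sizes_before = [4, 4, 4, 2, 4]
targets_before = [1, 4]

# Insert one 2-byte instruction at the front; all later indices shift by one.
sizes_after = [2] + sizes_before
targets_after = [t + 1 for t in targets_before]

print(split_targets(sizes_before, targets_before))  # [4]
print(split_targets(sizes_after, targets_after))    # [2]
```

In this toy case the inserted instruction repairs one misaligned
target (the old index 4 moves from address 14 to 16) and breaks
another (the old index 1 moves from address 4 to 6), so the net
effect depends on how often each target is actually executed.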
Thus, my problem is that I can achieve a local optimization for the
moved blocks, but the resulting global effect is not predictable and
might cancel out the benefits or even degrade the runtime of the
program.
How do compiler developers cope with this problem? Are there any
approaches that make it possible to predict the influence of a local
code optimization on the global code performance for complex
processors?
Regards,
Tim