Branch-less, loop-less "Move" implementations for MMs
- From: Eric Grange <egrangeNO@xxxxxxxxxxxxxxx>
- Date: Fri, 29 Apr 2005 08:50:20 +0200
Anybody interested in starting a sub-challenge for a "Move" implementation dedicated to memory managers, that would have neither branches nor loops?
I'm thinking of having code generated at runtime (depending on per-CPU and per-size-range templates) to assemble procedures that would be specialized in copying exactly 16, 32, 48, etc. bytes, kinda like
Move16
  movaps xmm0, [eax+0]
  movaps [edx+0], xmm0
  ret

Move32
  movaps xmm0, [eax+0]
  movaps xmm1, [eax+16]
  movaps [edx+0], xmm0
  movaps [edx+16], xmm1
  ret
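To make the template idea a bit more concrete, here is a rough sketch of how such listings could be expanded for any multiple of 16 bytes. It only emits assembly text, not machine code. The name emit_move, the max_regs parameter, and the "all loads, then all stores" batching (extrapolated from Move32 above) are assumptions for illustration. The register roles (eax = source, edx = destination) follow the listings above, and the default of 8 registers reflects xmm0..xmm7 being available in 32-bit mode.

```python
def emit_move(size: int, max_regs: int = 8) -> list[str]:
    """Return an assembly listing for a straight-line copy of exactly `size` bytes.

    Hypothetical helper: mirrors the Move16/Move32 pattern, with eax as the
    source pointer and edx as the destination pointer.
    """
    if size <= 0 or size % 16 != 0:
        raise ValueError("size must be a positive multiple of 16")

    lines = [f"Move{size}"]
    offsets = list(range(0, size, 16))

    # Work through the block in batches of at most max_regs 16-byte chunks:
    # first load the whole batch into registers, then store it back out.
    for start in range(0, len(offsets), max_regs):
        batch = offsets[start:start + max_regs]
        for reg, off in enumerate(batch):
            lines.append(f"  movaps xmm{reg}, [eax+{off}]")
        for reg, off in enumerate(batch):
            lines.append(f"  movaps [edx+{off}], xmm{reg}")

    lines.append("  ret")
    return lines


if __name__ == "__main__":
    print("\n".join(emit_move(48)))
```

The loop lives in the generator only; the emitted procedure itself is straight-line code. Whether batching loads before stores is actually the best ordering on a given CPU is exactly what would need to be benchmarked.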
For all the MMs with fixed-size blocks arranged in pages/sheets, the relevant Move could be assembled (at an adequately aligned address) and then referenced directly from the page management record (via an indirect call, which AFAIK is correctly pipelined). The templates need not be limited to 16-byte alignments; they could also cover 8- or 4-byte alignments for the MMs that use these, and would essentially be targeted at small transfers (loop overhead being negligible when you copy around thousands of kB).
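As a loose analogy for the page-record idea, the sketch below generates a loop-free copy routine at runtime and stores a reference to it in a per-page record, so that moving a block is a single indirect call. It uses Python source generation in place of real machine code. All names here (make_move, PageRecord, new_page) are hypothetical, and the chunk parameter merely stands in for the 16/8/4-byte alignment templates mentioned above.

```python
from dataclasses import dataclass
from typing import Callable

# Signature of a generated routine: (src, src_offset, dst, dst_offset)
MoveProc = Callable[[bytearray, int, bytearray, int], None]


def make_move(block_size: int, chunk: int = 16) -> MoveProc:
    """Generate a straight-line routine copying exactly block_size bytes."""
    if chunk not in (4, 8, 16):
        raise ValueError("chunk must be 4, 8 or 16")
    if block_size <= 0 or block_size % chunk != 0:
        raise ValueError("block_size must be a positive multiple of chunk")

    # One statement per chunk: the generated function has no loop or branch.
    body = [
        f"    dst[d + {off}:d + {off + chunk}] = src[s + {off}:s + {off + chunk}]"
        for off in range(0, block_size, chunk)
    ]
    source = "def move(src, s, dst, d):\n" + "\n".join(body) + "\n"

    namespace: dict = {}
    exec(source, namespace)  # compile the generated source at runtime
    return namespace["move"]


@dataclass
class PageRecord:
    """Hypothetical page management record for one fixed block size."""
    block_size: int
    move: MoveProc  # specialized routine, invoked through the record


def new_page(block_size: int, chunk: int = 16) -> PageRecord:
    return PageRecord(block_size, make_move(block_size, chunk))


if __name__ == "__main__":
    page = new_page(48)
    src = bytearray(range(64))
    dst = bytearray(64)
    page.move(src, 0, dst, 8)  # indirect call through the page record
    print(dst[8:56] == src[0:48])
```

In a real MM the record would hold the address of the assembled procedure rather than a Python callable, but the lookup path is the same: the page already knows its block size, so no size dispatch is needed at the call site.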
The benefit would be "optimal" moves with no loop/branch overhead, from a smaller codebase (when compared to manually unrolling the moves), and call points with guaranteed alignments. The various segregated-block MMs each have a rather limited variety of block sizes, yet those sizes vary across MMs and change when tweaking an MM, so generating the code automatically could be helpful.
Work would thus essentially focus on identifying the most efficient instruction patterns for given transfer sizes/CPU combinations.
Eric