Re: MMX speedup for Floyd Steinberg error diffusion



Robert Redelmeier wrote:
rep_movsd <spamtrap@xxxxxxxxxx> wrote in part:
The thing is Floyd-Steinberg dithering works serially
one pixel at a time... Each pixel processed affects the
pixel to its right and the pixels on the next scanline.

Hmm ... this sounds a bit like the game of LIFE .
It has seen some asm optimization that might be reelvant.

Indeed.

I was sure that I had the victory to "Dr Dobbs Second Annual Code Optimization Challenge" all sewn up:

I had very code data compression, I worked on four cells in parallel, and I used the minimum possible (i.e. 3) updates to modify the surrounding cells when anything changed.

David Stafford didn't only beat me, but he did so by a factor of two, by taking advantage of the fact that the 486 target cpu had an 8 KB L1 cache.

My algorithm did less work, and was faster when a lot of cells were alive, but after the first 200 generations or so, David's working set started to fit in L1 cache, and he blew me away from that point on. :-)

Anyway, FS works on the current pixel values, instead of being able to generate a new output bitmap in parallel for all cells/pixels.

On a real many-core cpu it might still make sense to run the algorithm in parallel:

Start core 1 on pixel (1,1). As soon as it finishes pixel (2,1) there will be no further updates to pixel (1,2), so the second core can start on this pixel at this time, right?

When the second core has finished the first two pixels, the third core can start on the third line etc.

In the limit, such an approach can reduce the problem from O(height*width) to O(height+width) as long as you have the resources needed for (height) scan lines, in the form of separate cores*number_of_scanlines_that_fit_in_registers.

Terje
--
- <Terje.Mathisen@xxxxxxxxxxxxx>
"almost all programming can be viewed as an exercise in caching"

.



Relevant Pages

  • Re: Human eye is only about 0.1 MP digicamera - shakened some
    ... So for evolution making an eye is not maybe so huge "impossibility" as ... "The human retina contains about 120 million rod cells ... Then, again, the pixel count is not really a good measure of vision. ... the real value of human vision is our ability to interpret the ...
    (talk.origins)
  • Re: Human eye is only about 0.1 MP digicamera - shakened some
    ... So for evolution making an eye is not maybe so huge "impossibility" as ... ganglion cells, the cells that send their axons to the brain, is more ... independent so that each represents a separate pixel. ... the real value of human vision is our ability to interpret the ...
    (talk.origins)
  • Re: Table Pixel Widths/Heights - better in FP 2003 ?
    ... > Does FP2003 handle the pixel adjustments on table cell widths better ... > things with sizing and a table of 590 pixels often ends up with cells ... > and cells at random times for no apparent reason? ...
    (microsoft.public.frontpage.client)
  • Re: Overlaping of rectangles
    ... pixel up and left than the bottom right point. ... overlapping of lines common to neighbouring cells. ... only one line should be drawn for sides common to adjacent rectangle. ...
    (microsoft.public.vc.mfc)
  • Table Pixel Widths/Heights - better in FP 2003 ?
    ... Does FP2003 handle the pixel adjustments on table cell widths better ... things with sizing and a table of 590 pixels often ends up with cells ... and cells at random times for no apparent reason? ...
    (microsoft.public.frontpage.client)