Re: MMX speedup for Floyd Steinberg error diffusion



Hi

In the meantime, the diffuse function turned into the fastest of the 5
functions that are called for each frame.

Now I went on to optimize my octree quantizer.
It consists of adding each pixel color to an octree and pruning it
everytime the number of leaves get > than the number of desired
colors.

Since each pixel involves walking down the tree until its leaf is
reached, i reasoned that it may make sense to process multiple pixels
of the same color at a time (i. e. sort the image beforehand, so the
pixels are clustered ).

However even with a radix sort with highly unrolled loops, the
overhead isnt worth the benefit.
I tried using a hash map, gathering pixel color counts for each
scanline, but still too slow.

Amazingly, the following :

ULONG clr = *pDIB, cnt = 0;
for(int y = 0; y < uHeight; ++y)
{
for(int x = 0; x < uWidth; ++x)
{
if(*pDIB == clr)
{
++cnt;
}
else
{
addColor(clr, cnt);
cnt = 1;
clr = *pDIB;
}
++pDIB;
}

while(m_uLeafCount > m_uMaxColors)
reduceTree();
}

is faster than the following

ULONG clr = *pDIB, cnt = 0;
for(int y = 0; y < uHeight; ++y)
{
for(int x = 0; x < uWidth; ++x)
{
addColor(clr, *pDIB);
++pDIB;
}
while(m_uLeafCount > m_uMaxColors)
reduceTree();
}

even without the input array being sorted!!!
I can only surmise that nearby pixels tend to be equal many a time....

Is there any algorithm that can bring similar values in an array
together, but isnt as intensive as a complete sorting?
Any blazingly fast histogram algorithm can help here.

Vivek

.



Relevant Pages

  • Re: Colors -- What Am I Missing
    ... With a pure white background the line shows up LtBlue. ... > a) You could read the current pixel color at the point, ... > specific point from the array, store the current pixel color there to the ...
    (microsoft.public.vb.winapi)
  • Re: Ray tracing transformations
    ... After all the subpixels for a pixel have been determined some ... filter is applied and the final pixel color is determined. ... I'm not sure what you mean by the merging of the framebuffers. ... you can generate primary rays directly ...
    (comp.graphics.rendering.raytracing)
  • Re: High speed differential to single ended
    ... roughly 150 million pixel per second. ... Inside the FPGA, the pixel color is represented as 8 bits in parallel, ... I/O, therefore, as I claimed, only dedicated inputs can handle such a ... high bitrate of 1.5 Gbps. ...
    (comp.arch.fpga)
  • Re: High speed differential to single ended
    ... roughly 150 million pixel per second. ... Inside the FPGA, the pixel color is represented as 8 bits in parallel, ... I/O, therefore, as I claimed, only dedicated inputs can handle such a ... high bitrate of 1.5 Gbps. ...
    (comp.arch.fpga)