Re: MMX speedup for Floyd Steinberg error diffusion



rep_movsd wrote:
On May 8, 7:00 pm, Terje Mathisen <spamt...@xxxxxxxxxx> wrote:
How does it handle the corner/edge cases?

The edges are handled by ignoring all pixels which do not have the
neighbours, the simplest way I found was to keep my x and y loops
ranging over [1, width-1) and [0, height-1).

OK. Quick & dirty indeed.

It seems to me like the getNearestPalColor() function could easily
dominate the processing time here!

Well Initially the code used the standard distance matching algorithm
as seen in zillions of color reduction libraries :

For each color in the palette
Compute the RGB colorspace distance between the palette entry and the
target color -> (r - rp)^2 + (g - gp)^2 + (b - bp)^2
Choose the index with the minimum distance.

This was pathetically slow... It has to loop 256 times for each pixel
doing several arithmetic ops!
One optimization is to use abs and get rid of the multiplications.

But that messes up the distance calculation because it gives the wrong answer for many edge cases. OTOH, the error diffusion would tend to distribute this error across the neighboring pixels, making the visual impact much less.


Then I thought of a simplistic algorithm based on a precomputed
table...

I start with a 3d array 256 * 256 * 256, initialized to -1

Ouch! A 16 MB table is NOT something I like to throw around, and the initialization time for the table is very likely to drown the processing time unless you have a _lot_ of images with the same palette.

(You did write something about video to multi-page GIF, in which case you _would_ have a lot of pages. :-)

Later I changed the box to 32x32x32 and maximal box size to 8 and
still get good results.

That's a 32K table, something that does fit nicely in L2 at least, and significant parts could go into L1 as well.

To do a mathematically perfect job I would start with a 5:5:5 (32 K) or 4:4:4 (4 K) table, with each entry being the start of a list of all the possible palette entries for that entry:

By making each entry 16 bits wide, I could reserve the bottom 256 entries for colors that give a single possible result, and the remaining values would be the start of pascal-style lists of possible entries, stored as offsets into the same byte array.

I.e:

uint16_t palLookup[];
byte palLists[];

byte findPalColor(byte r, byte g, byte b)
{
unsigned index = (r >> 4) | (g & 0xf0) | ((b & 0xf0) << 4);
unsigned entry = palLookup[index];
if (entry < 256) return entry;

/* More than one candidate, check them all! */
byte *list = palLists + entry - 256;
int mindistance = 0x7fffffff;
unsigned count = *list++;
unsigned bestEntry;
do {
unsigned palCandidate = *list++;
int r_err = palette[palCandiate].r - r;
int g_err = palette[palCandiate].g - g;
int b_err = palette[palCandiate].b - b;

int distance = r_err*r_err + g_err*g_err + b_err*b_err;
if (distance < minDistance) {
minDistance = distance;
bestEntry = palCandidate;
}
} while (--count);
return bestEntry;
}

The reasoning is that Floyd-Steinberg dithering perturbs the colors
usually by a small amount, so it would be unlikely to look for a color
thats way far away from any given palette entry.
It can happen that the lookup table doesnt have an entry at some given
point, in such cases I fall back on the old linear algorithm.

If you fill in the holes with the values you determine this way, then your cache hit rates will tend to increase, right?

Terje
--
- <Terje.Mathisen@xxxxxxxxxxxxx>
"almost all programming can be viewed as an exercise in caching"

.



Relevant Pages

  • Re: Problems with last entry in Color Pallette of Bitmap
    ... I have problems with the last palette entry in a BMP i am creating. ... Want i do is break a Geotiff file up in lots of smaller BMP's, ... I used GetPaletteEntries with the changed Bitmap to compare the values ...
    (comp.lang.pascal.delphi.misc)
  • Re: nearest colour and thanks for the palette help
    ... But if you have only 256 colors in palette then you ... solution is to construct the Voronoi diagram for the palette ... A Voronoi region for a palette entry is the set of all points ... the exhaustive search was gawd-awful slow. ...
    (comp.graphics.algorithms)
  • Re: Man arrested and locked up for five hours after taking photo of police van ignoring no entry sig
    ... There was a definite limit on the distance involved and perhaps there ... including the no entry signs. ... I have since read that the Police were attending the chippy to view CCTV ... were using the car for Police purposes and, under the right circumstances, ...
    (uk.transport)
  • Problems with last entry in Color Pallette of Bitmap
    ... I have problems with the last palette entry in a BMP i am creating. ... Want i do is break a Geotiff file up in lots of smaller BMP's, ... I used GetPaletteEntries with the changed Bitmap to compare the values ...
    (comp.lang.pascal.delphi.misc)
  • 256 color palette problems!
    ... I have problems with the last palette entry in a BMP i am creating. ... Want i do is break a Geotiff file up in lots of smaller BMP's, ... I used GetPaletteEntries with the changed Bitmap to compare the values ...
    (comp.lang.pascal.borland)