Re: Newbie to Forums, SIMD question

From: arjan de lumens (arjand666_at_hotmail.com)
Date: 04/09/04


Date: Fri, 9 Apr 2004 18:19:51 +0000 (UTC)

Peter Dove wrote:
> One of the things which is taking the time is that I must do a Power
> for every pixel in order to apply a gamma value. This is of course a
> problem because it's hard to efficiently mix code with SSE. So my
> question is, is there any way we can combine the code in some way... I
> have pasted an example of the code which is embedded into the main
> loop. Because the array of data is so large it makes no sense to make
> 2 passes, one for the power ( R by g by B ) and one for the rest of
> the options. Any ideas?.... hope this makes sense.
>
> Peter
>
> // First I paste the Delphi Code and then the conversion to ASM
>
> { If red > dblBlackPoint then
> begin
> red := Exp(preGamma * Ln(red)); //Power ( pout, preGamma );
> red := (red - c) * PreLevel;
> end
> else
> red := 0;}
>
>
>
> //Now the asm for the above
> fld dword ptr[val.red];
> fcom dword ptr[dblBlackPoint];
> fstsw ax;
> sahf;
> jbe @SPIsZero;
> fldln2;
> fxch st(1);
> fyl2x;
> fmul dword ptr[preGamma];
> FLDL2E
> FMUL
> FLD ST(0)
> FRNDINT
> FSUB ST(1), ST
> FXCH ST(1)
> F2XM1
> FLD1
> FADD
> FSCALE
> FSTP ST(1)
> fsub dword ptr[c];
> fmul dword ptr[preLevel]
> fstp dword ptr[val.red];
> jmp @RedEnd;
> @SPIsZero:
> xor eax, eax;
> mov [val.red], eax;
> ffree st(0);
> @RedEnd:
>
>
> Hope you can help
>
> Peter
>
>

Color calculations? Then I will make the following assumptions:
  - 'red' is in the range 0.0 to 1.0, never outside
  - 'preGamma' has the same value for a very large number of pixels
  - you can live with 8-12 bits of precision.
If these assumptions are correct, you can use a lookup table, which
should be much faster than pow, exp/log or series expansions.
To generate the lookup table (C code for a fixed 1001-entry table;
I am not familiar with Delphi):

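#include <math.h>

float GammaLUT[1000+1];

/* Fill GammaLUT[i] with (i/1000)^preGamma; call again whenever preGamma changes. */
void initGammaLUT( float preGamma )
        {
        int i;
        for(i=0;i<=1000;i++)
                GammaLUT[i] = (float)pow(i/1000.0, preGamma);
        }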

To actually use the table, use the following assembly code:

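        fld dword ptr [val.red]
        fmul dword ptr [const_1000] ;; const_1000 = 1000.0 (single-precision float)
        fistp dword ptr [temp_val] ;; temporary 32-bit integer; uses the current FPU rounding mode (round-to-nearest by default)
        mov eax,[temp_val] ;; eax = table index, 0..1000 if 'red' is in 0.0..1.0
        mov eax,[GammaLUT + 4*eax] ;; fetch the 4-byte table entry as raw bits
        mov [val.red], eax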

This should leave you with a value of ~8-10 bits of precision.
If you need more precision, you can make the lookup table larger or
apply interpolation between the entries of the lookup table
(beware: if the table is too large, you will get cache misses when
accessing it, eliminating the performance advantage); if you need
more performance, interleave instructions for several table lookups.
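As long as SSE lacks approximate exp and log instructions, this
is probably the fastest available method. However, if you can accept
an approximation to the gamma value itself, you can build the power
from multiplies and reciprocal square roots (rsqrtps is itself only
accurate to roughly 11-12 bits), using sequences such as: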

(preGamma = 2.0)
        movaps xmm0, [your_color_value]
        mulps xmm0, xmm0
        movaps [your_color_value], xmm0

(preGamma = 2.25)
        movaps xmm0, [your_color_value]
        rsqrtps xmm1, xmm0 ;; xmm1 <= (color ^ -0.5)
        mulps xmm0, xmm0 ;; xmm0 <= (color ^ 2)
        rsqrtps xmm1, xmm1 ;; xmm1 <= (color ^ 0.25)
        mulps xmm0, xmm1 ;; xmm0 <= (color ^ 2.25)
        movaps [your_color_value], xmm0
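
If you would rather let a C compiler schedule this, the preGamma = 2.25
sequence can be written with SSE intrinsics, roughly as in the sketch
below. The function name gamma_2_25_sse is made up, the sketch assumes
an x86 target and a pixel count that is a multiple of 4, and it uses
unaligned loads and stores (movups) instead of movaps so that the
buffer does not have to be 16-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <xmmintrin.h>   /* SSE intrinsics, x86 only */

/* Raise each float in px[0..n-1] to the power 2.25, in place,
   four values per iteration. Hypothetical helper; n must be a
   multiple of 4 and the values are expected to lie in 0.0..1.0. */
void gamma_2_25_sse(float *px, size_t n)
{
    assert(n % 4 == 0);
    for (size_t i = 0; i < n; i += 4) {
        __m128 c = _mm_loadu_ps(px + i);
        __m128 r = _mm_rsqrt_ps(c);   /* r ~= c ^ -0.5  (approximate) */
        r = _mm_rsqrt_ps(r);          /* r ~= c ^ 0.25  (approximate) */
        c = _mm_mul_ps(c, c);         /* c  = c ^ 2 */
        _mm_storeu_ps(px + i, _mm_mul_ps(c, r));   /* ~= c ^ 2.25 */
    }
}
```

Because both rsqrtps steps are approximations, expect the result to be
good to about three decimal digits, which fits the 8-12 bit precision
assumed earlier.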


