Re: optimizing simple-arrays



Dave Watson <djwatson@xxxxxxxxxxxxx> writes:

> On 2005-04-06, Christophe Rhodes <csr21@xxxxxxxxx> wrote:
>> Meanwhile, are you sure that packing your floats into your array is a
>> relevant bottleneck in your program? It's something that smacks of an
>> initialization, which is a one-time cost that largely evaporates on
>> real computations... also, I'd expect your get-floatN routines to
>> take longer than a couple of memory indirections.
>
> Actually the full routine is an rgb to hsv pixel converter:
>
> (declaim (inline rgb-pto-hsv)
>          (ftype (function (&key ) (simple-array (float) (3))) rgb-pto-hsv))
> (defun rgb-pto-hsv (pixel &key (maxvalue 255))
>   "transforms a pixel in rgb to hsv (vector triple"
>   (declare #.*optimize*
>            (type (simple-array (unsigned-byte 8) (3)) pixel)
>            (type (unsigned-byte 8) maxvalue)
>            (sb-ext:muffle-conditions sb-ext:compiler-note))
>   (symbol-macrolet ((R (aref pixel 0)) (G (aref pixel 1)) (B (aref pixel 2)))
>     (let ((ret (make-array 3 :element-type 'single-float)))
>       (setf (aref ret 0) (* maxvalue (the single-float
>         (cos (the (single-float -1.0 1.0)
>           (safe/
>             (* .5 (+ (- R G) (- R B)))
>             (sqrt (+ (expt (- R G) 2) (* (- R B) (- G B))))))))))
>       (setf (aref ret 1) (* maxvalue (- 1 (the single-float (float
>         (safe/
>           (* 3 (min R G B))
>           (+ R G B)))))))
>       (setf (aref ret 2) (* .333333 (+ R G B)))
>       ret)))
>
> (just for context)

What is SAFE/? If the compiler doesn't know what type this function
returns, it can't tell whether FLOAT of the result can be compiled
efficiently.
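
To illustrate the point about SAFE/: the definition was not posted in
this message, so the following is only a guess at its shape. The
zero-denominator behaviour and the argument types are assumptions. What
matters is the FTYPE proclamation, which gives the compiler a known
return type to work with at each call site.

```lisp
;; Hypothetical stand-in for SAFE/; the real definition is not shown here.
;; Proclaiming the return type lets callers see a SINGLE-FLOAT result.
(declaim (inline safe/)
         (ftype (function (real real) single-float) safe/))
(defun safe/ (num den)
  "Divide NUM by DEN as single-floats; return 0.0 when DEN is zero (assumed)."
  (if (zerop den)
      0.0
      (/ (float num 1.0) (float den 1.0))))
```

With a declaration like this in effect, wrapping the call in FLOAT
should be redundant, since the result is already a single-float.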

Also, why SYMBOL-MACROLET, rather than a simple LET? The system will
have to perform the array lookup at each reference, which is unlikely
to be a win.
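
A sketch of the LET form, for comparison. SYMBOL-MACROLET substitutes
(aref pixel 0) textually at every use of R, whereas LET reads each
element once into a local variable. PIXEL-INTENSITY is a made-up name
covering only the third component of the posted function.

```lisp
;; Hypothetical helper: only the intensity term, to show the LET binding.
(defun pixel-intensity (pixel)
  (declare (type (simple-array (unsigned-byte 8) (3)) pixel))
  ;; Each AREF is evaluated once; R, G and B are plain locals afterwards.
  (let ((r (aref pixel 0))
        (g (aref pixel 1))
        (b (aref pixel 2)))
    (declare (type (unsigned-byte 8) r g b))
    (* .333333 (+ r g b))))
```

Since the pixel is never modified inside the function, the two forms
should compute the same values.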

You might do better, also, to have maxvalue (and 1!) as single floats
explicitly.
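
A sketch of what that could look like for the saturation term.
SCALE-SATURATION and FMAX are made-up names, and RATIO stands in for
the result of the SAFE/ call. The idea is to convert MAXVALUE to a
single-float once and to write 1.0 instead of 1, so the compiler sees
single-float operands throughout.

```lisp
;; Hypothetical helper: RATIO plays the role of the SAFE/ result.
(defun scale-saturation (ratio &key (maxvalue 255))
  (declare (type (single-float 0.0 1.0) ratio)
           (type (unsigned-byte 8) maxvalue))
  ;; Convert the integer once, up front, instead of at each use.
  (let ((fmax (float maxvalue 1.0)))
    (declare (type single-float fmax))
    ;; 1.0 rather than 1: both operands of - are single-floats.
    (* fmax (- 1.0 ratio))))
```

Whether this makes a measurable difference depends on the compiler and
the optimization settings, so it is worth timing both versions.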

(While I'm at it, your FTYPE declaration is wrong, in both the lambda
list and the return value declarations.)
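
For reference, one way the proclamation might be written to match the
DEFUN as posted. This is a sketch: it assumes the function takes the
pixel as a required argument plus the :MAXVALUE keyword, and returns the
single-float vector that MAKE-ARRAY creates in the body.

```lisp
;; The lambda list names the required PIXEL argument and the :MAXVALUE
;; keyword; the return type uses SINGLE-FLOAT as the array element type.
(declaim (ftype (function ((simple-array (unsigned-byte 8) (3))
                           &key (:maxvalue (unsigned-byte 8)))
                          (simple-array single-float (3)))
                rgb-pto-hsv))
```

An element type of FLOAT generally upgrades to T, so
(simple-array (float) (3)) does not describe an array created with
:element-type 'single-float.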

> I've profiled to find that this (and the caller) are the bottlenecks,
> but I didn't know if there was a way to find exactly which part of the
> function was the bottleneck, so I just optimized the whole thing
> as best I could because it was small.

I'm not really an expert in timings, but I would have expected
floating point operations to dwarf memory accesses on functions of the
above form. Still, let's see the support routines for this piece of
code, and maybe something else will leap out.

Christophe