# Re: I need some guideance regarding parallel processing

*From*: Gordon Sande <g.sande@xxxxxxxxxxxxxxxx>*Date*: Sat, 03 Feb 2007 18:12:29 GMT

On 2007-02-03 13:39:33 -0400, glen herrmannsfeldt <gah@xxxxxxxxxxxxxxxx> said:

Gordon Sande wrote:

(snip)

You provide no hint of why you need to compare ALL pairs.(snip)

If you are doing metric comparisons then a typical algorithm of this

class will have an n log ( n ) setup cost with a log ( n ) cost for

each query. That makes finding the nearest neighbours of all elements

only n log ( n ). Even lousy code will run circles around a highly

polished n^2 code at your sizes.

In at least one post he mentions Euclidean distance. I am pretty

darn sure that there is an O(n log n) algorithm for that, even in

six dimensions. I am not sure if it gets faster or slower in more

dimensions.

The original paper on k-d trees by "Freidman, J. H., Bentley, J. L., and

Finkel, R. A. 1977. An Algorithm for Finding Best Matches in Logarithmic

Expected Time. ACM Trans. Math. Softw. 3, 3 (Sep. 1977), 209-226" was

for Euclidean as well as all the Minkowski metrics. There is lots of

literature since then. I have used k-d trees on ranks under the max

norm (L_infinity) n more than six dimensions so I know it works. The

coefficients grow a bit.

Any effort on optimizing code done before you work on choosing better

algorithms is a total complete waste of time unless you are a salesman

selling machine time. It is hard to be more emphatic on the issue of

getting good algorithms before you try to optimize things.

My quick calculation is that n/log2(n) is about 1,000,000, though it

might be a factor of two or three slower than that, but from centuries

down to minutes or hours is pretty good.

Algorithm change is often much better than just a few orders of

magnitude. It is a story that deserves to be repeated often and loudly.

At 1 microsecond each, n log2 n for n=3.6e7 is 15 minutes. Maybe

another factor of six since he wants the six closest point for each.

-- glen

.

**Follow-Ups**:**Re: I need some guideance regarding parallel processing***From:*Michael Metcalf

**References**:**I need some guideance regarding parallel processing***From:*tripp

**Re: I need some guideance regarding parallel processing***From:*Gordon Sande

**Re: I need some guideance regarding parallel processing***From:*glen herrmannsfeldt

- Prev by Date:
**Re: text-2-binary I/O switch issue** - Next by Date:
**Re: I need some guideance regarding parallel processing** - Previous by thread:
**Re: I need some guideance regarding parallel processing** - Next by thread:
**Re: I need some guideance regarding parallel processing** - Index(es):