Re: The power of parallel computation



D. J. Bernstein wrote:
> Imagine looking at an algorithm literature full of papers competing for
> the best computation times on a 1-tape Turing machine. ``I can sort in
> time n^2 log^3 n!'' ``I can do even better: time n^2 log^2 n!''
>
> What would you think? You'd laugh. Turing-machine price-performance
> ratios have very little to do with real-world price-performance ratios;
> they don't even get the _exponent_ right. Turing-machine optimization
> might be an amusing mathematical exercise, but it's no substitute for
> serious algorithm analysis.

It's also no substitute for analysis of algorithms as fielded on modern
computer architectures, i.e. the impact of: caches, DRAM latency and
throughput, bus latency and throughput, branch predictors, pipelines and
bubbles, data/control dependencies in source code, not to mention
interprocess communication costs, load balancing, synchronization, &c.

Overlooking the costs of implementation is often as foolish as
overlooking the inadequacies of a formal model. The performance of the
best implementation also can differ from the best abstract model by an
order of magnitude.

>
> Well, that's what future algorithm designers are going to think of
> today's RAM-dominated algorithm literature. RAM price-performance ratios
> have very little to do with real-world price-performance ratios; they
> don't even get the _exponent_ right. RAM optimization might be an
> amusing mathematical exercise, but it's no substitute for serious
> algorithm analysis.

OK.

>
> Do you think that RAM _does_ get the exponent right? In particular, do
> you think that parallel computation can't improve the price-performance
> ratio? http://cr.yp.to/talks.html#2005.06.11-1 tackles your confusion
> head-on, and shreds the notion that RAM is a good architecture.

I'd like to see those slides. Unfortunately that URL document's pages
have been split into illegible shards somewhere along the way. Someone
needs to review their web site's contents before they publish. Or
perhaps the reader is assumed to have compartmented insect eyes...

I'd agree that PRAMs are useless, if that's what you're saying. But I'm
doubtful that numerous formulas are going to dispel *my* confusion.

Randy

>
> ---D. J. Bernstein, Associate Professor, Department of Mathematics,
> Statistics, and Computer Science, University of Illinois at Chicago

--
Randy Crawford http://www.ruf.rice.edu/~rand rand AT rice DOT edu
.



Relevant Pages

  • Re: SortMerge avoids thrashing (was: Baileys "two pass" FFT algorithm question)
    ... For the algorithm used, one makes n/2 merge runs, where n is the ... But if you're using RAM, there's no need to bother with that, you ... cache would thrash. ... to fit within the fast cache at all times, ...
    (comp.programming)
  • Re: out of memory
    ... read only the smaller file into a hash. ... the smaller file will fit into RAM. ... Depending upon the sorting algorithm this would be Ologor ... put your relevant data into a database and use ...
    (comp.lang.perl.misc)
  • Re: out of memory
    ... the smaller file will fit into RAM. ... Depending upon the sorting algorithm this would be Ologor ... put your relevant data into a database and use ...
    (comp.lang.perl.misc)
  • Re: The power of parallel computation
    ... > the best computation times on a 1-tape Turing machine. ... > serious algorithm analysis. ... RAM price-performance ratios ...
    (comp.theory)
  • Re: External sorting
    ... Something which cannot fit in RAM. ... you need to have four physical disk drives (or one ... Then the algorithm you want is building up whatever size merge-runs ...
    (comp.programming)