Re: can multi-core improve single funciton?



On 10 Feb, 09:45, Steven D'Aprano
<ste...@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
On Tue, 10 Feb 2009 18:18:23 +1000, Gerhard Weis wrote:
Once I have seen Haskell code, that ran fibonacci on a 4 core system.

The algorithm itself basically used an extra granularity paramet until
which new threads will be sparked. e.g. fib(40, 36) means, calculate
fib(40) and spark threads until n=36. 1. divide: fib(n-1), fib(n-2)
2. divide: fib(n-2), fib(n-3)
3. divide: fib(n-3), fib(n-4)

We tried this code on a 4 core machine using only 1 core and all 4
cores. 1 core wallclock: ~10s
4 core wallclock: ~3s

Three seconds to calculate fib(40) is terrible. Well, it's a great saving
over ten seconds, but it's still horribly, horribly slow.

Check this out, on a two-core 2.2 GHz system:

import timeit
timeit.Timer('fib(40)', 'from __main__ import fib').timeit(1)

7.2956085205078125e-05

That's five orders of magnitude faster. How did I do it? I used a better
algorithm.

def fib(n, a=1, b=1):
    if n==0:
        return a
    elif n==1:
        return b
    return fib(n-1, b, a+b)

And for the record:

fib(500)

225591516161936330872512695036072072046011324913758190588638866418474627738686883405015987052796968498626L

According to the common definition of fibonacci numbers fib(0) = 0, fib
(1) = 1 and fib(n) = fib(n-1) + fib(n-2) for n > 1. So the number
above is fib(501).

timeit.Timer('fib(500)', 'from __main__ import fib').timeit(1)

0.00083398818969726562

And now for my version (which admitedly isn't really mine, and returns
slightly incorrect fib(n) for large values of n, due to the limited
floating point precision).

import math
sqrt5 = math.sqrt(5)
golden_section = (sqrt5 + 1) / 2
def fib(n):
return int(golden_section ** n / sqrt5 + 0.5)

fib(501)
225591516161940121919323945317755919750165306733355143970858461525064153963081278412888159063487104942080L

timeit.Timer('fib(501)', 'from __main__ import fib').timeit(1)
1.9555558083084179e-05


Less than a millisecond, versus millions of years for the OP's algorithm.
I know which I would choose. Faster hardware can only go so far in hiding
the effect of poor algorithms.


I agree 100%

/Niklas Norrthon
.



Relevant Pages

  • Re: no lcm in standard library?
    ... which are comparable to the AMD K6 architecture, branch misprediction ... mispredict penalties tilt the algorithm choice in favor of Euclid's ... This implements the binary GCD algorithm which removes the branch ... that it approximates the divide as a 1-bit accurate divide, ...
    (comp.lang.c)
  • Re: rules of fixed-point arithmetic
    ... divide to keep everything fast. ... or you could even use one of the old bit shift divide ... used a good old bit shift algorithm. ... rather than table lookup, perhaps with simple correction factors / ...
    (comp.arch)
  • Re: returning none when it should be returning a list?
    ... It divide next by the list of previous prime numbers if next ... is not bigger than lim, count up all the times where it's not evenly ... Your basic algorithm as described above sounds workable, ...
    (comp.lang.python)
  • Re: returning none when it should be returning a list?
    ... It divide next by the list of previous ... Your basic algorithm as described above sounds workable, ... the Sieve of Eratosthenesis a relatively simple ... If you want to look further into the subject, see the Primality Test ...
    (comp.lang.python)