Re: greatest multiple algorithm



"Rod Pemberton" <spamtrap@xxxxxxxxxx> writes:
"Phil Carmody" <thefatphil_demunged@xxxxxxxxxxx> wrote in message
I always advise a sieve.

Why?

Because I'm interested in speed. And I'm always talking in
terms of 10^12-10^16. Anything that's just 10^9 or 10^10
is just a toy and not of interest to me.

As I've pointed out, each has advantages and limitations. For
integers upto the largest cpu supported integer type, modulo is faster on
modern machines.

In practice, it seems that the pure bit sieve (although using
Bernstein or Galway's quadratic form (Atkin) sieve rather than
Eratosthenes) should remain optimal until well beyond 10^20,
and after that trial division makes no sense at all.

This could be used to reduce the time needed to setup the
lower or smaller primes in bit sieve. The bit sieve becomes progressively
slower with larger sieve sizes due to the large number of memory accesses.
That can be reduced by a significant factor using the technique
documented on TOS's web page. Basically you throw the primes into
a simplified priority queue. Only look at the primes that are
relevant to each segment you're sieving at the time. (James -
did yours do this too - I've not looked at your code for a long
time, and barely speak ForTran.)

However, the bit sieve has the advantage that it isn't limited to finding
primes that fit within the cpu supported integer type.

bit sieve
using single bits to represent odd numbers
finds primes upto memory_in_bytes*8*2
requires no modulo computation
requires many memory accesses
slows down based on size memory used
limited to available memory

You can use a wheel to decrease the memory requirements. Most
fast sieves use the 8 residues mod 30 as their wheel. That
almost halves the memory requirements. Even a wheel of 6 saves
you a third.

One question is how much does the extra logic needed to skip bits slow down
the algorithm versus the gain in computable range. For example, I found
that skipping multiples of 5 for the trial division algortihm below ("wheel
of 30" ?)

Yup.

slowed the algorithm down alot.

It's possible. It takes a lot of care to use a mod 30 wheel
effectively. TOS and DJB have done it, I almost never care
beyond the trivial mod 6 wheel.

The comparison and conditional
were much slower than the modulo.

Often efficient wheels are implemented by unrolling loops,
and creating 8 different versions of the code for the 8
different stepping patterns (James' and TOS's optimised
code does this, I'm sure (and it doesn't apply to an Atkin
sieve such as DJB's)).

[...]
Forgot to mention another one:

test potential primes with GCD composed of all lower primes
modulo with "primed" GCD
very fast
"primed" GCD exceeds cpu integer size rapidly (i.e., prime 47...)

Hmm, maybe I should try floats on that one just to see how far I can get it
to run without error...


That technique goes all the way back to Euclid. It's very useful
in many situations. If you look in the GMP source for their
probable primality test you'll see them do a couple of GCDs with
products of primes before anything else.

Phil
--
"Home taping is killing big business profits. We left this side blank
so you can help." -- Dead Kennedys, written upon the B-side of tapes of
/In God We Trust, Inc./.

.



Relevant Pages

  • Re: greatest multiple algorithm
    ... Bernstein or Galway's quadratic form sieve rather than ... lower or smaller primes in bit sieve. ... slows down based on size memory used ... You can use a wheel to decrease the memory requirements. ...
    (comp.lang.asm.x86)
  • Re: greatest multiple algorithm
    ... array stores found primes ... modulo with found primes in array to find new primes ... consumes large amounts of memory ... That doesn't sound like a sieve, ...
    (comp.lang.asm.x86)
  • Re: [Clax86list] Can WXP handle 4 GB of Physical Memory
    ... determine if huge chunks of memory are able to speed up a sieve ... using L1 cache. ... Since there are about TEN times as many primes between ... sieve checks for primes up to 10**18. ...
    (comp.lang.asm.x86)
  • Re: greatest multiple algorithm
    ... modulo with found primes in array to find new primes ... That doesn't sound like a sieve, ... integers upto the largest cpu supported integer type, modulo is faster on ... slower with larger sieve sizes due to the large number of memory accesses. ...
    (comp.lang.asm.x86)
  • Re: Programs ro evaluate speed and size of L1 / L2 Cache
    ... Sieve the 8 different totients independently. ... a comparison for the main Atkin sieve (which has wheel 60). ... you could run a few hundred small primes over the 1 ... I tried the Vista speech recognition by running the tutorial. ...
    (comp.lang.asm.x86)

Loading