Re: Any progress yet? (was Re: Fast pi program?)
- From: mike3 <mike4ty4@xxxxxxxxx>
- Date: Thu, 30 Aug 2007 23:18:49 -0700
On Aug 30, 11:49 pm, "Alf P. Steinbach" <al...@xxxxxxxx> wrote:
* mike3:
On Aug 30, 7:11 pm, "Alf P. Steinbach" <al...@xxxxxxxx> wrote:
...
...http://www.mediafire.com/download.php?5ryt5g4mcyo
I clicked on the URL above, and it produced
"The quickkey you provided for file download was invalid. This is
usually caused because the file is no longer stored on Mediafire. This
occurs when the file is removed by the originating user or Mediafire."
Darn, I posted the wrong link!!!
This is what it should be:
http://www.mediafire.com/download.php?9mzltzjyizn
Try that one.
Anyway, the way to optimize programs is (1) measure, (2) if optimization
is then deemed necessary, first think about using better algorithms
(this includes e.g. changing in-place instead of copying and creating
values), and only secondly if at all, micro-optimizations such as trying
to make loops fit in processor cache, thinking about page loading (you
don't want the program to hop all over memory, but more localized and
preferentially address-sequential access), and so on.
Well, I measured it using GNU's profiler for a 1M digit run, and found
out
much of the time was going into the long multiplication area of the
program, where all the Fast Fourier transform/Number theoretic
transform
stuff is. I've tried going over that several times and haven't had
much
luck yet. Is there any way to cut down perhaps on the number or
size of multiplications required? Or perhaps to optimize the
multiplication
routines themselves, including the FFT/NTT code? What would you
suggest, based on what you can see from the source code?
Well, you're into an area where I'm effectively incompetent.
Hmm, so then maybe someone else here could offer some advice,
then?
I'd had to really delve into this to give specific advice (and Things Take Time).
For now, looking only at the [pib26.c] main program source, it seems the
program is actually /calculating/ in base 26^4. Perhaps that's a choice
found after measuring the cost of calculating in a binary base and then
converting to base 26, or perhaps it's not? If number system conversion
requires quadratic time (does it?) perhaps this is best, but if not,
perhaps a change to internal binary system computation, with final
conversion to base 26, would really speed things up? Just a wild guess.
It does calculate in base 26^4. You are right. To compute in base 10
you need to compute 2x the digits then convert to 26 (since the
algorithms work best with powers of 2 numeric sizes), which
effectively cancels much of the speed gain. Plus that final
radix conversion is _not_ cheap, you know.
Cheers, & hth.,
- Alf
--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
.
- Follow-Ups:
- Re: Any progress yet? (was Re: Fast pi program?)
- From: CBFalconer
- Re: Any progress yet? (was Re: Fast pi program?)
- From: Alf P. Steinbach
- Re: Any progress yet? (was Re: Fast pi program?)
- References:
- Fast pi program?
- From: mike3
- Any progress yet? (was Re: Fast pi program?)
- From: mike3
- Re: Any progress yet? (was Re: Fast pi program?)
- From: Alf P. Steinbach
- Re: Any progress yet? (was Re: Fast pi program?)
- From: mike3
- Re: Any progress yet? (was Re: Fast pi program?)
- From: Alf P. Steinbach
- Fast pi program?
- Prev by Date: Re: Any progress yet? (was Re: Fast pi program?)
- Next by Date: Re: Any progress yet? (was Re: Fast pi program?)
- Previous by thread: Re: Any progress yet? (was Re: Fast pi program?)
- Next by thread: Re: Any progress yet? (was Re: Fast pi program?)
- Index(es):