Re: FFT Multiplication on Disk?
- From: cri@xxxxxxxx (Richard Harter)
- Date: Fri, 21 Sep 2007 04:45:51 GMT
On Thu, 20 Sep 2007 11:17:09 -0700, mike3 <mike4ty4@xxxxxxxxx>
wrote:
On Sep 20, 9:17 am, c...@xxxxxxxx (Richard Harter) wrote:
On Wed, 19 Sep 2007 23:37:04 -0700, mike3 <mike4...@xxxxxxxxx>
wrote:
On Sep 19, 5:55 pm, Ben Pfaff <b...@xxxxxxxxxxxxxxx> wrote:
mike3 <mike4...@xxxxxxxxx> writes:
However the problem is that this involves three in-place matrix
transpositions of a square matrix. Does anyone know of an
efficient algorithm to do those?
Here's one paper on the topic:http://ieeexplore.ieee.org/Xplore/login.jsp?url=/iel5/8931/28260/0126...
--
"J'avais trouv'e ma religion :
rien ne me parut plus important qu'un livre.
La biblioth`eque, j'y voyais un temple."
--Jean-Paul Sartre
That looks like it's going to cost me a good sum of MONEY
to get that paper. I don't have lots of that nice green stuff on
hand given my current financial situation, unfortunately.
A simple alternative that you can roll on your own is to
recursively partition the transposition. Here is the idea:
Suppose we have a matrix that looks like this:
A B
M = C D
where M is 2Nx2N and A, B, C, and D are all NxN. Transpose each
component separately and then swap A and B. If you do things
soothly the number of cache misses drops sharply.
Richard Harter, c...@xxxxxxxxxxxx://home.tiac.net/~cri,http://www.varinoma.com
But the rhetoric of holistic harmony can generate into a kind of
dotty, Prince Charles-style mysticism. -- Richard Dawkins
The problem though is that it is very non-sequential. Is
this as close to sequential as one can get it, or is there
better? I want to minimize the amount of disk operations,
especially seeks, as much as possible. It's the _disk_
part that's the real chokepoint here.
The point of recursive partition is make transposition close to
sequential (almost) as possible. Think about it. When you swap
B and C you only have n breaks in sequentiality for each of B and
C for a total of 2*n breaks in sequentiality for M. Follow the
recursion and you get O(n*log(n)) breaks in sequentiality and
O((n^2)*log(n)) moves. The scheme increases number of data moves
(which are fast when made sequentially) in exchange for a
decrease in the breaks in sequentiality.
Since the large majority of moves are made sequentially the disk
operations are reduced substantially compared to the naive
algorithm. It probably is pretty close to as good as you can do.
Richard Harter, cri@xxxxxxxx
http://home.tiac.net/~cri, http://www.varinoma.com
But the rhetoric of holistic harmony can generate into a kind of
dotty, Prince Charles-style mysticism. -- Richard Dawkins
.
- Follow-Ups:
- Re: FFT Multiplication on Disk?
- From: mike3
- Re: FFT Multiplication on Disk?
- References:
- FFT Multiplication on Disk?
- From: mike3
- Re: FFT Multiplication on Disk?
- From: Ben Pfaff
- Re: FFT Multiplication on Disk?
- From: mike3
- Re: FFT Multiplication on Disk?
- From: Richard Harter
- Re: FFT Multiplication on Disk?
- From: mike3
- FFT Multiplication on Disk?
- Prev by Date: Re: FFT Multiplication on Disk?
- Next by Date: Re: FFT Multiplication on Disk?
- Previous by thread: Re: FFT Multiplication on Disk?
- Next by thread: Re: FFT Multiplication on Disk?
- Index(es):