Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: moi <root@xxxxxxxxxxxxxxxxxxx>
- Date: Wed, 31 Oct 2007 21:54:13 +0100
On Wed, 31 Oct 2007 13:12:16 -0700, Robert Maas, see
http://tinyurl.com/uh3t wrote:
Alert: For weeks, right up through Friday aftermoon, Google Groups was
showing search results only through Oct.13, so I had no way to find more
recent articles that mentionned my name, such as followups to stuff I
previously posted. Just Friday night GG finally enabled me to find a lot
of articles before this:
Date: Fri, 19 Oct 2007 00:45:42 +0200 From: moiSince Friday I've been trying to catch up with backlog. I'm less than 6
<root@xxxxxxxxxxxxxxxxxxx>
days behind at the moment.
The
A B C D
E F G H
matrix can also be transposed by "one-touch-football" this is the
shuffle or permutation as explained on the wiki-page.
A Google search for: one-touch-football transpose matrix doesn't turn up
anything related except this thread itself. I don't know what you mean
by "the wiki-page".
Wiki-page has been referred to somewhere at the start of this (or
another, Mike has started more than one thread on the same subject)
http://en.wikipedia.org/wiki/In-place_matrix_transposition
One-touch-football was a term I invented.
One idea I saw hinted at elsewhere in this thread is to not store the
array in row-major or column-major order at all, but instead to
explicitly store as a hierarchial structure of sub-arrays that are each
just the right size for whatever unit is used for backing store. (For
disk-to-RAM paging, one cylinder per large sub-matrix, and one sector
per small sub-matrix, a three-level hierarchy; For RAM-to-cache paging,
one 'page' per sub-matrix, just a two-level hierarchy.) For example,
let's imagine a grossly simplified case where virtual memory allows four
numeric values per 'page', and you have a 4x4 array to process:
A1 A2 B1 B2
A3 A4 B3 B4
C1 C2 D1 D2
C3 C4 D3 D4
This is almost the Morton addressing.
The upper-left 2x2 sub-matrix is stored in one 'page', etc. So to
transpose, you do (in any random order) these four sub-tasks: - load
original sub-matrix A, transpose locally, store in destination A; - load
original sub-matrix B, transpose locally, store in destination C; - load
original sub-matrix C, transpose locally, store in destination B; - load
original sub-matrix D, transpose locally, store in destination D;
So how do you transpose locally? For this special case you simply swap
lower left and upper right elements. For more general case you have a
two-level loop which traverses the upper-right triangle in parallel with
the lower-left triangle.
So how do you write the toplevel algorithm to perform those four tasks?
For this special case you hardwire the calls in any order you want:
LoadLocalTransposeStore(0,0,0,0);
LoadLocalTransposeStore(0,1,1,0);
LoadLocalTransposeStore(1,0,0,1);
LoadLocalTransposeStore(1,1,1,1);
For more general case, you have a two-level loop which passes first two
indexes straight and second two indexes reversed (k = number of
partitions i.e. total size N divided by number of rows,cols in each
sub-matrix, in example above N=4, k=2).
for (i=0, i<k, i++)
for (j=0, j<k, j++)
LoadLocalTransposeStore(i,j,j,i);
Trivial, huh?
I guess you must have missed the posting with my code-snippet,
which used a similar trivial flip&swap method.
BTW for non-square matrices things become more complicated.
AvK
.
- References:
- Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: mike3
- Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: moi
- Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: mike3
- Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: moi
- Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- From: Robert Maas, see http://tinyurl.com/uh3t
- Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- Prev by Date: Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- Previous by thread: Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- Next by thread: Re: SortMerge avoids thrashing (was: Bailey's "two pass" FFT algorithm question)
- Index(es):
Relevant Pages
|