Re: efficiency change by dynamic allocation in microprocessor
- From: "dungglee" <dungglee@xxxxxxxxxxx>
- Date: Wed, 31 Aug 2005 05:32:25 -0500
Thank you for your kind advice.
I have been programming matrix library with fixed size array.
In example, I have made a structure like this:
typedef struct {
int row;
int col;
double M[1000];
} Matrix;
And I have used matrix as follows:
Matrix m_A, m_B, m_C;
...
void function_name(Matrix &A,Matrix &B)
{
...
}
void main(void)
{
...
function_name(&A,&B);
...
}
Of course, I am afraid of increasing of cache miss rate caused by a lot of
matrix structure used (with big -1000- size of array).
Do you mean should I use just one array such as A[100000].
and then I should utilize the adequate portions of array for each matrix
using index technique?
or my mehtod is right?
regards.
J.J.
>This of course depends how often the new/delete is called. If
>temporary objects are created for each matrix operation, the dynamic
>memory handling could easily consume most of the time. The advantage
>with allocating arrays of the correct size is that all elements are
>close together and with cache and/or virtual memory, quite high hit
>rates can be achieved with few cache conflicts.
>
>While a large fixed 2-D array does not have run time overhead.
>However, if only part of each line and column is used, the actual data
>is distributed all around the big array into small segments. This can
>be bad for the cache performance. At least make sure that each
>segment starts at the beginning of a cache lane.
>
>Especially with primitive cache algorithms, such as the single
>associative 1 to 1 mapping, a physical address can only be loaded into
>a specific cache address. If the fixed 2-D array allocation is done
>carelessly, two adjacent lines or columns, which are in wildly
>separate memory segments could map to the same cache address, causing
>frequent reloads and dramatically dropping the performance. Using
>array dimensions that are _not_ a power of 2 usually avoids this
>problem. Anyway, it is important to know how the cache system works on
>a specific processor and design the table allocation accordingly.
>
>One way around this would be to allocate just a big 1-D vector and at
>each matrix reference calculate the actual 1-D element index. In this
>way all the used elements would be close together. For each matrix,
>define a macro
>
>#define A(i,j) a[(i)*a_dim1+(j)]
>
>To compare the performance with fixed 2-D allocation or dynamically
>allocated 2-D, simply redefine
>
>#define A(i,j) a[i][j]
>
>or even
>
>#define A(i,j) this->a[i][j]
>
>
>By defining macros for each matrix,you could always write the program
>nearly in good old FORTRAN style :-) e.g. like
>
> C(1,4) = A(2,4) + B(3,1) ;
>
>After performance tests the best implementation could be selected,
>without making the program hard to read.
>
>Paul
>
>
This message was sent using the comp.arch.embedded web interface on
www.EmbeddedRelated.com
.
- References:
- efficiency change by dynamic allocation in microprocessor
- From: dungglee
- Re: efficiency change by dynamic allocation in microprocessor
- From: Paul Keinanen
- efficiency change by dynamic allocation in microprocessor
- Prev by Date: OS for 8 bits processor
- Next by Date: Re: OS for 8 bits processor
- Previous by thread: Re: efficiency change by dynamic allocation in microprocessor
- Next by thread: How to compare computing power of microcontrollers?
- Index(es):
Relevant Pages
|