Re: Is intrinsic MIN inefficient?

From: James Giles (jamesgiles_at_worldnet.att.net)
Date: 05/12/04


Date: Wed, 12 May 2004 17:15:58 GMT

Steven G. Kargl wrote:
>> It is a summation of terms like
>>
>> (array(a,b) - array(a,b-1))**2
>>
>> with "b" and "b-1" changing slightly within the terms of the series. The
>> "a" and "b" arguments are unaffected by the limits of the DO loop (i.e.
>> DO i=c,d).
>
>
> How large is "a"? If it is sufficiently large, you will
> flush the memory cache for each subtraction operation.
> The elements array(a,b) and array(a,b-1) are stored
> "a" * sizeof(array(a,b)) bytes apart in memory.

Actually, the memory locations are not specified by the
language (though the requirements of storage association,
if it occurs, usually impose the same memory layout for
all implementations). Given the usual choice of storage
order, the elements array(a,b) and array(a,b-1) are
SIZE(array,1) numeric storage units apart, assuming the
data type is default REAL. The variable "a" is presumably
just an integer that's within the bounds of the first dimension
of ARRAY. It's specific value is irrelevant to the calculation
of how far apart the two elements are.

Not also that the efficiency may not be all that adversely
effected (affected? I never get those straight) by the distance
between thos two elements. It depends on how "a" and "b"
vary with "i" in the loop. For example:

   DO i=start,end
      result(i)=MIN(array(i,j)-array(i,j-1)**2, ..., 1d6)
   END DO

Here, "a" and "b" are "i" and "j" respectively. But, the cache
will load up several elements of array beginning at array(i,j)
*and* beginning at array(i,j-1). So, the program will still be
reasonably efficient. However, if the loop is:

   DO i=start,end
      result(i)=MIN(array(j,i)-array(j,i-1)**2, ..., 1d6)
   END DO

You are now cutting "across the grain" of the array. The cache
will indeed be a problem (unless it has the ability to load with
stride).

I suspect the original problem actually has entries in the MIN
function of both the above configurations. He'd probably be
better off with several separate loops (recoded as whole array
operations):

      result(start:end)=array(start:end,j)-array(start:end,j-1)**2
      result(start:end)=MIN(array(j,start:end)-array(j,start-1:end-1)**2,
result(start:end))
   ...
      result(start:end)=MIN(result(start:end), 1d6)

Now the cache has less load on it and at least *some* of the
expressions may not be "across the grain".

-- 
J. Giles


Relevant Pages

  • Re: Fast linked list
    ... > amounts of memory rather than huge chunks of it. ... random insertions into a vector/dynamic array are not as slow ... to cause a cache miss. ... some hard numbers on speed differences between lists and arrays. ...
    (microsoft.public.vc.mfc)
  • Re: Fast linked list
    ... > amounts of memory rather than huge chunks of it. ... random insertions into a vector/dynamic array are not as slow ... to cause a cache miss. ... some hard numbers on speed differences between lists and arrays. ...
    (microsoft.public.vc.language)
  • Re: Drastic slow-down in the execution time of my program
    ... calculations are done on this array. ... No extra memory is allocated. ... I think you might be on to something with the cache idea. ... The CPU is a Pentium 4 2.8GHz. ...
    (microsoft.public.dotnet.languages.vc)
  • Re: cache and smls
    ... >array * the number of loops through the array operations is a ... >it "falling out" of the cache. ... or that it has a completely different memory setup ... It's a good bet you didn't have it on the Intel chip either. ...
    (comp.lang.functional)
  • Re: Cached memory never gets released
    ... Stock linux 2.4.26 kernel. ... Due to flash bug 3M of memory gets lost due to font memory getting lost ... The output of "free" cache number steadily grows. ... longer to exhaust all of system memory with the cache. ...
    (Linux-Kernel)