Re: Problem: Creating a raw binary string
From: Jim P (Jim_P_at_mad.scientist.com)
Date: Tue, 30 Nov 2004 11:48:43 -0800
Bruce Roberts wrote:
> "Nicholas Sherlock" <email@example.com> wrote in message
>>>I'm still trying to understand why a 64-bit cpu will be any faster
>>>for the average desktop o/s and applications. IMO it's a canard.
>>AFAICS, they can push double the data with one instruction, so
>>applications written to take advantage of that fact can nearly halve
>>their execution time.
> While it's true that a 64-bit cpu will move twice the data per instruction,
> it doesn't mean that programs can really benefit substantially from this.
> Memory bus width plays an important role here and unless it too is widened /
> made faster, doubling the required data movement per instruction is actually
> going to result in a performance loss. I doubt that 64-bit integers are
> really going to be needed by the vast majority of programs. So programmers
> either stick to 32-bit integers, thus raising all sorts of word alignment
> issues or use 64-bit integers, effectively doubling the amount of memory
> being moved without actually using it. String work, a significant part of
> most business apps, works with 8- or 16-bit chunks, so these operations are
> unlikely to realize any benefit from the increased size.
> There are application areas that will benefit from 64-bit. I just don't
> think that this move is going to contribute to a significant improvement for
> all, or even most, applications.
> I don't think that one can take the experience of moves from 8-bit to 16-bit
> to 32-bit as a guide to probable gains moving to 64-bit. In those moves the
> cpu and bus architectures were expanding to better represent work being done
> in software. IOW programs routinely used 32-bit data types even though they
> had to provide code to process these types on more limited hardware. This
> situation doesn't exist as a general case today for 64-bit, i.e. most
> programs do not make heavy use of 64-bit data types. Although as I wrote the
> last sentence it occurred to me that Currency is 64-bit and some business
> apps probably use the type quite heavily. Still, I suspect that 64-bit speed
> gains are more likely to be single digit percentages rather than in the
> 30-60% range.
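The "doubling the amount of memory being moved" point in the quoted post is easy to check with a quick size calculation (a minimal sketch; the million-element array is just an illustrative figure):

```python
# Quick check of the quoted "doubling the memory moved" point using
# Python's struct module ("=i" = 32-bit int, "=q" = 64-bit int).
import struct

int32 = struct.calcsize("=i")   # 4 bytes
int64 = struct.calcsize("=q")   # 8 bytes
print(int32, int64)             # -> 4 8

# An array of a million counters doubles from 4 MB to 8 MB if every
# field is widened to 64 bits, whether or not the extra range is used.
print(1_000_000 * int32, 1_000_000 * int64)  # -> 4000000 8000000
```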
This is a great argument - except that's not how the memory system
actually works. You are forgetting the two levels of cache in the
processor: the smaller Level 1 cache and then the larger Level 2 cache,
typically 500K or more. Memory operations are driven by cache operations,
not by individual processor requests. So the read or write of a 32-bit
or 64-bit integer goes to the cache, not directly to main memory.
The cache operations are handled by the memory controller and are
typically done in blocks, taking advantage of the structure of the
memory chips and the performance features they provide through
non-standard modes of data fetching.
The caching hardware also assumes that the next set of bytes will be
needed too (as with program code, or when walking an array). So a single
memory operation might move as much as 64 bytes. Again, note that this is
different from the processor's own reads and writes, which go to the
Level 1 cache; the cache controller then checks whether the data is
there, looks in the Level 2 cache if not, and finally evicts a block
from the cache to make room for the fetched memory block.
Once this block is in the cache - in effect, on the processor chip - the
rules change.
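The block-at-a-time behavior above can be sketched with a toy cache model (purely illustrative; the 64-byte line size is a typical figure for this era's CPUs, not a fixed rule):

```python
# Minimal sketch of cache-line granularity: the cache fills whole
# 64-byte lines, so eight consecutive 8-byte (64-bit) loads cause
# only ONE block transfer from main memory.

LINE_SIZE = 64  # bytes per cache line (assumed, era-typical)

class TinyCache:
    def __init__(self):
        self.lines = set()       # line addresses currently cached
        self.memory_fetches = 0  # block transfers from main memory

    def load(self, addr):
        line = addr - (addr % LINE_SIZE)  # start of the 64-byte block
        if line not in self.lines:
            self.lines.add(line)
            self.memory_fetches += 1      # fetch the whole block

cache = TinyCache()
for i in range(8):               # eight sequential 64-bit integers
    cache.load(i * 8)
print(cache.memory_fetches)      # -> 1: one block fill serves all eight
```

Widening those eight loads from 32-bit to 64-bit values still costs the same single line fill, which is why the bus-traffic doubling argument does not hold for sequential data.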
I am not going to go into the details here, because the designers keep
playing different games to speed up the processor, trying different
algorithms all the time as more transistors become available to
implement them. I gave up trying to keep track of the different concepts
and ideas behind this.
It is this caching operation that freed up the memory bus. The bus used
to be the total speed bottleneck, but cache has largely removed it. Now
it is not unusual to see average memory bus utilization as low as 10%,
and the larger the cache, the less bus traffic there is. That is what
allows on-board video to share the processor's main memory (note that
video is a memory hog in terms of the bandwidth it needs) with only a
small processor performance hit.
But still, when the information is not in the cache, high-speed main
memory is valuable - and the data arrives in the cache in chunks, or
blocks.
The memory chips are addressed in row/column fashion, like an X,Y
matrix. The row address is presented to the chip first and decoded onto
the row select line; the data then comes out on the columns - all
columns at once. The column (Y) address is then decoded to select which
column is desired. Note that this means all the column data is present
at the same time, so getting the next column's data is very fast. Very
fast. It is simply a matter of selecting the next column, and the chip
can do that automatically as the data is clocked out. So very fast block
transfers are possible from the memory chips. That is what actually
happens, and part of the reason a whole block is brought in rather than
a single 64-bit integer at a time.
This high-speed clocking of the data is what the advertised transfer
rates for memory chips refer to. They are not random-access rates; they
are block (burst) rates.
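The burst-versus-random difference can be put in rough numbers (the cycle counts below are assumed, era-plausible SDRAM latencies, not figures from any datasheet):

```python
# Rough arithmetic showing why burst-reading a 64-byte block beats
# eight random accesses. All timings are illustrative assumptions.

ROW_OPEN_CYCLES = 5   # assumed: row activate + row-to-column delay
CAS_CYCLES = 3        # assumed: column access latency
BURST_BEAT = 1        # one word per cycle once the row is open
WORDS_PER_LINE = 8    # a 64-byte line as eight 8-byte words

# Random access: every word pays the full row + column latency.
random_cycles = WORDS_PER_LINE * (ROW_OPEN_CYCLES + CAS_CYCLES)

# Burst access: open the row once, then clock out consecutive columns.
burst_cycles = ROW_OPEN_CYCLES + CAS_CYCLES + (WORDS_PER_LINE - 1) * BURST_BEAT

print(random_cycles, burst_cycles)  # -> 64 15
```

Under these assumptions the burst fill of a whole cache line costs roughly a quarter of what eight scattered accesses would, which is exactly why the controller fetches blocks.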
And as this has become the standard, a lot of work has been done in this
area to interface better with the cache and cache controllers in the
micro.