Re: FastCode CPUID
From: Roelof Engelbrecht (roelof_at_NOSPAM.tca.net)
Date: 09/03/04
- Previous message: John Herbster: "Re: Fastcode RoundToEX B&V 0.3"
- In reply to: Dennis: "Re: FastCode CPUID"
- Next in thread: Dennis: "Re: FastCode CPUID"
- Reply: Dennis: "Re: FastCode CPUID"
- Reply: Dennis: "Re: FastCode CPUID"
Date: Thu, 2 Sep 2004 17:02:41 -0500
"Dennis" <marianndkc@home3.gvdnet.dk> wrote:
> L1 cache size and L2 cache size is enough.
OK. I'm thinking about something like this as a global variable in
FastCodeCPUID, initialized in the initialization section:
type
  TCPU = record
    Vendor: TVendor;
    EffFamily: Byte;      //ExtendedFamily + Family
    EffModel: Byte;       //(ExtendedModel shl 4) + Model
    CodeL1CacheSize,      //KB
    DataL1CacheSize,      //KB
    L2CacheSize: Integer; //KB
    InstructionSupport: TInstructionSupport;
  end;
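To make the two formulas in the comments concrete, here is a minimal sketch of decoding a raw CPUID function 1 EAX signature. The routine name DecodeSignature and the sample value are assumptions for illustration only, not part of FastCodeCPUID. The bit positions are the standard CPUID layout, and the arithmetic follows the record comments exactly as written.

```pascal
program EffFamilyModelSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

// Hypothetical helper: splits a CPUID(1).EAX signature into the
// effective family and model, using the formulas from the TCPU comments.
// Standard field layout: Model = bits 4..7, Family = bits 8..11,
// ExtendedModel = bits 16..19, ExtendedFamily = bits 20..27.
procedure DecodeSignature(Signature: Cardinal; out EffFamily, EffModel: Byte);
var
  Family, Model, ExtFamily, ExtModel: Cardinal;
begin
  Model     := (Signature shr 4) and $F;
  Family    := (Signature shr 8) and $F;
  ExtModel  := (Signature shr 16) and $F;
  ExtFamily := (Signature shr 20) and $FF;
  EffFamily := Byte(ExtFamily + Family);        // ExtendedFamily + Family
  EffModel  := Byte((ExtModel shl 4) + Model);  // (ExtendedModel shl 4) + Model
end;

var
  F, M: Byte;
begin
  // $00000F27 is a sample signature (family 15, model 2), chosen for illustration.
  DecodeSignature($00000F27, F, M);
  WriteLn('EffFamily = ', F, ', EffModel = ', M);
end.
```

For the sample value this prints EffFamily = 15, EffModel = 2. The real unit would of course obtain the signature from the CPUID instruction rather than a constant.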
I'll probably add CPU speed as well. It may come in handy for benchmarking
and other purposes. FastCode functions can then just refer to CPU.L2CacheSize
to adapt to the L2 cache size.
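As a rough sketch of how a function might consult the global record, the program below fills in CPU with hard-coded sample values and checks whether a working set fits in L2. TVendor and TInstructionSupport are not defined in this message, so the declarations here are placeholders, and FitsInL2 is a hypothetical helper. In the real unit the values would come from CPUID in the initialization section.

```pascal
program CpuRecordSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Placeholder types (assumptions); the real definitions are not shown here.
  TVendor = (cvUnknown, cvIntel, cvAMD);
  TInstructions = (isMMX, isSSE, isSSE2, isSSE3);
  TInstructionSupport = set of TInstructions;

  TCPU = record
    Vendor: TVendor;
    EffFamily: Byte;      // ExtendedFamily + Family
    EffModel: Byte;       // (ExtendedModel shl 4) + Model
    CodeL1CacheSize,      // KB
    DataL1CacheSize,      // KB
    L2CacheSize: Integer; // KB
    InstructionSupport: TInstructionSupport;
  end;

var
  CPU: TCPU;

// Hypothetical helper: True when a working set of the given size
// (in bytes) is no larger than the detected L2 cache.
function FitsInL2(WorkingSetBytes: Integer): Boolean;
begin
  Result := WorkingSetBytes <= CPU.L2CacheSize * 1024;
end;

begin
  // Sample values standing in for real CPUID detection.
  CPU.Vendor := cvIntel;
  CPU.L2CacheSize := 512;
  CPU.InstructionSupport := [isMMX, isSSE, isSSE2];

  WriteLn('64 KB fits in L2: ', FitsInL2(64 * 1024));
  WriteLn('1 MB fits in L2:  ', FitsInL2(1024 * 1024));
end.
```

A function could use such a check to pick between, say, a cache-resident code path and one that assumes main memory traffic.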
> The least thing I should do is to make sure there is a proper description
> at the homepage of the targets and which processors map to each target.
Agreed. I'll look at it and give you my comments.
> I do not find the target names too bad, perhaps except Opteron.
>
> We should also remember that I actually benchmark on P4 1600 Northwood, P4
> 2800 Prescott, XP 2500+ Barton, Opteron 1400 and P3 1400 Celeron only.
> These processors are the basic target set. Then I get Pentium M results
> from helpers - thanks. I need to get a Pentium M - Banias or Dothan.
I have a Pentium M Banias, so I can help out when needed.
> If we claim that a P4N winner is optimized for Xeon X we are actually
> lying.
Not really, if it is a Prestonia (DP) or Gallatin (MP) Xeon. The Northwood,
Prestonia, and Gallatin processors are exactly the same, except for the L2
and L3 cache sizes.
Processor                    L2 cache   L3 cache
Northwood Celeron            128 KB     none
Northwood Mobile Celeron     256 KB     none
Northwood Pentium 4          512 KB     none
Northwood Mobile Pentium 4   512 KB     none
Prestonia Xeon DP            512 KB     none or 1 MB
Gallatin Pentium 4 EE        512 KB     2 MB
Gallatin Xeon MP             512 KB     1 MB, 2 MB, or 4 MB
So, if the code (and related data) fits in 128 KB of L2 cache (the minimum
available on the P4 non-SSE3 architecture), all these processors will
operate essentially the same, because their "engines" are the same.
Differences appear only when more than 128 KB of L2 cache, the L3 cache,
and/or main memory is accessed, but you cannot really optimize for that
because there are too many permutations.
> I would like to improve/expand the set of machines I benchmark and
> validate upon, but I am short on money ;-)
If the function stays within the minimum L2 cache size available on each
architecture, you only need to benchmark on one processor per architecture.
> Should the function pointer be named CompareTextFastcode, CompareTextFC or
> CompareText?
>
> I will update the library user guide / design guide when we decide it.
Leaving it CompareText is probably the easiest, because existing code will
compile without changes. However, if SysUtils (which contains the RTL
CompareText) is listed after FastcodeCompareTextUnit in the uses list, then
the SysUtils version will be called. You can always get past this by
qualifying the call as FastcodeCompareTextUnit.CompareText, but it is
probably easier just to ensure that the FastCode libraries are listed after
the RTL libraries in the uses list. The original RTL CompareText will still
be accessible through SysUtils.CompareText. My second choice would be to
call it fcCompareText.
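A small sketch of the name resolution involved, squeezed into a single program for brevity. Here a function pointer named CompareText is declared after the uses clause, so it hides the SysUtils routine in much the same way a unit listed later in the uses list would. TCompareTextFunction and CompareTextSketchImpl are made-up names standing in for the real FastCode declarations.

```pascal
program CompareTextNamingSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Hypothetical type for the function pointer.
  TCompareTextFunction = function(const S1, S2: string): Integer;

// Hypothetical stand-in for a target-specific winner; it simply
// forwards to the RTL routine through its unit-qualified name.
function CompareTextSketchImpl(const S1, S2: string): Integer;
begin
  Result := SysUtils.CompareText(S1, S2);
end;

var
  // This identifier is declared after SysUtils comes into scope, so an
  // unqualified CompareText now refers to the function pointer.
  CompareText: TCompareTextFunction;

begin
  // In the real library this would be chosen in the initialization
  // section according to the detected CPU.
  CompareText := CompareTextSketchImpl;

  // Unqualified call: goes through the function pointer.
  WriteLn(CompareText('Delphi', 'DELPHI'));

  // Qualified call: still reaches the original RTL routine.
  WriteLn(SysUtils.CompareText('Delphi', 'DELPHI'));
end.
```

Both calls print 0, since the comparison is case-insensitive. The point is only that existing code calling CompareText compiles unchanged, while the RTL version stays reachable by its qualified name.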
> > To make the function names a little shorter (and more distinct) you can
> > perhaps use "_fct" instead of FastCode, yielding function names such as
> > CompareText_fctP3 and CompareText_fctPM.
>
> I would prefer FC for Fastcode, but I like the long names and "hate"
> underscores ;-)
How about fcCompareTextP3 and fcCompareTextPM? We can also go to
fcCompareTextP4SSE3 instead of fcCompareTextP4_SSE3. I really don't care
that much...
> I am sure that the Delphi community is looking forward to have some more
> libraries :-) I need all the help I can get on building them. Send some to
> me as soon as you finish them. Put your name in them too.
Will we have individual libraries for each function, or will we ultimately
combine libraries by function, for example FastCodeMath for all the Math
stuff and FastCodeText for all the Text stuff?
Roelof