Re: About MMX\SSE2 PSADBW instruction ?

From: Ivan Korotkov (koroNOSPAMtkov2_at_ztelDOT.ru)
Date: 05/06/04


Date: Thu, 6 May 2004 17:52:08 +0000 (UTC)


> 2) the main size is 1024 char array... But I don't understand why the mod
> 4096 addresses of pvec1 and pvec2 could get performance penalties?

The cache is being thrashed. On your CPU, L2 should be 8-way associative, so it
can hold at most 8 64-byte lines whose addresses share the same 12 LSBs.
Frequently alternating between such addresses can degrade performance, because
the lines competing for those slots keep being evicted and reloaded.
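
To make the mapping concrete, here is a rough sketch in C of how an address is
assigned to a cache set. The helper names (cache_set_index, same_cache_set) are
made up for illustration, and the geometry you pass in (line size, number of
sets) is an assumption you have to check against your own CPU's documentation.
With 64-byte lines and 64 sets, for example, any two addresses that are equal
mod 4096 land in the same set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: index of the set an address maps to in a simple
   set-associative cache. line_size and num_sets are assumed parameters,
   not values read from the hardware. */
static size_t cache_set_index(uintptr_t addr, size_t line_size, size_t num_sets)
{
    assert(line_size != 0 && num_sets != 0);
    return (size_t)((addr / line_size) % num_sets);
}

/* Two addresses compete for the same ways when their set indices match. */
static int same_cache_set(uintptr_t a, uintptr_t b,
                          size_t line_size, size_t num_sets)
{
    return cache_set_index(a, line_size, num_sets) ==
           cache_set_index(b, line_size, num_sets);
}
```

Calling same_cache_set((uintptr_t)pvec1, (uintptr_t)pvec2, 64, 64) on your two
buffers would tell you whether they collide under that assumed geometry. If
they do, offsetting one buffer by a multiple of the line size that is not a
multiple of 4096 should move it to a different set.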

Ivan


