Code performance, rdtsc and code alignment



If I run the code below (which is based on an Intel performance
measuring note), the times for the three inter-rdtsc sequences
are almost always 122 cycles each. However, if the align directive is
removed, the count for the second sequence sometimes varies markedly.

Anyone else seen this effect?

--
(Reposting here to avoid the moderator delay on another group.)
James

An example of running the code without the align directive:

Calibration times 122, 122, 122
Calibration times 122, 213, 122
Calibration times 122, 193, 122
Calibration times 122, 213, 122
Calibration times 122, 234, 122
Calibration times 122, 193, 122
Calibration times 122, 193, 122
Calibration times 122, 248, 122
Calibration times 122, 193, 122
Calibration times 122, 213, 122

With the align directive:

Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122
Calibration times 122, 122, 122


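;;;;;;;;
;Calibrate the time stamp counter measurement
;
;Each of the three sequences below times the instructions between
;two rdtsc reads (a store, xor and cpuid) and leaves the low 32 bits
;of the elapsed count in subtime_1, subtime_2 and subtime_3
;(32-bit variables, not shown here).
;Note that cpuid overwrites eax, ebx, ecx and edx.

align 32
tsc_calibrate:
mov eax, [subtime_1] ;Preload cache
xor eax, eax         ;Select cpuid leaf 0
cpuid                ;Serialize before reading the counter
rdtsc                ;Start count in edx:eax
mov [subtime_1], eax ;Save low 32 bits of start count
xor eax, eax
cpuid                ;Serialize again
rdtsc                ;End count in edx:eax
sub eax, [subtime_1] ;Elapsed cycles (low 32 bits)
mov [subtime_1], eax

mov eax, [subtime_2] ;Preload cache
xor eax, eax         ;Select cpuid leaf 0
cpuid                ;Serialize before reading the counter
rdtsc                ;Start count in edx:eax
mov [subtime_2], eax ;Save low 32 bits of start count
xor eax, eax
cpuid                ;Serialize again
rdtsc                ;End count in edx:eax
sub eax, [subtime_2] ;Elapsed cycles (low 32 bits)
mov [subtime_2], eax

mov eax, [subtime_3] ;Preload cache
xor eax, eax         ;Select cpuid leaf 0
cpuid                ;Serialize before reading the counter
rdtsc                ;Start count in edx:eax
mov [subtime_3], eax ;Save low 32 bits of start count
xor eax, eax
cpuid                ;Serialize again
rdtsc                ;End count in edx:eax
sub eax, [subtime_3] ;Elapsed cycles (low 32 bits)
mov [subtime_3], eax

ret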
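
One way to investigate might be to check where each instruction of the routine lands relative to 32-byte boundaries, with and without the align directive. The C sketch below only does the address arithmetic. The helper names offset_in_block and crosses_boundary are hypothetical, and the addresses and instruction lengths would have to come from an assembler listing or a debugger. It assumes the alignment is a power of two, as with align 32. Whether boundary crossing actually explains the varying counts is not established here.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte offset of addr inside its enclosing block of `alignment` bytes.
   Assumption: `alignment` is a power of two (32 matches `align 32`). */
static size_t offset_in_block(uintptr_t addr, size_t alignment)
{
    return (size_t)(addr & (uintptr_t)(alignment - 1));
}

/* Returns 1 if the `len` bytes starting at addr extend past the end of
   the aligned block that contains addr, otherwise 0. */
static int crosses_boundary(uintptr_t addr, size_t len, size_t alignment)
{
    return len > 0 && offset_in_block(addr, alignment) + len > alignment;
}
```

For example, a 2-byte instruction at address 0x101e ends exactly on a 32-byte boundary and does not cross it, while a 3-byte instruction at the same address does.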