Re: Execution times running multiple instances of an application



Dr Ivan D. Reid wrote:
On Mon, 26 Nov 2007 12:55:56 -0500, deltaseq0 <deltaseq0@xxxxxxxxxx>
wrote in <JMD2j.18$gL1.13@xxxxxxxxxxxx>:
I'm doing some testing of an application using a Pentium 4 HT cpu. The
application runs under Cygwin using the XWin X Server and an xterm window.
It takes 1.9 hrs to complete using gcc 4.2.1 configured with thread model:
single. When I look at Windows Task Manager, it shows a cpu usage of 50%
with no other apps running at the time. From a previous optimization thread,
I tried running 2 instances of the application at the same time by opening
up another window under XWin. Task Manager shows 100% cpu usage, but instead
of completing the 2 instances in about 2 hours, it takes 3.9 hours; slightly
more than the 3.8 hours it would take to run them sequentially. Was
this to be expected?

I have seen instances where a programme has completely slowed down
due to its "hopping" from one CPU to another and (presumably) losing its
cache in the process -- although this may have been on a proper dual-core
rather than a P4. You can check if it makes a difference in Task Manager --
in the Processes tab, right-click on the programme and select "Set Affinity"
from the drop-down menu. Set one copy of the programme to run on CPU 0
and the other to run on CPU 1 and see if it runs any faster.
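
For anyone who would rather script the affinity experiment than click through
Task Manager, here is a minimal sketch. It assumes a Python build that exposes
os.sched_setaffinity (Linux does; native Windows Python does not, and there
Task Manager or the third-party psutil package's Process.cpu_affinity would be
the route). The helper name pin_to_cpu is made up for illustration.

```python
import os


def pin_to_cpu(cpu_index, pid=0):
    """Try to restrict process `pid` (0 = the calling process) to one logical CPU.

    Returns the affinity set in effect afterwards, or None when this Python
    build has no sched_setaffinity (an assumption to check on your platform).
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    allowed = os.sched_getaffinity(pid)
    if cpu_index not in allowed:
        # Requested CPU is not available to this process; leave affinity alone.
        return set(allowed)
    os.sched_setaffinity(pid, {cpu_index})
    return set(os.sched_getaffinity(pid))


if __name__ == "__main__":
    # Pin this process to logical CPU 0, then start the workload from here.
    print(pin_to_cpu(0))
```

Run one copy pinned to logical CPU 0 and the other to logical CPU 1, then
compare wall-clock times against the unpinned runs.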

As the OP is running on a single-core HyperThreaded processor, where both
jobs use the same L2 cache, the most likely cache problem is contention
for cache resources. Swapping L1 is not as big a problem.
My primary use for HyperThreading is running the Cygwin gcc/gfortran
testsuite, where most of the time is spent on the extremely slow disk
operations.
Under Windows, you are lucky if raising the Task Manager meter from 50%
to 100% gets you a 20% increase in throughput with HT. It may approach
a 50% increase in throughput with a recent Linux, in cases where your
application runs threads without much increase in memory requirements.
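
To put the OP's numbers on the same scale as those percentages, here is a
rough worked calculation. It uses only the timings quoted in this thread
(1.9 hours for one instance, 3.9 hours for two at once); the function names
are made up for illustration.

```python
def throughput_gain(single_run_hours, concurrent_hours, instances=2):
    """Fractional throughput change from running `instances` copies at once,
    relative to running the same copies one after another."""
    sequential_hours = instances * single_run_hours
    return sequential_hours / concurrent_hours - 1.0


def hours_for_gain(single_run_hours, gain, instances=2):
    """Wall-clock hours for `instances` concurrent copies at a given gain."""
    return instances * single_run_hours / (1.0 + gain)


# Timings reported by the OP: 1.9 h alone, 3.9 h for two concurrent copies.
print(f"{throughput_gain(1.9, 3.9):+.1%}")   # -2.6%

# What a 20% throughput increase would have looked like for the same two jobs.
print(f"{hours_for_gain(1.9, 0.20):.2f} h")  # 3.17 h
```

So the two concurrent instances came out about 2.6% behind the sequential
case, where a 20% gain would have finished both in roughly 3.2 hours.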