Re: Threading NOT working as expected



On Sat, 01 Mar 2008 17:04:08 +0100, Peter J. Holzer wrote:

On 2008-02-26 16:39, Ted Zlatanov <tzz@xxxxxxxxxxxx> wrote:
On Mon, 25 Feb 2008 12:05:09 -0800 (PST) Ted <r.ted.byers@xxxxxxxxxx>
wrote:
[number of threads should equal number of CPU cores]

This is incorrect. A modern single processor will perform well in a
multithreaded application. Just because a single thread will run
doesn't mean a single thread is doing work at any time.

Most of the time in a modern system is spent waiting for I/O and memory
access. There are some very special cases where the CPU is actually
tied up while the application runs, but memory and disk speeds have
fallen far behind CPU speeds so the CPU will usually be waiting for
something to happen. This is why modern CPUs have ridiculously large
L1 and L2 caches and many prefetching optimizations.

In a multithreaded setup, the CPU has a chance to run several threads
while memory and I/O fetches are happening.

Multithreading won't help for memory fetches. Firstly because CPUs don't
have a way to inform the OS of a slow memory access, and secondly
because the overhead of switching to a different thread would be much
too high for such a (relatively) short wait. There is one exception:
So-called multi-threading CPUs can keep the state of a fixed (and
usually low) number of threads on the CPU and switch between them. But
these are really just multi-core CPUs which share some of their units.

You are completely right about I/O of course.

Not even. If all those I/Os are going to the same disk, you run the risk
of thrashing, and overall performance goes down instead of up.

Exactly the same behaviour can be seen with processes. Suppose you have a
bunch of files, together much larger than available memory. These files
are input to a program that handles one file and writes another output
file. You can do either:

for f in *; do program "$f" "$f.out" & done; wait

or

for f in *; do program "$f" "$f.out"; done;

If the program is I/O bound, I expect the second version to be faster
than the first, although it depends on a lot of things.
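
To measure this rather than guess, both variants can be wrapped in functions and timed. What follows is only a sketch under assumptions: `program` is a hypothetical stand-in that copies its input to its output, and the names `run_parallel`, `run_sequential`, and `time_variant` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical stand-in for "program": copies input to output.
# Replace it with the real command when measuring.
program() {
    cat "$1" > "$2"
}

# Variant 1: start one background job per file, then wait for all of them.
run_parallel() {
    for f in "$@"; do
        program "$f" "$f.out" &
    done
    wait
}

# Variant 2: process the files one after another.
run_sequential() {
    for f in "$@"; do
        program "$f" "$f.out"
    done
}

# Print the wall-clock seconds a variant needs for the given files.
# Assumes a date(1) that supports +%s (GNU and BSD do).
time_variant() {
    variant=$1
    shift
    start=$(date +%s)
    "$variant" "$@"
    end=$(date +%s)
    echo "$variant: $((end - start))s"
}
```

Calling `time_variant run_sequential *` and then `time_variant run_parallel *` in the input directory prints one line per variant. Remove the `.out` files between runs, and keep in mind that the second run may be served partly from the OS file cache, so the comparison is only fair if the data really is much larger than memory.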

So think, design, and profile, profile, profile.

M4
