Re: ntdll.dll shown to take 30% in profiler



I finally got to this problem, but so far I wasn't able to get it right.
I attempted to implement your suggestion of recycling threads instead of
creating and destroying them all the time.
In the original design the threads were created in suspended state. Then a
loop called Resume for each thread. The next loop called Thread.WaitFor and
then Free for each thread. The thread Execute method just performed the
calculation. This design worked, but apparently suffered the slowdown that I
was trying to solve.
In my first attempt at an improvement I just skipped the create/free part for
the threads. I created all the threads on startup. The calculation was
performed once for each thread, but the next time I called Resume, nothing
happened. Apparently, after a thread finished it would no longer go to the
start of Execute.
My next attempt was to put an infinite loop in Execute (until program
termination) and use a TEvent to signal that the calculation had finished.
//Execute
while true{not Terminated} do begin
  DoCalculation;
  Event.SetEvent;
  Suspend;
end;

The application thread would reset the events and call Resume for each
thread. The next loop called Thread.Event.WaitFor for each thread.
Unfortunately the application thread waited forever (I have no idea why).

What is the right way of doing this?


"Leo" <nospam@xxxxxxxxxx> wrote in message
news:465fcf2f@xxxxxxxxxxxxxxxxxxxxxxxxx
Thanks all,
I will try this and see if it helps.
Also, how do I set thread affinity?

"Eric Grange" <egrangeNO@xxxxxxxxxxxxxxx> wrote in message
news:465e7b37$1@xxxxxxxxxxxxxxxxxxxxxxxxx
I create a number of threads equal to the number of physical
cores and give each thread a part of the calculation.
[...]
The calculations can sometimes be long. In such cases the time
taken by ntdll.dll is minimal. In other situations the
calculations are short and there can be hundreds each second.

Thread creation is quite a slow operation, and given the symptoms you
describe, I think, like others, that it likely accounts for your ntdll time.

The most probable reason you wouldn't see the call point is that it could
be thread initialization work internal to the kernel, happening before the
Delphi code itself can run in that thread, or thread cleanup, happening
after the Delphi code has run. In both cases, the actual call point would be
something in the NT kernel (which SamplingProfiler doesn't track).

Will using the same thread objects rather
than creating and destroying them all the time help?

It's preferable to create your worker threads at startup, and never
destroy them, just put them to sleep or have them wait on some signal when
they're done.
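
For illustration, here is a minimal sketch of a worker that is created once
and then waits on a signal between calculations, instead of being suspended
and resumed. It is not the original poster's code: TCalcThread, StartWork,
WaitDone and the squaring "calculation" are hypothetical names and stand-ins,
and it assumes the classic TThread/TEvent classes from Classes and SyncObjs.
It uses two auto-reset events per thread (one for "work is ready", one for
"work is done"). One possible explanation for a Suspend/Resume loop hanging
is that Resume gets called before the worker has reached its own Suspend, in
which case the Resume is lost; waiting on an event avoids that window because
a signaled event stays signaled until a wait consumes it.

```pascal
program WorkerDemo;
{$APPTYPE CONSOLE}

uses
  Windows, Classes, SyncObjs, SysUtils;

type
  // Hypothetical worker class; all names here are illustrative.
  TCalcThread = class(TThread)
  private
    FStartEvent: TEvent;  // set by the main thread: "new work is ready"
    FDoneEvent: TEvent;   // set by the worker: "this pass is finished"
    FInput: Integer;
    FOutput: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create;
    destructor Destroy; override;
    procedure StartWork(AInput: Integer);
    procedure WaitDone;
    property Output: Integer read FOutput;
  end;

constructor TCalcThread.Create;
begin
  // Auto-reset events (ManualReset = False): one SetEvent releases one wait.
  FStartEvent := TEvent.Create(nil, False, False, '');
  FDoneEvent := TEvent.Create(nil, False, False, '');
  inherited Create(False);  // runs immediately; Execute blocks on FStartEvent
end;

destructor TCalcThread.Destroy;
begin
  Terminate;
  FStartEvent.SetEvent;     // wake the worker so it can notice Terminated
  inherited Destroy;        // TThread.Destroy waits for Execute to return
  FStartEvent.Free;
  FDoneEvent.Free;
end;

procedure TCalcThread.Execute;
begin
  while not Terminated do
  begin
    FStartEvent.WaitFor(INFINITE);  // sleep until work arrives
    if Terminated then
      Break;
    FOutput := FInput * FInput;     // stand-in for DoCalculation
    FDoneEvent.SetEvent;            // tell the main thread this pass is done
  end;
end;

procedure TCalcThread.StartWork(AInput: Integer);
begin
  FInput := AInput;       // written before the event is signaled
  FStartEvent.SetEvent;
end;

procedure TCalcThread.WaitDone;
begin
  FDoneEvent.WaitFor(INFINITE);
end;

var
  Workers: array[0..3] of TCalcThread;
  I, Pass: Integer;
begin
  for I := Low(Workers) to High(Workers) do
    Workers[I] := TCalcThread.Create;
  try
    for Pass := 1 to 3 do
    begin
      // Hand out work to every thread, then collect the results.
      for I := Low(Workers) to High(Workers) do
        Workers[I].StartWork(Pass * 10 + I);
      for I := Low(Workers) to High(Workers) do
      begin
        Workers[I].WaitDone;
        Writeln('pass ', Pass, ', worker ', I, ': ', Workers[I].Output);
      end;
    end;
  finally
    for I := Low(Workers) to High(Workers) do
      Workers[I].Free;
  end;
end.
```

The same threads are reused for every pass, so thread creation and cleanup
happen only once, at startup and shutdown.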
If you're "the" major user process running on the machine, you should
probably also manually adjust the thread affinities so that the kernel
won't flip-flop them between cores in a hopeless attempt at load
balancing.
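
As a rough sketch of setting the affinity, the Win32 function
SetThreadAffinityMask (declared in the Windows unit) takes a thread handle
and a bit mask in which bit N stands for logical processor N. The helper
PinThreadToCore and the TIdleThread class below are hypothetical names used
only for this example, and the choice of core 0 is arbitrary. The call fails
if the mask names a processor that is not in the process affinity mask.

```pascal
program AffinityDemo;
{$APPTYPE CONSOLE}

uses
  Windows, Classes, SysUtils;

type
  // Placeholder thread that only idles until asked to terminate.
  TIdleThread = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TIdleThread.Execute;
begin
  while not Terminated do
    Sleep(10);
end;

// Hypothetical helper: restrict AThread to one logical processor.
procedure PinThreadToCore(AThread: TThread; CoreIndex: Integer);
begin
  // SetThreadAffinityMask returns the previous mask, or 0 on failure.
  if SetThreadAffinityMask(AThread.Handle, 1 shl CoreIndex) = 0 then
    RaiseLastOSError;
end;

var
  T: TIdleThread;
begin
  T := TIdleThread.Create(True);  // created suspended so it can be pinned first
  try
    PinThreadToCore(T, 0);
    T.Resume;                     // newer Delphi versions prefer T.Start
    Sleep(50);
  finally
    T.Free;                       // terminates the thread and waits for it
  end;
end.
```

With one worker per core, each worker would be pinned to its own index, for
example PinThreadToCore(Workers[I], I).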

Eric



