Re: Time this code, please!




KiLVaiDeN wrote:
> It's pretty funny to see you, Randall Hyde, the guy who supposedly
> knows everything about assembly, ask people to test a program for
> speed. The question is : Ain't you sufficiently qualified to time your
> opcodes yourself ?

???
Sorry, I don't own a set of machines with every x86 CPU ever made.
Indeed, as I pointed out in my post, my current crop of machines are
all PIV machines. Whether I am personally qualified or not to run the
tests, I don't have the experimental environments to do such tests.


> I then don't know the need for _you_ who has supposedly the knowledge
> to calculate the complexity of those programs,

???
What on Earth are you talking about?
You do realize that "counting cycles" (which is what I assume you mean
when you say "calculate complexities") is sufficiently
non-deterministic across the x86 family that it's a waste of time to do
it, don't you? At least I'm smart enough to realize that such analysis
is a waste of time.


> to ask people to run a
> flawed "benchmark".

Really?
Feel free to point out the flaws.

> It's totally unprofessional.

How would you know? Have you run the benchmark several times on a
single machine and verified that you get random results? I *have* run
the program multiple times. And the results were *very* consistent.
Sure, there are minor variations because of interrupts and
task-switching, but the end result is that the performance ratios
between the three different algorithms remain unchanged.

> Results are random,

No, they are not. They are highly correlated, in fact.

> since they are produced on machines that may be running other
> applications, in a multi threaded environment.

You obviously didn't look at the code before making these statements.
Just so you understand (so you don't continue to stick your foot in
your mouth with such outrageous proclamations), each individual
conversion is timed and the result is added to the running sum for that
particular algorithm. An individual conversion is *unlikely* to be
interrupted by some other process. And for the ones that *do* get
interrupted, the interruptions are (likely) distributed across all the
algorithms. Given that this program does four billion conversions (and
runs for 30 minutes, or more), the effects of other system processes
get averaged over all the algorithms.
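
The timing scheme described above can be sketched roughly as follows. This is a minimal illustration in Python, not the benchmark's actual code (which is not reproduced in this post). The names `time_interleaved`, `algo_a`, and `algo_b` are hypothetical, and the two placeholder functions merely stand in for the conversion routines under test.

```python
import time

def algo_a(n):
    # Placeholder for one conversion routine (hypothetical).
    return str(n)

def algo_b(n):
    # Placeholder for a competing conversion routine (hypothetical).
    return "%d" % n

def time_interleaved(algorithms, inputs):
    """Time every call individually and keep one running sum per algorithm.

    The algorithms take turns on each input, so a stray interrupt or task
    switch hits whichever call happens to be running. Over many iterations
    that noise is spread across all of the running sums.
    """
    totals = {name: 0 for name in algorithms}
    for value in inputs:
        for name, fn in algorithms.items():
            start = time.perf_counter_ns()
            fn(value)
            totals[name] += time.perf_counter_ns() - start
    return totals

if __name__ == "__main__":
    totals = time_interleaved({"a": algo_a, "b": algo_b}, range(100_000))
    baseline = max(totals["a"], 1)  # guard against a zero reading
    for name, ns in totals.items():
        print(f"{name}: {ns} ns total, ratio to a = {ns / baseline:.2f}")
```

What matters in a comparison like this is the ratio between the running sums, not the absolute totals, which is why a small amount of evenly spread noise does not change the ranking.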

And, of course, you'll note in my original post that I specified the
benchmark to be run by itself on the system -- those who followed these
instructions will only have system processes affecting their results.
And most system processes do not consume an inordinate amount of time.



> No offense meant, but I'm therefore quite skeptical about your own
> capabilities...

As usual, your ignorance of what is really going on leads you to
believe one thing when, in fact, all you're doing is demonstrating your
own ignorance all over again.

> And I won't even waste the little effort on my CPU to
> run your flawed test; I have more respect for its lifetime than for
> your "toy benchmarks".

Well, the good news is that I don't really need your results, do I?
Fortunately, there are enough people out there who are interested in
some scientific results to run the tests on their machine. And the
results are in -- the div by reciprocal algorithm fares better than the
div by sub algorithm, at least on every processor tested thus far except
the PIV. I seriously doubt your input would really change my knowledge of
this very much.
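
For readers who have not seen the two techniques, here is a minimal sketch of dividing a 32-bit unsigned value by 10 both ways. It is an illustration in Python, not the code that was benchmarked, and the benchmarked routines may well differ. The function names are hypothetical, and the subtraction loop is a naive stand-in. The constant 0xCCCCCCCD with a 35-bit shift is the standard reciprocal for unsigned 32-bit division by 10.

```python
MASK32 = 0xFFFFFFFF

def div10_by_sub(n):
    """Quotient and remainder by repeatedly subtracting 10 (naive stand-in)."""
    q = 0
    while n >= 10:
        n -= 10
        q += 1
    return q, n

def div10_by_reciprocal(n):
    """Quotient and remainder via multiply-and-shift, with no divide.

    0xCCCCCCCD is ceil(2**35 / 10). The product fits in 64 bits, which is
    what a 32x32 -> 64 bit MUL produces on x86.
    """
    n &= MASK32
    q = (n * 0xCCCCCCCD) >> 35
    return q, n - q * 10

if __name__ == "__main__":
    for n in (0, 9, 10, 12345, 4294967295):
        assert div10_by_reciprocal(n) == divmod(n, 10)
    print(div10_by_sub(12345), div10_by_reciprocal(12345))
```

The reciprocal version trades a slow divide (or a data-dependent loop) for one multiply and one shift, so how well it does depends heavily on how fast a given CPU's multiplier is.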

BTW, you might note that Terje Mathisen, the inventor of the
reciprocal algorithm, was quite pleased with the results. I suspect you
don't know who Terje is, but rest assured that he was doing some pretty
impressive assembly language stuff back when you were just an infant.
Cheers,
Randy Hyde
