Re: Assembly Language

From: Beth (BethStone21_at_hotmail.NOSPICEDHAM.com)
Date: 12/14/03


Date: Sun, 14 Dec 2003 08:59:17 -0000

ernobe wrote:
> Randy wrote:
> > Read "The Great Debate" on Webster:
> > http://webster.cs.ucr.edu.
> > Specifically,
> > http://webster.cs.ucr.edu/Page_asm/GreatDebate/GreatDebate.html
> > Cheers,
> > Randy Hyde
[ snip ]
> Rather than assume that a programmers job is to beat the machine by means
> of compilers, while considering the rather dubious distinction between
> "response time" and "overhead", it would be more reasonable to suggest that
> it is all a question of the right algorithm for the job, and since
> algorithm improvement depends entirely on a more versatile environment,
> nothing can beat assembly. "Response time" and "overhead" both refer to
> the same thing when optimizing an algorithm.

I've looked over the "Great Debate" before and it seemed to make
exactly the "right algorithm for the job" point, and excellently...stating,
for example, that assembly language often allows for "medium level"
optimisations that are difficult or not in any way apparent from
solely using HLL compilers...this point _is_ made in the articles...

Also, a further point about the subtle but crucial difference between
"response time" and "throughput" (I don't recall Randy ever referring
to it as "overhead"...and his years in academia mean that he almost
always selects the most appropriate term, unlike myself who tends to
call everything "that what-you-mah-call-it thingy" ;) is also made in
the "Great Debate"...

I don't see why one set of valid arguments should be removed in order
to _duplicate_ a point already made, simply because you might not have
read that far to see that it's already there...this logic seems a tad
subjectively insular...

Sounds more like you didn't fully appreciate these points as
made...and have just decided that rather than make the multiple
intelligent points that Randy does throughout the articles, he should
turn it into what you would have written, which would just be a short
"right algorithm for the job" essay, labouring the sole point that you
personally have thought of, as if there were no other possible
reasons...

There's simply no grounds for removing the points about "response
time" and "throughput" from the articles because these are perfectly
valid points...the two are _NOT_ the same logically, in practice or
during optimisation...they are closely related, yes, and sometimes
hard to distinguish...but are entirely _different_ concepts...better
yet, even with improving hardware, the essential difference is a
timeless measure...

As an example of this, pick the most advanced PC hardware you can get
your hands on and start up Outlook Express - the world's slowest
loading program - and see how long it takes before you get a
"response" from it...with whatever it's doing that takes at least 10
or so seconds of devoted CPU time, I'm sure it's got fantastic
"throughput" in those activities because it leaves up a totally
unresponsive window hanging half-painted for all that time, completely
ignoring that the user is sitting there waiting...I did suspect that
it could be fetching Emails and that's what was taking so long...but
it's not even doing that - which would still be annoying but a
reasonable course of action, as the user _will_ undoubtedly want to
look at their Emails or they wouldn't have started up Outlook
Express - because if you boot it up while not connected, it _still_
takes just as long...in this case, "throughput" has clearly won out
over "response time"...

And, as you are concerned about "algorithms" - rightly so - then you must
realise that a "response time" centric program is likely to have an
_entirely different algorithm_ to a "throughput" centric program...as
an example, a batch file is "throughput" friendly, while a Windows GUI
program is (or, rather, should be...but often isn't...such as Outlook
Express mentioned earlier) "response time" friendly...

The batch file has a series of commands that are executed one by one
until completion...no time is wasted checking user input because there
is none (well, okay, there's Ctrl + C to interrupt and abort a batch
file but one hopes Windows does not sit in a polling loop for it but,
instead, initiates it only when the keyboard interrupt routines
specify that both keys are down)...therefore, "throughput" is
maximised because it goes directly about its work without wasting time
doing any other work...it's a straight out race to the finish...

The Windows GUI program, on the other hand, sits idle most of the
time...there's a large "message loop" that it goes around and around,
pausing for "messages" with each loop...when a message happens, it
spends time processing the message, informing the window via its
window procedure and then returning...an awful lot of time is consumed
in message processing (stupidly so...MS really need to do something to
improve this because, when not idle, this is where the majority of
time is wasted by Windows...it's not an excuse to say "but we'll be
waiting for user input so why hurry?", as Windows _cannot_ be sure
that this is the case in all programs...for instance, games and other
applications with "real-time" or continuous operation don't obey this
rule at all and suffer terribly because Windows presumes all
applications are the original spreadsheet GUI programs without
exception...not a particularly helpful presumption when there is
effectively only _one_ spreadsheet program - Excel - and Microsoft can
do what they always do and "cheat" to make their programs respond
better than others)...and much CPU time is "wasted" on not
particularly productive tasks...constantly redrawing window contents
while dragging a window (which can set off an awful lot of messages in
the system because windows are having parts of them obscured and
revealed as it happens...and, basically, the applications are
literally "machine gunned" with paint messages...potentially hundreds
passing in only a second or so of dragging a window around)...note that
these messages and this ability to move the window are
_COSMETIC_...it all takes time _away_ from the application's true
tasks (some applications, in fact, simply just _stop_
doing what they are doing until the dragging period is over...quite a
reasonable strategy, given the "machine gun" messages and potentially a
lot of "known unknowns" in other windows underneath your application's
window - which may or may not respond to their paint commands quickly
(and it's in _their_ hands to respond to these paint
messages...neither your program nor even Windows can "hurry them
along"...and they are, to quote Rumsfeld, "known unknowns" in that the
user is at perfect liberty to put _any_ applications and _any_ number
of their windows underneath your window...it might be the bare desktop
or it might be a labyrinth of tiny windows all from different
applications, you just can't tell until it happens...hence, "known
unknowns"...you _know_ that you _don't know_ what they'll be while
programming and should presume that it'll be the worst case, in order
to ensure your application can cope with anything thrown at it))...
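
To put rough numbers on the batch file versus message loop contrast above, here's a minimal sketch in Python...it is NOT how Windows or a command interpreter actually works, just a toy simulation in made-up "ticks", and the names (run_batch, run_message_loop, dispatch_cost) are invented for illustration:

```python
from collections import deque

def run_batch(commands):
    """Throughput style: run every command back to back, never checking input."""
    ticks = 0
    for cost in commands:
        ticks += cost              # each command costs `cost` simulated ticks
    return ticks

def run_message_loop(commands, input_ticks, dispatch_cost=1):
    """Response-time style: look for pending input before every command.

    `input_ticks` are the simulated times at which the user does something.
    Returns (total_ticks, worst_wait), where worst_wait is the longest any
    input sat unhandled.
    """
    pending = deque(sorted(input_ticks))
    ticks = 0
    worst_wait = 0
    for cost in commands:
        # Handle every input that has arrived by now; dispatching is not free.
        while pending and pending[0] <= ticks:
            arrived = pending.popleft()
            worst_wait = max(worst_wait, ticks - arrived)
            ticks += dispatch_cost
        ticks += cost
    return ticks, worst_wait

work = [5, 5, 5, 5]                           # four commands, 5 ticks each
batch_total = run_batch(work)                 # 20 ticks; input at tick 2 would wait 18
loop_total, loop_wait = run_message_loop(work, input_ticks=[2])
print(batch_total, loop_total, loop_wait)     # 20 21 3
```

The loop version finishes a tick later (worse "throughput") but the user's input waits 3 ticks rather than 18 (better "response time")...same work, different priorities.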

There's a lot of difference in these two approaches..."throughput" is
prioritised far lower than "response time" in these GUI applications
(but if it's just a word processor then that's a perfectly reasonable
trade because that's a "response time" kind of application
:)...Microsoft are on record - though Maxim doesn't believe it -
specifically pointing out that the standard Windows API presumes a
certain method of operation, which was why DirectX had to be "bolted
on to the side" of Windows for the types of application that I
referred to above as being essentially different because they don't
default to falling idle if the user is not providing input (mind you,
DirectX doesn't "cure" the problem by taking a different approach -
it's still event-driven programming via "windows" - it just does some
"cutting out of the middleman" so that things get done quicker and,
therefore, the impact to the performance is less noticeable or damaging
on the application...as you talk about "right algorithm for the job",
well, DirectX isn't really that...it's the alternative approach of
hitting a problem with speed and technical improvements, hoping to
reduce the problem sufficiently that it's not a problem any more :)...

And if we were to look at the source code for Rene's RosAsm (which is
a "response time" GUI program :) and Nasm's source code (which is a
"throughput" command line program :) then - ignoring issues like
"coding style" temporarily - there'd be a fundamental difference...

But, more importantly, the scope for "optimisation" is entirely
changed...radically so...I've many times spoken of how a different
approach - _based fundamentally_ on the essential difference between
"response time" and "throughput" - could bring about highly
significant changes...the "right algorithm", I would argue, for a
GUI-based editor / assembler (something like Rene's RosAsm) is
entirely different to a command line tool (all the others,
basically)...editing and assembly need no longer be separate "phases"
one after the other...much like someone realised that Word did not
need to wait for "permission" to start its "spelling and grammar
checking" process when it could initiate them during "idle" time (of
which there's an awful lot for any program that depends entirely on
user input because humans are
heavily-sedated-snail-carrying-a-very-heavy-anvil-on-its-shell paced
compared to the machines we use ;)...this change of algorithm vastly
improves "response time" to an impressive level...basically, it is
effectively "instant compilation" from the perspective of the
user...feedback can be provided _immediately_...the whole process is
_radically_ altered by this "right algorithm" optimisation change...in
fact, at first, it's likely programmers would need to take a while to
get used to the massive difference...
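
As a rough illustration of the "assemble during idle time" idea, here's a toy sketch in Python...this is an assumption-laden model, not RosAsm's or Word's actual design: the class name IdleAssembler is hypothetical, and upper-casing a line merely stands in for real assembly:

```python
from collections import deque

class IdleAssembler:
    """Toy editor that 'assembles' changed lines whenever no input is pending."""

    def __init__(self, lines):
        self.lines = list(lines)
        self.assembled = {}                         # line index -> stand-in "object code"
        self.dirty = deque(range(len(self.lines)))  # lines still needing assembly
        self.events = deque()                       # pending user edits: (index, text)

    def post_edit(self, index, text):
        """Queue a user edit, as a keystroke handler might."""
        self.events.append((index, text))

    def step(self):
        """One turn of the loop: user input first, background work only when idle."""
        if self.events:
            index, text = self.events.popleft()
            self.lines[index] = text
            if index not in self.dirty:
                self.dirty.append(index)            # only this line needs redoing
            return "edit"
        if self.dirty:
            index = self.dirty.popleft()
            self.assembled[index] = self.lines[index].upper()  # stand-in for assembling
            return "assemble"
        return "idle"
```

The point of the design is that an edit never waits behind assembly work, and assembly costs the user nothing because it only happens in turns where there was no input anyway.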

A world of difference can result from the "response time" vs.
"throughput" perspective...there's a massive amount of difference
between them in optimisation terms because of the potential or lack of
potential for "overlapping" tasks (a great example of this concept is
the "out of order execution" in the latest x86 chips...de-serialising
things in order to select the combinations with the minimum "wait
states" necessary...that's an _algorithmic_ change,
indeed...prioritising "throughput" when it makes no difference to its
"response" (note that the "out of order execution" is limited by this
secondary requirement...that is, it must react "as if" the
instructions were still strictly serialised...it can't start sending
out data before opening the connection to a modem...but, where it
makes no essential difference to the "perceived order" of a program,
the chip automatically juggles the instructions around to whatever
best maximises its "throughput"))...this structure is the latest
"breakthrough" in CPU designs that, yet again, pushes them beyond
expectations..."just when you thought they couldn't think up something
new for the 30 year old x86 chips..." ;)...
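
For the curious, the "juggling" can be sketched as a toy scheduler in Python...a heavily simplified model, nothing like real x86 hardware: it assumes one instruction issued per cycle, made-up latencies, and that every register is written only once (so there are no renaming headaches). The function names and the sample program are invented for illustration:

```python
def in_order_cycles(program):
    """Issue strictly in program order, stalling until each instruction's inputs are ready."""
    ready_at = {}                  # register -> cycle its value becomes available
    cycle = 0
    for dest, srcs, latency in program:
        start = max([cycle] + [ready_at.get(s, 0) for s in srcs])
        ready_at[dest] = start + latency
        cycle = start + 1          # one instruction issued per cycle
    return max(ready_at.values())

def out_of_order_cycles(program):
    """Each cycle, issue the oldest waiting instruction whose inputs are ready."""
    # Results not yet produced are "ready at infinity"; anything else is an input.
    ready_at = {dest: float("inf") for dest, _, _ in program}
    waiting = list(program)
    cycle = 0
    while waiting:
        for instr in waiting:
            dest, srcs, latency = instr
            if all(ready_at.get(s, 0) <= cycle for s in srcs):
                ready_at[dest] = cycle + latency
                waiting.remove(instr)
                break              # still only one issue per cycle
        cycle += 1
    return max(ready_at.values())

# (destination, sources, latency): "a" is slow, "b" needs "a", "c" and "d" are independent of both.
program = [
    ("a", ["x"], 3),
    ("b", ["a"], 1),
    ("c", ["y"], 1),
    ("d", ["c"], 1),
]
print(in_order_cycles(program), out_of_order_cycles(program))   # 6 4
```

Every instruction still sees exactly the inputs it would have seen in strict program order - that's the "as if serialised" requirement - but the independent work fills what would otherwise be wait states.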

If you think that Randy has "made a mistake" in including these very
valid and important considerations, then you've failed to appreciate
the extent and ramifications - yes, to your precious "right algorithm"
very much so, as a focus on one rather than the other often calls for
a completely different algorithm to be used (and selecting which is
most important to an application is, indeed, an often ignored and
misunderstood thing but it makes all the difference in the world) - of
these points...they are non-trivial...

And the arguments are _timeless_...whatever the program or algorithm
or hardware, "response time versus throughput" and the structural
choices made to accommodate one or the other _always_ make a
difference to "performance versus productivity"...that is _always_
true...

On the other hand, many are citing that with RISC architecture chips,
there's a case that humans really will have difficulty even matching a
compiler generated solution, let alone improving on it (possible...but
nowhere near as easily done as on the 30 year old clunky x86
architecture ;)...the "hit it with technology" solution is, in fact,
one of the _least_ "timeless" arguments...Randy is, indeed, right to
make that point about considering the "blind them with techno-science"
solution as a "last resort", not "only port of call"...

In fact, the "right algorithm for the job" is entirely dependent on
this "response time versus throughput" argument...they are not
independent at all...the choice made entirely decides what is this
"right algorithm" we should seek...Solitaire can use "GetMessage" but
other fast, high-action games cannot simply sit there doing nothing,
just because the user hasn't moved the mouse pointer...they are
"response time" applications and _that in itself_ has already decided
the category of "right algorithm", for which half of the candidates (those
designed with a bias toward the other) are _completely inappropriate_
choices...
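
The two loop shapes can be sketched in Python, using the standard queue module as a stand-in for the Windows message queue...to be clear, this is only an analogy: the blocking get() plays the role of GetMessage, the non-blocking get_nowait() plays the role of a PeekMessage-style poll, and the function names are made up for illustration:

```python
import queue

def solitaire_loop(messages):
    """Blocking loop, in the spirit of GetMessage: do nothing until a message arrives."""
    handled = []
    while True:
        msg = messages.get()           # sleeps here while the queue is empty
        if msg == "quit":
            return handled
        handled.append(msg)

def game_loop(messages, frames):
    """Polling loop, in the spirit of PeekMessage: never block, always advance the game."""
    handled, simulated = [], 0
    for _ in range(frames):
        try:
            while True:                # drain whatever input is waiting
                handled.append(messages.get_nowait())
        except queue.Empty:
            pass                       # nothing pending: carry on regardless
        simulated += 1                 # update and render one frame
    return handled, simulated

q = queue.Queue()
q.put("click")
print(game_loop(q, frames=3))          # (['click'], 3)
```

The game loop produced three frames even though the user only did one thing...the blocking loop, given the same queue, would have handled the click and then simply sat there.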

You are being quite unreasonable here, really...you may request
(politely) that Randy devote more text space to
discussing "your favourite"...the correct algorithm choice is - and
always will be - _THE_ most important thing...so you're not wrong to
stress it and you might be valid in suggesting that it be more greatly
emphasised...but this is no "either...or" decision here...the "Great
Debate" is, in fact, specifically about trying to pin-point _ALL_ the
reasons for selecting assembly language...big and small...as long as
they are valid points and there's a _reason_ for them to be included
then they are there...

Anyway, when all is said and done, did you miss the part on those
webpages which states "if you have a well thought out article to
donate to the Great Debate, please send it in"? If you like - and it
is true that "right algorithm" is _THE_ most important consideration -
why not write this "missing article" you believe should be there?
Compile your thoughts and arguments, making the article really good
and then submit it to Randy for inclusion in the "Great Debate"...I'm
sure, if you produce a good article, Randy would be
overjoyed...looking at the website text, it sounds like he was
actually _Hoping_ that other people would "chip in" with some articles
where their expertise in a field means they can truly capture the
issue properly...

The opportunity is right there for you, in fact...and has been there
all along, actually _encouraged_ by Randy...if you think there's more
to be said, then write the article and say it...

I myself might have written something for the "Great Debate"...but,
with my writing style (not just the weird way I format it but also my
tendency to waffle on about off-topic things, which wouldn't be right
for what's meant to be _proper_ "formal" articles :) and the fact that
I basically thought that Randy'd done a very good job and couldn't
really think of any additional points (I'd, of course, probably write
it in an entirely different way to Randy...but, in the end, it would
only be a change of style, not any new or different arguments being
put forward :)...well, maybe you disagree, but I thought that what's
there is good enough for me that I couldn't think of anything useful
that I could add to what Randy's done...BUT, if you think you _do_
have additional points about "choosing the right algorithm" that Randy
hasn't covered, then it would be really great if you wrote that
article...

But, importantly, this isn't an "either...or" situation..._ADD_ your
arguments alongside Randy's arguments...this is NOT a "competition" or
"zero sum" situation at all, as - one presumes - you're _BOTH ON THE
SAME SIDE_ here in wanting to advocate the good reasons why assembly
language still matters..."teamwork" - not "friendly fire" - is what's
called for when you're on the same side as someone else...there's more
than enough webspace in this world for _everyone's_ articles on the
subject (even including diagrams!! Ooh, you can't beat a good diagram,
especially ASCII ones ;)...

Why have I suddenly gotten all "fire and brimstone" at your comments
(and the same thing for many of Rene's comments)? Simple...you're
shooting "friendly fire" at a colleague in the same army!!! Worse,
it's not a "horrible honest mistake" in these cases, you're doing it
_deliberately_!! No wonder assembly language is losing the battle when
we have an "army" whose soldiers use each other for "target practice" and
aren't using "blanks"...by the time the "enemy" comes over the hill,
we've all killed each other except for one soldier remaining from the
ludicrously stupid "shoot out"...and, well, one man versus an army
that we probably had no chance of beating, anyway, even when we were at full
strength (well, it's "us versus hundreds of HLLs and all their hype,
money and muscle", after all ;)? Foregone conclusion...we've certainly
_lost_, without a doubt...amazing, really, because we did have
something on our side: superior firepower...we just might have scraped
through with some really good strategy and excellent "teamwork" and
"co-operation"...instead, we've blown each other all up before the
"enemy" has even come into sight on the horizon...

Indeed, "assembly is dead"; not because it deserves to "die" but
because we're the worst army ever to have been formed in its supposed
"defence"...no "teamwork" whatsoever...it's a "last man standing"
all-out war, which bizarrely includes people _who are in the same
regiment as you_...every soldier an ever-so-slightly insane "shoot
anything that moves" renegade, who's played far too much "Space
Invaders" as a kid...a case where "programmer arrogance" (yes, come
on, admit it...we all have it :) really does NOT do anyone any great
favours whatsoever...

Right, back up to the "game type" menu...move the cursor off
"deathmatch" and down to "multiplayer co-operative"...then press ENTER
;)

Beth :)