Re: Teaching (and Learning) Assembly Language, Part 2
- From: "randyhyde@xxxxxxxxxxxxx" <randyhyde@xxxxxxxxxxxxx>
- Date: 30 Aug 2005 14:22:27 -0700
Part II: Tools
--------------
When learning assembly language, it's best to have a good set of
(software and other) tools to help streamline the educational process.
As with pedagogy, the type of tools that works best depends upon the
individual's needs when learning assembly language.
Category one individuals, those who've never before programmed or used
much in the way of software development tools, are the easiest to
satisfy. Having few, if any, preconceived notions of how software
development takes place, they need to be taught almost everything,
including how to edit source files (i.e., how to use an editor), how to
invoke the software development tools (e.g., how to use an IDE or a
command-line environment), how the software development process works
(e.g., the edit-compile-link-run-debug cycle), and so on. Generally,
they have few prejudices and (assuming they are willing participants)
are willing to try things out. About all they will expect is for
generic tools, like the editor, to behave somewhat like similar tools
(e.g., word processors) that they've used in the past. That is, as long
as the tools that are vaguely familiar to them work in the same way as
other Windows applications that they've used, they're going to be
relatively happy.
Category two individuals, those who've learned how to program using
some language other than assembly, are not a blank slate. Though they
have never programmed in assembly, they are quite familiar with the
software development cycle and they've used text (programming) editors.
Such individuals *are* coming in with some preconceived notions of how
software development should be done. For example, an individual using
VB or Delphi is going to ask why such facilities are not available to
assembly language programmers. Someone who has used a nice IDE like
Visual Studio is going to ask why most assembly language development
systems operate from the command line. Someone who has used a nice
editor like Code Wright or UltraEdit32 is going to question the
limitations of a "source editor" provided with an assembler that forces
you to use that particular editor. Nevertheless, once the shock of "you
can't use the tools you used for Java/C/C++/VB/Delphi" wears off, they
can master the new tools and get down to the business of learning
assembly. Fortunately, *most* assemblers allow you to use existing
editors and IDEs to do assembly development; some are even configurable
to support development via an existing compiler, so often the category
two person doesn't even have to give up their favorite editor. Still,
most of them are going to have to learn a little bit about the command
line (which is a common annoyance). True, various IDEs are available
for most assemblers today. But the bottom line is that the person is
going to have to learn these new IDEs as well, so they aren't
completely spared. On the plus side, as cat2 people have never before
learned an assembler, they can choose *any* assembler that suits their
needs. This is a big advantage over cat3 and cat4 individuals.
Category three individuals, those who've learned assembly on a
different processor and are now learning assembly on the x86 (or vice
versa), come in with a *lot* of prejudices and preconceptions. Right
off the bat, they're going to look for an assembler that is as close as
possible to the assembler they used on other processors. A classic
example of this occurred when people transitioned from the 8085 and Z80
to the 8086 in the late 1970s and early 1980s. The syntax Intel
designed for 8086 assembly language was radically different from that
employed by most existing assemblers. The concept of segments, type
checking, semantics, and other advanced features in the Intel syntax
led to the development of "simplified" assemblers (like the CP/M-86
assembler and A86) that "avoided the red-tape directives" found in the
Intel assembler.
In truth, the features in the Intel assembler (and later, MASM, which
followed the Intel syntax) were quite powerful and advanced, but those
transitioning from the earlier processors didn't want to have to
relearn how to do assembly language, so they were happy when
"simplified" assemblers appeared that used a syntax and programming
paradigm that was closer to the assemblers they were used to using on
the 8085 and Z80 (and other assemblers at the time). Of course, the
instruction sets were completely different between the older processors
and the x86, but by limiting the new features in these simplified
assemblers, the "assembly programmers" could get up to speed faster
with these assemblers (as there was less to learn about, and most of
the things they already knew could be directly applied to these
simplified assemblers). Of course, the drawback is that these
programmers were unable to take advantage of the advanced facilities
that Intel Syntax assemblers provided.
Though the days of the 8085 and Z80 are pretty much long gone, there
still *are* some people transitioning from other CPUs (mostly embedded
CPUs) to the x86 on an occasional basis. Most of the time, the transfer
is a grudging one -- they're forced into it rather than making the move
by choice. They want to spend as little time as possible learning the new
syntax and they want to apply their knowledge as soon as possible. If
they can find an assembler whose directives and general usage are
similar to the assembler(s) they've been using on other processors,
they're very happy. In extreme cases, you get people like Herbert who
write their own assemblers so that the instruction set even mimics the
CPU they used to work on (e.g., Herbert's 68K-like x86 assembler).
Ultimately, people in the third category will have to break down and
work with an x86 assembler, but they're prone to pick one that is as
close as possible to earlier assemblers they've used rather than go
with something that is radically different. Gas, NASM, FASM, and
A86/A386 are common assemblers that cat3 people tend to use, as these
assemblers use a syntax that is similar to assemblers for other CPUs.
Category four people are the toughest to please. They've already worked
in x86 assembly and are (at least) vaguely familiar with the syntax for
a particular assembler. Chances are pretty good they're going to stick
with the assembler they started with unless that assembler is not
available for some reason (no longer sold/distributed, not available on
a new OS of interest, or the original assembler doesn't support
features the individual now needs, such as new instructions). Cat4
individuals are going to want to use an assembler that is as close as
possible to the one they were using. They are rarely willing to relearn
the syntax unless absolutely necessary (e.g., a MASM programmer who
wants to work under Linux). The number of defections from NASM to FASM,
for example, is explained by the fact that NASM and FASM are *very*
similar in syntax. Ditto for MASM/TASM defections. Rarely, though,
would you see any of these programmers switching from their assembler
to Gas (unless, of course, they need to write assembly language on an
OS that doesn't support anything other than Gas).
It should be clear that no single tool is going to please everyone. The
cat3 and cat4 people pretty much guarantee this. So if you're going to
develop an assembler in order to teach people assembly language
programming, you really need to target which category (or categories)
you're going to work with and write your assembler accordingly.
For example, if you're targeting cat4 people, and you want to have the
largest possible audience from this group, it would make sense to
create a clone of the MASM assembler. MASM, by far, has the largest
user base (especially those people who've programmed in assembly a
while back and might want to use it again). Writing a MASM-compatible
assembler that is portable across various OSes would be *very* popular
among the cat4 group. Of course, not everyone uses MASM, but the bottom
line is that NASM and FASM are relatively popular with the cat4 crowd
and they are already portable across many different operating systems;
so attempting to clone these assemblers wouldn't gain much. Another
alternative is to do as Herbert has done -- design an assembler whose
instruction set mimics that of another processor. Such a tool *might*
be interesting to other individuals coming from that same CPU (e.g.,
the 68K in Herbert's case). Of course, there are two problems with this
approach: (1) you've limited the market for your product to people who
are coming from that same CPU, and (2) some people may want to write
code that other x86 (or whatever) assembly programmers can figure out,
and mimicking some other CPU with the instruction mnemonics does not
make for readable x86 code.
For cat3 people, a generic or "universal" assembler is a good bet. Gas
is probably the best example here. Other than the actual instructions
themselves, Gas is relatively consistent from processor to processor.
Usually you'll use the same directives and pseudo-opcodes on one
processor as another. There are some differences, but by and large the
(non-machine instruction) statements the assembler accepts don't vary
that much from one Gas version to the next. Unfortunately, Gas has (in
many people's opinion) such a *bad* syntax, that most people refuse to
go this route. Nonetheless, the popularity of Linux has dramatically
increased the popularity of the Gas assembler (which comes with every
copy of Linux and is the standard assembly syntax used in the kernel
itself). NASM and FASM are also popular choices for cat3 people (as
they are similar, in principle, to assemblers available on other CPUs).
The needs of people in categories one and two are quite different from
those in categories three and four. Beginners to assembly language
programming need to be able to get up to speed as rapidly as possible.
It should be easy for a beginner to write some simple assembly language
programs without having to learn a lot of arcane syntax (and without
being "spoon-fed" the code). This, for example, is one place where the
original Intel syntax (particularly when combined with segmentation)
failed. All those "red-tape" directives that CP/M-86 and A86 assembler
programmers talked about stood in the way of a beginner writing their
first assembly language programs. Unfortunately,
Intel-syntax-compatible assemblers, like MASM, did not support the
programming language concept of "restrictability" very well.
Restrictability means that a beginner can effectively use a subset of a
language when first learning the language and still be able to write
meaningful programs. Unfortunately, with assemblers such as MASM,
you've got to learn too much arcane syntax (like how to define and use
segments) before you can write even the simplest programs.
Of course, one way to satisfy beginners is to write a trivial (or
"toy") assembler. If you don't provide many features, the beginner
won't have to learn them and the whole educational process is a lot
shorter. Of course, the problem with this approach is that beginners
will quickly outgrow the "toy" assembler as their programming skills
improve. And then they're faced with living within the limitations of
the "toy" or spending the time to learn a more powerful assembler.
Restrictability allows the best of both worlds. By restricting one's
self to a subset of the complete language when first learning the
assembler, one can learn that subset much more rapidly than learning
the syntax for the entire language. Once the beginner outgrows the
restricted subset, they can learn *more* about the assembler and their
skills will grow to match the facilities provided by the assembler.
Assemblers like MASM and TASM generally do poorly in the
restrictability department. An assembler like HLA was designed with
restrictability in mind. FASM and NASM also do a pretty good job of
supporting restrictability, though their feature sets are a bit small
(especially NASM), so they almost achieve this by not providing an
extended set of features to grow into (certainly not on the scale of
MASM/TASM/HLA).
A good example of restrictability in action for cat1 individuals would
be the high-level control structures in assemblers like HLA, MASM, and
TASM. If you're teaching a beginner assembly language, who knows
nothing about programming at all, teaching them the high-level control
structures found in these assemblers (or their macro equivalents in
other assemblers) is almost a complete waste of time. They spend
considerable time learning these statements for very little benefit (as
they will soon be told that they're "not real assembly language and
assembly language programmers don't use such statements in 'pure'
assembly code.") But the fact that these statements are present in the
language doesn't mean that the beginner is forced to learn them. Cat1
individuals can skip learning about these statements and concentrate
solely on the basic machine instructions. This is restrictability in
action. Another example might be FPU, MMX, and SSE instructions.
Beginners generally don't need to learn about these parts of the
instruction set. They can stick with a subset of the integer
instructions for a long time. Less to learn generally means easier to
learn. If someone needs FPU, MMX, or SSE functionality, they can learn
all about those instructions when they need them.
Without question, the category two group is the largest of these four
groups. Remember, all those students forced to take an assembly course
in universities have generally learned how to program in a HLL already,
but have never before seen assembly language. The number of students in
this situation far outnumbers all the other categories combined (by
several orders of magnitude, in fact). But even if we eliminate all the
students who are being forced to take the course, it's still the case
that category two is the largest group of the four. Most people
learning assembly for the first time have at least a small amount of
experience with some other HLL. If you want your tools to have the
widest appeal among beginners (or all assembly programmers, for that
matter), the cat2 crowd is the one to target. This is, for example, who
I've targeted with the HLA assembler.
Remember the cat3 and cat4 programmers? They want assembly development
tools that are as close as possible to the tools they're already using.
Cat2 programmers really aren't that much different; the main difference
is that cat2 individuals are using HLLs rather than assemblers. But
they still want tools that work the same way as their HLLs and they
want to be able to take advantage of their HLL programming experience
when learning assembly. This point should be fairly obvious, but it
took me several years to decide that this approach was in the students'
best interests when I was teaching assembly language courses in the
1989-2000 time frame. In the early 1990s (around '93, IIRC), Microsoft
created the product that would make life a whole lot easier for cat2
students: MASM v6. In particular, MASM v6 introduced the HLL-like
control structures that are common today in high-level assemblers such
as MASM, TASM, and HLA, and those HLL-like control structures are
commonly emulated with macros in most of the other assemblers. The
notion of a high-level assembler does for cat2 programmers what cat3
and cat4 programmers look for in an assembler -- it gives them an
assembler that uses a syntax, or programming paradigm if you will, that
is familiar to them based on their programming experience. The
difference between cat2 and cat3/4 is that the cat2 crowd's previous
programming experience was in a HLL.
Keep in mind that someone who is learning assembly language has a
limited amount of time available for this process. In a course, the
length of the term dictates the amount of time. For an individual
learning assembly language programming on their own, their attention
span dictates how much time is available. The job of someone providing
the tools for these people to learn with is to maximize the educational
efficiency. That is, to reduce the amount of time spent learning
assembler (or, more accurately, to maximize the amount learned within
the time available). The design of various tools can have a big impact
on the education process. And the important thing is to note that no
tool is suitable in all cases. FASM and NASM, while they are great for
people coming from traditional assemblers on other CPUs (cat3) or people
coming from MASM who want to work on different OSes with a somewhat
similar syntax (cat4), aren't particularly helpful for cat1 and cat2
people. Likewise, Gas is great for cat3 people who want to use the same
syntax across multiple OSes and CPUs. High-level assemblers, like HLA,
MASM, and TASM are great for cat2 people coming from HLLs as they get
to leverage their HLL experience while learning assembly language,
working in a more comfortable environment. HLA, in particular, supports
the concept of restrictability quite well (unlike MASM and TASM).
If you're someone who is learning (or teaching) assembly language, it's
a good idea to determine your (students') category and pick a tool
based on that category and what you want to learn. If you're writing a
tool, it's a good idea to choose which category you'd like to support,
and aim your tool specifically at that category.
For example, if you're targeting cat1 programmers, you want a complete
system that is easy to learn: not a lot of extra tools that are
confusing in scope, but something that lets programmers edit, assemble,
link, run, and debug their code with a minimum of fuss (though you do
want a tool that has the ability to grow with the user). Remember,
every minute spent learning the tool is one minute less for learning
assembly language. People have a limited attention span and you don't
want them playing around with tool configurations or other toys when
they could be learning assembly language. Eventually, they'll get tired
of the educational experience and if they've spent a whole lot of time
playing with tools like object code dumpers, cross-referencers, and
other tools unrelated to learning assembly language, then the amount of
assembly language knowledge they've gained will be reduced by a like
amount. Better to concentrate only on teaching those portions of the
tools necessary for learning assembly language and nothing else.
Likewise, if you're teaching cat2 individuals, you can assume certain
prerequisite material (they'll already be familiar with the general
software processes, basic programming idioms, etc.). You don't want to
waste time teaching people what conditional statements (IFs) are used
for, why you need loops, the purpose of basic data structures (e.g.,
arrays). Instead, you want to quickly concentrate on teaching them how
to do all these things they already know how to do in a HLL in assembly
language. You want to use tools that let them continue to use their
favorite editor(s) or development environments (if this is possible).
And, in general, you want to maximize their time learning assembly,
versus learning tools.
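As a rough illustration of "teaching them how to do what they already
know," here is a minimal sketch (my own example, not taken from any
particular course or assembler) of a loop a cat2 programmer would write
in C, followed by the same loop rewritten in compare-and-branch form.
The function names are hypothetical, and the x86 instructions in the
comments are only approximate, assuming 32-bit code with the array base
in EBX; a real assembler or compiler may well choose differently.

```c
#include <stddef.h>

/* The form a cat2 programmer already knows from an HLL. */
int sum_hll(const int *a, size_t n)
{
    int total = 0;
    for (size_t i = 0; i < n; i++) {
        total += a[i];
    }
    return total;
}

/* The same loop with the control flow spelled out as a test and
 * jumps, which is how it has to be expressed with basic machine
 * instructions. The instructions in the comments are illustrative
 * assumptions, not the output of any specific tool. */
int sum_branchy(const int *a, size_t n)
{
    int total = 0;               /* roughly: xor eax, eax           */
    size_t i = 0;                /* roughly: xor ecx, ecx           */
loop_top:
    if (i >= n) goto loop_done;  /* roughly: cmp ecx, n / jae done  */
    total += a[i];               /* roughly: add eax, [ebx+ecx*4]   */
    i++;                         /* roughly: inc ecx                */
    goto loop_top;               /* roughly: jmp loop_top           */
loop_done:
    return total;
}
```

Both functions compute the same result for any array and length, so a
student can check the second against the first before ever touching an
assembler. The point is only that the idea (a loop) is already known;
what remains to be learned is the test-and-jump encoding.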
Cat3 individuals (those who know assembly on a different processor)
will want to spend as little time as possible learning the use of
tools. They've got a job to do and they're mainly interested in getting
up to speed on the assembler's syntax and the use of the basic machine
instructions as rapidly as possible so they can complete whatever task
they've got to do. Rarely would such individuals want to spend an
inordinate amount of time learning the ins and outs of a particular
tool set.
Cat4 individuals are probably more willing to spend time with their
tools because they already know (some) x86 assembly language. Still,
they are interested in getting up to speed as quickly as possible in
the new environment.
The bottom line is that if you're developing a tool, be sure to pick a
target audience and target the tool accordingly. If you're interested
in learning or teaching assembly language, determine your category (or
the category of your students) and pick appropriate tools for the job.
In part three, I'll start discussing in detail how to write books and
tutorials for the different categories of programmers.
Cheers,
Randy Hyde