Re: how do you start learning assembly language

On Wed, 16 Jan 2008 00:48:02 +0100, robertwessel2@xxxxxxxxx <robertwessel2@xxxxxxxxx> wrote:

On Jan 15, 7:36 am, //\\\\o//\\\\annabee <w...@xxxxxxxx> wrote:
On Mon, 14 Jan 2008 15:56:06 +0100, Herbert Kleebauer wrote:
> RVA is a term from Windows PE files. The CPU knows nothing about a RVA.
> The CPU only knows a virtual, a linear and a physical address.

OK, so Windows does this as a simplification (for itself): it pushes the
work down to the linker and loader, so that when the app runs, it runs in
an easy-to-set-up (for OS coders) virtual environment?

Actually it's pretty universal. Almost all executable formats end up
with one or more "segments" which need to be relocated relative to
each other, and to the load point, at load time, which of course
implies fixing up any references.

Before relocating loaders became common, programs were linked to run
at a particular address, and the loader did little more than load the
image at that address (or refuse if it could not). In many older
systems, before virtual memory, multitasking was handled by
partitioning the one address space. So the OS might be in the first
128KB, and three partitions, of possibly different sizes, might follow
that. Without a relocating loader, you had to link the program
correctly for each partition in which you might want to run it.

Most executable formats still allow you to remove the required
relocation information (/FIXED in Windows, for example - which happens
to be the default for .EXEs).

> By both. If you link object modules, then the linker has to do the
> relocation.

Which is what? Is it close to what I thought? Does it need to
change every "Item" (pointer) in each and every image?
Does this happen by pages, or does it all happen at load time?

Typically the linker resolves many of the pointers at link time, but
many require further relocation at load time. The linker is also
responsible for gathering all the different object modules that go
into the one executable. The linker usually also simplifies the
executable considerably by combining similar types of segments from
the different object modules.
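
A rough sketch of that combining step, in Python for brevity. The module
names, segment kinds and the combine_segments helper are all made up for
illustration; a real linker also handles alignment, symbols and fixups,
which this ignores.

```python
def combine_segments(modules):
    """Concatenate same-kind segments from several object modules.

    modules: list of (module_name, {segment_kind: bytes}) pairs.
    Returns (merged, placement), where placement[(module_name, kind)] is the
    offset of that module's contribution inside the merged segment.
    """
    merged = {}
    placement = {}
    for name, segments in modules:
        for kind, data in segments.items():
            buf = merged.setdefault(kind, bytearray())
            placement[(name, kind)] = len(buf)  # where this piece starts
            buf.extend(data)
    return merged, placement


# Two hypothetical object modules, each with a code and a data segment.
modules = [
    ("a.obj", {"code": b"\x90\x90", "data": b"\x01"}),
    ("b.obj", {"code": b"\xc3", "data": b"\x02\x03"}),
]
merged, placement = combine_segments(modules)
```

The placement table is what lets the linker rewrite references into
b.obj's code or data once it knows where each piece ended up.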

For example, if the CPU in question supports IP-relative addressing, say,
perhaps for branches, the linker will likely resolve the entire branch
at link time when it combines the code segments, but a reference from
one segment to another will likely require a relocation entry in the
executable since the two segments will be loaded at the locations the
loader chooses at its own whim.
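
To make the difference concrete, here is a small Python sketch. It
assumes the displacement is measured from the end of the branch
instruction (as on x86); the offsets, the instruction length and the
function names are invented for the example.

```python
def ip_relative_displacement(branch_offset, insn_len, target_offset):
    """Displacement for a branch whose site and target are in the same segment."""
    return target_offset - (branch_offset + insn_len)


def absolute_reference(segment_base, offset):
    """An absolute pointer into another segment depends on where it was loaded."""
    return segment_base + offset


BRANCH_OFFSET, INSN_LEN, TARGET_OFFSET = 0x100, 5, 0x180

# Computed once, at link time, from offsets inside the combined code segment.
disp = ip_relative_displacement(BRANCH_OFFSET, INSN_LEN, TARGET_OFFSET)


def branch_lands_at(code_base):
    """Where the CPU ends up when the code segment is loaded at code_base."""
    next_ip = code_base + BRANCH_OFFSET + INSN_LEN
    return next_ip + disp
```

Whatever code_base the loader picks, branch_lands_at(code_base) is always
code_base + TARGET_OFFSET, so the branch needs no load-time fixup. The
value from absolute_reference changes with segment_base, which is why
that kind of reference needs a relocation entry.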

Usually the fixups in an executable are rather less numerous and
simpler than the fixups in an object file. Usually all fixups in an
executable are as simple as "adjust word A to the load point of
segment B plus offset C." In many formats, the value stored in the
word needing relocation is the actual offset ("C"), and the relocation
information is simply "add the load point of segment B to the
following words in the image," and the loader just resolves each fixup
with a single add.
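
A minimal sketch of that kind of loader fixup pass, in Python. It assumes
32-bit little-endian words and a flat list of fixup offsets; the image
contents and the apply_fixups name are hypothetical and do not correspond
to any particular executable format.

```python
import struct


def apply_fixups(image, fixup_offsets, segment_base):
    """Add segment_base to each 32-bit little-endian word listed in fixup_offsets."""
    for off in fixup_offsets:
        (stored,) = struct.unpack_from("<I", image, off)  # link-time offset "C"
        struct.pack_into("<I", image, off, (stored + segment_base) & 0xFFFFFFFF)


# Toy image: the words at offsets 0 and 8 hold offsets into segment B,
# the word at offset 4 is plain data and must not be touched.
image = bytearray(struct.pack("<III", 0x10, 0xDEADBEEF, 0x24))
apply_fixups(image, fixup_offsets=[0, 8], segment_base=0x00400000)
relocated = struct.unpack("<III", image)
```

After the pass, the two fixed-up words hold segment_base plus their
original offsets, and the data word is unchanged.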

In object files, you tend to have all sorts of different types of
fixups, combine types, methods of specifying references, and whatnot,
most of which will go away by the time the linker is done. Further,
most object module formats are far more complex than executable file
formats, and often contain much information and structure specific to
the languages that generated them. Even in Windows, where the COFF/PE
format is used for both objects and executables, the executable files
use a much smaller subset of the "features."

Most of the time it's better to consider the executable file format
specification as part of the OS, but *not* the object file format.
The OS has to be able to load the executables, of course, but only the
development tool chain needs to understand object files. Of course,
there is very commonly a "standard" object file format on a given
system, but that's really just a convention which allows you to
combine objects from different sources, and use the standard tools
(like the standard linker) to build them (which is a huge incentive to
stick with the "standard" format). But it's hardly a requirement.
Compilers and whatnot can generate executables directly, or use some
other object file format, and reasonably often do.

Late bindings (calls/references to DLLs) do complicate the picture a
bit, and move what was more traditionally a linker function to the
loader, but even those references tend to stay pretty simple ("paste
the address of DLL D, function E into this word").
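
As a sketch only, the loader side of such a reference might look like the
Python below. The slot list, the loaded_dlls table and the bind_imports
name are assumptions made up for the example (this is not the actual PE
import table layout), and addresses are taken to be 32-bit little-endian.

```python
import struct


def bind_imports(image, import_slots, loaded_dlls):
    """Paste the address of each imported function into its slot in the image.

    import_slots: list of (slot_offset, dll_name, function_name).
    loaded_dlls:  {dll_name: {function_name: address}} for DLLs already loaded.
    Raises KeyError if a DLL or function is missing.
    """
    for slot_offset, dll, func in import_slots:
        address = loaded_dlls[dll][func]
        struct.pack_into("<I", image, slot_offset, address)


# Hypothetical DLL "D.dll" exporting functions "E" and "F".
loaded_dlls = {"D.dll": {"E": 0x10001000, "F": 0x10002000}}
import_slots = [(0, "D.dll", "E"), (4, "D.dll", "F")]
image = bytearray(8)
bind_imports(image, import_slots, loaded_dlls)
bound = struct.unpack("<II", image)
```

Each slot ends up holding the address the call site will jump through, so
the rest of the image does not need to know where the DLL was loaded.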

Thank you for the reply.

I have a much fuller picture of it now, thanks to all of you.
I hope they never remove the "late binding" stuff, as it allows
for graceful backoff, when a DLL is missing.