Re: Why do we need executables in certain formats ?

From: Scott Moore (spamtrap_at_crayne.org)
Date: 02/17/05


Date: Thu, 17 Feb 2005 18:59:12 +0000 (UTC)

WahJava wrote:

> Hi devs,
> Can anybody explain me why do we need executables in certain formats ?
> Why not plain binary (.com) file can't be used for execution ? How do
> these files are loaded in memory ? How jump locations are resolved @
> runtime ?
>
> I know these questions are answered at university level ? But I'm too
> far from those.
>
> Thanx in advance,
>
> Ashish Shukla alias Wah Java !!
>

There is nothing wrong with plain binaries. The original rationale for
complex binary formats was that programs need to be relocated, and
perhaps linked with libraries. The need to relocate a program is
entirely obsolete. On a modern virtual memory processor, every program
can be loaded at the same standard address, which on most machines is
the next page after the zero page (so that zero address references
will cause an error).

The need to link with libraries is more current, but this need, commonly
referred to as "dynamic linking and loading" has caused huge problems
with cross dependencies in Windows systems. Programs can have their
.DLL files changed out from under them, and fail because the program
has a hidden problem with the new .DLL. This has caused many software
makers to force the global update of all .DLLs required by the program
being installed to the current version, which then can break older
programs that were installed using the old .DLL files. What .DLL
files do is raise the possibility that a program can be run with a
series of .DLL combinations that are exponential, and completely
beyond anyone's ability to test, or plan for.
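
To put a rough number on that growth, here is a minimal sketch. The
dependency count and version counts are made-up illustrations, not
measurements of any real system.

```python
def dll_combinations(versions_per_dll):
    """Count the distinct DLL version sets a program could end up running with.

    versions_per_dll: one entry per DLL the program depends on, giving how
    many versions of that DLL are in circulation (hypothetical numbers).
    """
    total = 1
    for count in versions_per_dll:
        total *= count
    return total

# Hypothetical program: 10 DLL dependencies, 3 versions of each in the field.
print(dll_combinations([3] * 10))
```

Each additional DLL multiplies the total, which is why the test matrix
gets out of hand so quickly.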

The main use of dynamic linking is to "save memory", by allowing
DLLs to be shared between programs, and between different invocations
of the same program. But memory is not only cheap and plentiful,
compared to the days when .DLL was designed, but virtual memory
makes it largely irrelevant how large the memory for a particular
program is, since the working set is organized only around active
pages. Virtual memory can also allow different invocations of the
same program to share their binaries, by mapping the same code
page into multiple processes. Ironically, .DLL techniques work
AGAINST that, as I will explain.

What .DLL *DOES* do is unnecessarily complicate virtual memory loading
and sharing. Dynamic linking and loading requires that the image for
a program be modified. The program is modified to fit at the given
address, and the links to used .DLLs are modified to point to their
actual locations in memory. Because there now exists a "customized"
version of the program, it is no longer a "virtual" image of its
disk store, nor can those working pages be shared with multiple
invocations of the same program. Windows gets around these problems
by not relocating the image at all, and routing all .DLL references
via an "indirect jump" table embedded in the program file. This
way, only the pages containing the jump table need a per-process
copy. The price of the scheme is that each .DLL linkage jump/call
has to go through an indirect address.
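
The page-sharing argument can be sketched with a little arithmetic.
The offsets below are hypothetical, chosen only to illustrate the idea:
patching every call site in place modifies many code pages, while
patching a packed jump table modifies only the page holding the table.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

def pages_touched(offsets, page_size=PAGE_SIZE):
    """Return the set of page numbers containing the given byte offsets."""
    return {offset // page_size for offset in offsets}

# Hypothetical image: call sites referencing DLL functions, scattered in code.
call_site_offsets = [0x0120, 0x1F40, 0x2A08, 0x5C10, 0x9330, 0x9338]

# Hypothetical jump table: one 8-byte slot per imported function, packed together.
jump_table_offsets = [0xA000 + 8 * i for i in range(3)]

patched_in_place = pages_touched(call_site_offsets)
patched_via_table = pages_touched(jump_table_offsets)

print("pages modified when call sites are patched:", sorted(patched_in_place))
print("pages modified when only the table is patched:", sorted(patched_via_table))
```

Every modified page becomes a private, per-process copy. Every
untouched page can remain an exact image of the disk file and be
shared.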

Sadly, Unix implementations, apparently feeling envy of not having
the *WORST* feature of Windows, imported Dynamic Linking into that
system, instead of imitating features of Windows that were actually
useful, so now all modern operating systems perform this hack.
Many serious application programmers have elected to get off this
train by "hard linking" libraries permanently into their programs,
entirely negating the .DLL system, and the need for complex
executables.

The virtual memory versions of the IBM 360 OS, back in the 1960s,
had "hard" binary images, and so were dramatically simple and
efficient implementations. When a program was "loaded", it was
simply marked as a running program. Since each page of the binary
on disk was always an exact image of the in-memory store, the
program itself would request only the exact pages of the program
that were needed; it would literally "fault" itself into an
efficient working set. Because none of the program was allowed to
be modified, all invocations of the program automatically shared
the same program pages. A process (running program) was literally
the working set of its read only binary image pages plus a series
of variable pages that again, the program itself requested.
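
The scheme described above can be modeled as a toy simulation. This is
a sketch under assumed names (SharedImage, Process) and an assumed
page size, not a reconstruction of the actual IBM code: pages of a
read-only image are brought in on first touch and then shared by every
process that runs the same program.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch

class SharedImage:
    """A read-only program image; each page is loaded once, on first touch."""

    def __init__(self, data):
        self.data = data      # stands in for the binary on disk
        self.resident = {}    # page number -> bytes currently in memory

    def page(self, number):
        if number not in self.resident:  # simulated page fault
            start = number * PAGE_SIZE
            self.resident[number] = self.data[start:start + PAGE_SIZE]
        return self.resident[number]

class Process:
    """A running program: just the set of image pages it has touched."""

    def __init__(self, image):
        self.image = image
        self.working_set = set()

    def read(self, address):
        number, offset = divmod(address, PAGE_SIZE)
        self.working_set.add(number)
        return self.image.page(number)[offset]

image = SharedImage(bytes(range(256)) * 64)  # 16 KiB image, 4 pages
a, b = Process(image), Process(image)
a.read(0)
a.read(5000)
b.read(5000)
b.read(9000)

print("resident pages:", sorted(image.resident))
print("working set of a:", sorted(a.working_set))
print("working set of b:", sorted(b.working_set))
```

"Loading" costs nothing up front, the page that both processes touch is
held in memory only once, and the page that nobody touches is never
read at all.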

In short, there is nothing wrong with a flat, binary image. It is
even possible to embed a simple signature in the binary image so
that it can be verified that the image is not a non-executable file,
or from the wrong CPU (the program can jump over the signature).
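
A minimal sketch of such a signature, assuming an x86 target. The
4-byte magic value and the header layout are invented for illustration
and do not correspond to any real format. The image starts with a short
jump (opcode EB followed by an 8-bit displacement) that skips the
signature, so the file is still executable from its first byte.

```python
SIGNATURE = b"FLT1"  # hypothetical 4-byte magic, not any real format

def make_image(code):
    """Prefix raw code with a short jump over the signature (x86: EB rel8)."""
    return bytes([0xEB, len(SIGNATURE)]) + SIGNATURE + code

def is_valid_image(image):
    """Check for the jump-then-signature header this sketch assumes."""
    header = bytes([0xEB, len(SIGNATURE)]) + SIGNATURE
    return image[:len(header)] == header

image = make_image(b"\xC3")  # a single x86 RET stands in for the program
print(is_valid_image(image))
print(is_valid_image(b"plain text file"))
```

The loader only compares a few bytes before mapping the file. A
different CPU would use a different jump encoding and a different
magic value, so a foreign image fails the same check.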

What the proliferation of executable formats has more to do with
is that the kids who graduated from computer science courses in the
1970s, and built the "modern" systems we use today, thought they were
too smart to go back and read 1960s operating systems books,
and proceeded to make all the same mistakes the mainframe designers
made in the 1950s, which are all enshrined in these bloated,
vastly overcomplex, buggy and insecure operating systems we have
to use today. There is nothing natural or necessary about the
ridiculous and overcomplicated executable formats in current use,
and you are right to question them.

-- 
Samiam is Scott A. Moore
Personal web site: http://www.moorecad.com/scott
My electronics engineering consulting site: http://www.moorecad.com
ISO 7185 Standard Pascal web site: http://www.moorecad.com/standardpascal
Classic Basic Games web site: http://www.moorecad.com/classicbasic
The IP Pascal web site, a high performance, highly portable ISO 7185 Pascal
compiler system: http://www.moorecad.com/ippas
Good does not always win. But good is more patient.

