Burying source in executables




Frank Kotler wrote:
>
> Okay, where are these UTF-8 source files? On disk? Embedded
> in the (perhaps not yet existing) executable?

Well, one way to do this (with separate processes or with multiple
threads in the same process) is to use memory-mapped files. The editor
has a read/write version of the file open while the assembler has a
read-only view of the file open.
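
A minimal sketch of that arrangement, in Python for brevity. The file name, its contents, and the function name are made up for illustration, and both views live in one process here just to keep the example short; two separate processes would map the same file in the same way.

```python
import mmap
import os
import tempfile


def demo_shared_views():
    # Hypothetical source file; name and contents are for illustration only.
    path = os.path.join(tempfile.mkdtemp(), "demo.asm")
    with open(path, "wb") as f:
        f.write(b"mov eax, 1\n")

    with open(path, "r+b") as editor_file, open(path, "rb") as asm_file:
        # "Editor" side: read/write view of the file.
        editor = mmap.mmap(editor_file.fileno(), 0, access=mmap.ACCESS_WRITE)
        # "Assembler" side: read-only view of the same file.
        assembler = mmap.mmap(asm_file.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            editor[9:10] = b"2"          # the editor changes the operand
            seen = bytes(assembler[:])   # the assembler reads the new bytes
            try:
                assembler[0:1] = b"x"    # a write through the read-only view
                write_blocked = False
            except TypeError:
                write_blocked = True     # Python rejects the write
        finally:
            editor.close()
            assembler.close()
    return seen, write_blocked
```

The point of the design is that the read-only side cannot damage the source even if it misbehaves; only the editor's view is writable.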

Segments used to be really good for this sort of thing, too. But modern
OSes eliminated the possibility of using segments to share data along
these lines.


> ...
> > the IDE, by the way, is basically nothing more than a
> > "compiler driver" - just like GCC or your own HLA.EXE - but the
> > difference is that it's all "graphical"...
>
> This would put processes in separate address spaces. This
> sounds like you're contemplating a "brick&brock" system.

Best for protection, so if they're doing that it's a good idea. As I
just pointed out, shared memory (between processes/address spaces) is
possible using memory-mapped files.

Memory-mapped files are also great for assemblers (especially
incremental ones) because you get efficient random-access to the file's
data. And, as I pointed out above, one process can be limited to
read-only access while another has read-write access.
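
As a rough illustration of the random-access point, the sketch below pulls one line out of a mapped source file by offset instead of reading the file front to back. The helper names (`make_demo_file`, `read_line_at`) and the sample source are assumptions for the example, not part of any real assembler.

```python
import mmap
import os
import tempfile


def make_demo_file():
    # Hypothetical source file used only for this example.
    path = os.path.join(tempfile.mkdtemp(), "demo.asm")
    with open(path, "wb") as f:
        f.write(b"start:\n    mov eax, 1\nloop_top:\n    dec eax\n    jnz loop_top\n")
    return path


def read_line_at(path, needle):
    """Return the first line containing `needle`, or None if it is absent."""
    with open(path, "rb") as f:
        view = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        try:
            pos = view.find(needle)
            if pos < 0:
                return None
            # rfind returns -1 when there is no earlier newline, so start is 0.
            start = view.rfind(b"\n", 0, pos) + 1
            end = view.find(b"\n", pos)
            if end < 0:
                end = len(view)
            return bytes(view[start:end])
        finally:
            view.close()
```

An incremental assembler could keep offsets like `start` and `end` around and revisit only the regions that changed.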


>
> > that's what the "modular" thing I've
> > mentioned is all about...it _STILL_ is a "tool chain"...individual
> > utilities, called in sequence...the "trick" here is merely that -
> > borrowing a style similar to "browser plug-ins" - they can be
> > loaded up as "modules"
>
> This is not reassuring! I'm typing this in the editor of my
> funky old Netscape version (still!), and I am currently at
> risk of losing my work. Browsing "certain websites"
> apparently loads "certain plug-ins", which apparently have
> "certain memory leaks"... resulting in "certain death",
> after an "uncertain time". Occasionally, it takes the entire
> X server down, more often Netscape just quit without
> warning, "poof". (there *is* an autosave feature, but either
> it doesn't work, or I'm doin' it wrong.)

You *always* run the risk of data loss when using an editor. Keep
in mind that this thread began with my criticism of RosAsm that having
the *assembler* crash could cause a data loss in the editor. For
(logically) decoupled operations such as editing and assembly, this is
inexcusable. Of course, a user won't be happy if the *editor* loses
their data, but you have fewer problems if only one process can bring
the show down rather than several. And if LuxAsm uses exception
handling (as mentioned in previous posts), there are fewer chances for
data loss in the editor (due to programming error, anyway; can't stop a
power failure).


>
> ...
> > And the "integration" has nothing to do with any "monolithic"
> > approach...it's simply a case of "awareness" between the tools...as
> > noted, they are implemented as DLLs...hence, one "module" can load
> > up another "module" and access its "services"...
>
> But this sounds like you're talking about loading modules to
> the same address space.

Not necessarily. It depends on how the processes share things.


>
> ...
> > Although, this is a "bonus", not a necessity...you know, you can
> > use other tools as replacements for the LuxAsm ones...
>
> This sounds like separate address spaces...

Maybe an architectural diagram would be in order now? I must admit that
I'm only following this in a casual manner, so I'm confused too...



>
> ...
> > I suppose one way to try to understand the basics of my design:
> > Imagine HLA but where "hlaparse" / "fasm" / "polink" and so forth
> > were implemented as _DLL files_, rather than as an "EXEs"...and
> > what happens is that HLA.EXE loads up HLAPARSE.DLL...instead of
> > "supplying a command-line", it simply calls some "Assemble()"
> > function in the HLAPARSE.DLL file (which can take some "pointer to
> > source code" parameter and returns a "pointer to an ERROR_INFO
> > structure" return value :)...
>
> "Pointer" sounds like you're talking about "same address
> space".

Not necessarily. It only means that the data is mapped into all the
processes (e.g., via memory-mapped files). And when you do that, you have more
control over things (such as r/w access control). Further, when one
such process dies, the others don't go down with it. This is the
problem with RosAsm.
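
A small sketch of that isolation, assuming the assembler runs as a child process. The "assembler" here is only a stand-in that reads its input and then aborts, and the function names are invented for the example.

```python
import subprocess
import sys


def assemble_in_child(source_text):
    """Run a stand-in 'assembler' in its own process; return its exit status."""
    # Placeholder child program: it reads the source, then simulates a crash.
    crashing_assembler = "import os, sys; sys.stdin.read(); os.abort()"
    result = subprocess.run(
        [sys.executable, "-c", crashing_assembler],
        input=source_text.encode("utf-8"),
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def demo_crash_isolation():
    editor_buffer = "mov eax, 1\n"       # the editor's unsaved text
    status = assemble_in_child(editor_buffer)
    # The child died, but this process and its buffer are untouched.
    return status, editor_buffer
```

The editor sees a nonzero exit status and can report the failure, while its own buffer is still intact.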

>
> I don't doubt that you can make these modules "decoupled",
> yet "mutually aware", but I'm not entirely clear *how*
> you're proposing to do it. Where do we sys_fork, and where
> do we sys_clone? Do modules communicate via pointers or
> pipes or (mmapped?) file handles? Or what? We *do* need to
> get this clarified before we get too far...

Memory-mapped files are the traditional way to do this. Shared memory is
another way (Linux supports System V shared memory, IIRC).
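
For a feel of how fork and shared memory fit together, here is a hedged sketch for Unix-like systems. It uses an anonymous shared mapping as a stand-in (this is not the System V interface), and the function name is made up for the example.

```python
import mmap
import os


def share_with_child(message):
    """Let a forked child write `message` into memory the parent can read."""
    # Anonymous mapping created before fork, so both processes inherit it.
    # On Unix, Python creates this mapping as shared by default.
    shared = mmap.mmap(-1, mmap.PAGESIZE)
    pid = os.fork()
    if pid == 0:
        # Child process: write into the shared region, then exit immediately.
        try:
            shared[:len(message)] = message
        finally:
            os._exit(0)
    # Parent process: wait for the child, then read what it wrote.
    os.waitpid(pid, 0)
    data = bytes(shared[:len(message)])
    shared.close()
    return data
```

The two processes have separate address spaces; only the one mapped region is common to both.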


>
> FWIW, I don't think this is anything to be very concerned
> about. It's an issue we need to think about (fairly
> obviously...). We're thinkin' about it... So...

Other than one process going crazy and overwriting all of memory
(including the source data in memory), the main concern is having one
process die and bring down all the others, as RosAsm does.

>
> But LuxAsm (as I Imagine it) *is* vulnerable in a sense that
> Nasm or notepad (surely you meant emacs!) isn't. You
> mentioned that Nasm *did* lose work... you edited the source
> (from "good" to "bad"), and your "good" .com file vanished.

Completely different issue. You can't prevent the data from being
trashed in all possible circumstances. The important thing is to make
sure that using one process doesn't inadvertently contribute to data
loss in a different process/application/module.


> (if Nasm *does* crash - and it can be done - it doesn't do
> the "cleanup" and a .com file *is* left behind - presumably
> corrupted).

But as NASM should only have read access to the source file, it
shouldn't trash the source file.

> Imagine that Nasm included an editor, and
> imagine that it embedded the source in the executable as its
> "primary" storage.

Good question. And one of the reasons I've decided that embedding
source in EXEs is not a good idea. Plus, what do you do when the file
doesn't assemble? How do you save a source file under such conditions?
Sure, it can be done, but it introduces inconsistencies in the file
model.


> Now you edit your source, adding "in al,
> 345h"... How is Nasm going to make an executable out of
> that? Where's your source now?

My question too.

>
> Well, LuxAsm isn't going to work quite like that, but we
> *could* lose the users source if we didn't do something to
> "take care of it". Imagine an editor that just quit without
> asking "do you want to save your changes?". I don't think
> we're dumb enough to do that...

My guess is that the best solution is to use a "forked" model where
LuxAsm keeps a standard ASCII (okay, UTF-8) source file and works from
that, but merges the source file into the EXE for distribution
purposes.
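
One way the "merge for distribution" step could look is sketched below. The trailer layout (source bytes, a 4-byte little-endian length, then a marker) and the `SRC1` marker are purely hypothetical, and whether a loader tolerates bytes appended to an executable depends on the file format, so treat that as an assumption too.

```python
import struct

MAGIC = b"SRC1"  # hypothetical marker, not part of any real format


def embed_source(exe_bytes, source_text):
    """Append UTF-8 source plus a small trailer to the executable image."""
    payload = source_text.encode("utf-8")
    return exe_bytes + payload + struct.pack("<I", len(payload)) + MAGIC


def extract_source(blob):
    """Return the embedded source, or None if no valid trailer is present."""
    if len(blob) < 8 or blob[-4:] != MAGIC:
        return None
    (length,) = struct.unpack("<I", blob[-8:-4])
    if length > len(blob) - 8:
        return None  # trailer claims more bytes than the file holds
    start = len(blob) - 8 - length
    return blob[start:len(blob) - 8].decode("utf-8")
```

The working copy stays an ordinary UTF-8 file on disk; the merged file is produced only when there is an executable to ship.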

Of course, substitute appropriate Linux terminology for EXE.

Cheers,
Randy Hyde
