Re: referring to segment offsets in read address mode



Brian Phillips wrote:
> I had tucked this question into a stale thread at alt.lang.asm,

Stale? It was yesterday. You young people are so impatient! :)

I'll repeat my answer, just posted to ala, in case anyone here wants to correct my errors.

....

> I had to move the offset to _bss into DS for the function 3Fh,


This is more of a Masm question - Nasm doesn't use the "offset" keyword - but I can't imagine that moving "offset" anything into a segreg is right... although it may work... it's an "immediate"... I assume when you use ".data" and ".data?", Masm creates "_data" and "_bss" internally???

In an earlier post, you observe "the CPU doesn't really know about the bss section". Quite right. The assembler converts our names - "bss" - into numbers, but these are only relative, in terms like "nineteen segments higher than text". We don't know until load-time what the actual number will be. Dos loads us at a segment of its choosing - presumably the first/lowest one that's free - and the loader does "FIXUP"s to patch "mov ax, @data" to its final number. The CPU *does* use a segment register and an offset to form an address - even in pmode! - but it doesn't know or care whether cs and ds are the same or different, so long as segreg:offset finds the "right" code/data.
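To make the "segreg:offset" point concrete, here is a small sketch for real-mode dos only (in pmode the segment register selects a descriptor instead). It assumes a Nasm "-f bin" .com build; the labels "marker" and "same_msg" are invented for the illustration. It reaches the same byte through two different segment:offset pairs, relying on the real-mode rule that the address is segment * 16 + offset.

```nasm
; Build (assumed command line): nasm -f bin sameaddr.asm -o sameaddr.com
        org 100h

start:
        mov ax, ds
        add ax, 10h                 ; one paragraph is 16 bytes, so +10h is +100h bytes
        mov es, ax                  ; es now points 100h bytes past ds

        mov al, [marker]            ; the byte at ds:marker
        cmp al, [es:marker-100h]    ; the same physical byte, reached through es
        jne done                    ; not expected to be taken

        mov dx, same_msg
        mov ah, 9                   ; dos: print '$'-terminated string at ds:dx
        int 21h
done:
        ret                         ; back to dos via the int 20h at PSP:0

marker   db 'X'
same_msg db 'ds:marker and es:marker-100h are the same byte', 13, 10, '$'
```

The CPU has no idea that ds and es overlap here; it just adds the numbers it was given.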

I usually do .com files for dos. Although Nasm will allow "section/segment .bss", etc., everything is in one 64k segment - .text first, followed by .data, followed by .bss. When dos loads a .com file, cs, ds, es, and ss are all set to the chosen segment. The first 256 bytes are the "PSP" (Program Segment Prefix - a data area that dos creates and uses - we can use parts of it, too). Your fictional "section .text" starts at offset 100h into the segment (no options on entrypoint), followed by "section .data", etc. Initial sp is at 0FFFEh - it started at 0, but dos - the loader - has pushed a zero. So if we "ret", we go to offset 0 in this segment - the beginning of PSP - which is CD 20 (hex) - int 20h - the (an) "exit to dos" interrupt.
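A minimal .com skeleton along those lines might look like the following. It is only a sketch, assuming Nasm's "-f bin" output; the names "msg" and "buf" are made up for the example.

```nasm
; Build (assumed command line): nasm -f bin hello.asm -o hello.com
        org 100h                ; the first 100h bytes of the segment are the PSP

section .text
start:
        mov dx, msg             ; no need to load ds - it already equals cs
        mov ah, 9               ; dos: print '$'-terminated string at ds:dx
        int 21h
        ret                     ; pops the zero dos pushed, lands on int 20h at PSP:0

section .data
msg     db 'Hello from a .com file', 13, 10, '$'

section .bss
buf     resb 64                 ; reserved after .data, takes no space in the file
```

All three "sections" end up in the same 64k segment, so every label is just an offset from the one segment value that dos picked.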


When dos loads an .exe file, es and ds are set to PSP again, but this is not the same as cs, and not the same as "@data" (which is why we need to load ds (and es) at program start, before we can find our data). It is common to combine data and bss - a linker option, I think (different linkers give different executable sizes, sometimes). If they're combined, "first_data_var" and "first_bss_var" would have different offsets. If you've got separate segments, they'd both be at offset 0, but the segreg involved would have to be set - by you. "assume" does *not* do this - it just tells Masm that it's been done. If you've got a variable in one segment, and "assume" another, Masm will correct the offset if it'll reach, I think(?).
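In Nasm terms, the separate-segments case might be sketched as below. This assumes "-f obj" output and a 16-bit linker; the segment names "code", "data" and "bss" and the labels "msg" and "buf" are arbitrary choices for the example, not anything Masm generates. Note that it moves the segment name itself into ds (by way of ax), not an offset.

```nasm
; Build (assumed): nasm -f obj twoseg.asm, then link with a 16-bit linker
segment code

..start:
        mov ax, data            ; patched at load time to the real segment number
        mov ds, ax              ; ds was PSP on entry
        mov dx, msg             ; offset 0 within "data"
        mov ah, 9               ; dos: print '$'-terminated string at ds:dx
        int 21h

        mov ax, bss
        mov ds, ax              ; function 3Fh wants the buffer at ds:dx
        mov dx, buf             ; also offset 0, but within "bss"
        mov cx, 16              ; read at most 16 bytes
        xor bx, bx              ; handle 0 = stdin
        mov ah, 3Fh             ; dos: read from file or device
        int 21h

        mov ax, 4C00h           ; dos: exit with return code 0
        int 21h

segment data
msg     db 'Type something: $'

segment bss
buf     resb 16

segment stack stack
        resb 256                ; the loader sets ss:sp from this
```

Stepping through this in a debugger shows "msg" and "buf" both assembling to offset 0, with only the value in ds telling them apart.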

We haven't discussed ss:sp much... In Nasm, you'd want a "segment stack stack"/ "resb ????". The first "stack" is just a name, and could be anything; the second "stack" is an "attribute", and tells Nasm to tell the linker this is where we want our stack. Masm may be doing this "behind your back", if you aren't doing it explicitly. The loader then loads ss and sp for us. There's an example in the Nasm manual that explicitly loads ss and sp, but this is *not* necessary, and I think it's a Bad Idea - I've seen a nicely aligned stack turn into a misaligned one - let dos do it!
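A sketch of the "let dos do it" approach, again assuming Nasm "-f obj" and a 16-bit linker. The 512-byte size and the names "hello", "msg" and "stacktop" are just illustrative. The explicit ss:sp load is left in as comments only, to show what is being skipped.

```nasm
; Build (assumed): nasm -f obj stk.asm, then link with a 16-bit linker
segment code

..start:
        ; ss:sp are already valid here - the loader set them from the
        ; values the linker put in the .exe header. The explicit version
        ; would be:
        ;       mov ax, stack
        ;       mov ss, ax
        ;       mov sp, stacktop
        mov ax, data
        mov ds, ax

        call hello              ; the call itself uses the stack
        mov ax, 4C00h           ; dos: exit with return code 0
        int 21h

hello:
        mov dx, msg
        mov ah, 9               ; dos: print '$'-terminated string at ds:dx
        int 21h
        ret

segment data
msg     db 'The stack works without loading ss:sp ourselves', 13, 10, '$'

segment stack stack             ; first "stack" is a name, second is the attribute
        resb 512
stacktop:
```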

So... the CPU *does* depend on segments, but counts on the programmer, assembler, linker, and loader to conspire to provide it with the right numbers.

If you "need" multiple segments, it's time to graduate to 32-bit code, IMO, but if you want to experiment, step through it with a debugger and observe what numbers the actors have conspired to produce...

Hope that "clarifies" more than it "confuses"...

Best,
Frank
