Re: Buffers in Assembly (NASM)



bwaichu@xxxxxxxxx wrote:
> I'm trying to better understand data structures in assembly. I know I
> can create a zero filled buffer in the bss section in NASM with this:
>
> buffer: times 64 db 0

Hi Brian,

Well... In -f obj (OMF) output format, Nasm will tolerate this - since it doesn't know what ".bss" means. In (all?) other formats, this will generate a warning (64 of 'em, actually) "attempt to initialize memory in a nobits section: ignored". In an uninitialized (.bss) section, there's "nothing there", so it would be "conceptually impossible" for Nasm to zero it. "times 64 db 0" would be the right way to do it in an initialized (.data) section; in .bss, reserve the space with "resb 64" instead.

In executable formats *other* than dos .com files, the loader - not Nasm - zeros the .bss section. For ELF, at least, this is specified behavior: the System V ABI says the system initializes .bss with zeros when the program begins to run. I'm not sure every other format promises the same, and it still seems to me like an "error" to ASSume anything about memory that we haven't explicitly initialized.
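For comparison, C leans on the same loader behavior. This is only a sketch: "scratch" and "scratch_is_zero" are names made up for illustration, and the claim that the array lands in .bss is an assumption about a typical ELF toolchain. What the C language itself guarantees is only that objects with static storage duration start out zeroed.

```c
#include <stddef.h>

/* No initializer: typical ELF toolchains place this in .bss (assumption),
   and C guarantees static storage starts out as all zero bytes. */
static unsigned char scratch[64];

/* Hypothetical helper: returns 1 if every byte of scratch is zero, else 0. */
static int scratch_is_zero(void)
{
    for (size_t i = 0; i < sizeof scratch; i++) {
        if (scratch[i] != 0)
            return 0;
    }
    return 1;
}
```

Called before anything has written to "scratch", the helper should report that the buffer is all zeros.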

In a dos .com file, the .bss section is truly uninitialized - whatever "garbage" was there, stays there. We can demonstrate this:

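org 100h

section .bss
    answer resb 1               ; one uninitialized byte

section .text
    cmp byte [answer], 42       ; left over from a previous run?
    jz printyes
    mov al, 'N'
    int 29h                     ; DOS fast console output of AL
    jmp common
printyes:
    mov al, 'Y'
    int 29h
common:
    mov byte [answer], 42       ; leave a mark for the next run
    ret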

First time you run this, it (probably) will print 'N'. Run it again - without doing anything to mess with memory - and it'll print 'Y'. (untested, but that's the way I remember it)

I'm a little unclear what Nasm does with "resb" (and friends) in -f obj output format. It *seems* that if your uninitialized data is collected at the end of your source, it does not become part of the on-disk file. If initialized data follows it, Nasm silently zero-fills it, and it *does* add to the file size. It may depend on the linker, as well.

[if anyone knows/remembers details of the OMF format, the nasm-development team is looking for "verification" of a proposed patch]

> And I know I can create a buffer on the stack like this:
>
> sub esp, 64
> mov ebx, esp ;save the start point of the buffer
>
> But how do I zero out the buffer on the stack? In C, I would just do
> something like:
>
> char buffer[64] = {0};
>
> What's the equivalent in assembly using NASM?

Call memset? That's the way C does it...

I'd probably do, depending on mood...

BUFSIZ equ 64 ; damn well *better* be a multiple of 4!!!

...
    sub esp, BUFSIZ
    mov ebx, esp ; you said to...
    mov ecx, BUFSIZ - 4 ; offset of the last dword
    xor eax, eax
.my_memset:
    mov [esp + ecx], eax
    sub ecx, byte 4
    jns .my_memset ; loop until the offset goes negative
...

I'm not saying that's a "good" way to do it - or even right (untested... I should know better...), but I'd probably do something like that... Possibly something involving "rep stosd"...
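If the loop is easier to follow in C, here is a rough model of it. This is a sketch only: "zero_buffer_words" is a made-up name, the buffer is passed in rather than carved off the stack, and memcpy stands in for the 32-bit store. Like the asm, it assumes the size is a positive multiple of 4.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper modeling the asm loop: zero `size` bytes at `buf`,
   four bytes at a time, from the highest offset down to 0.
   `size` must be a positive multiple of 4, as in the asm version. */
static void zero_buffer_words(unsigned char *buf, int size)
{
    const uint32_t zero = 0;                       /* xor eax, eax */
    int offset = size - 4;                         /* mov ecx, BUFSIZ - 4 */

    do {
        memcpy(buf + offset, &zero, sizeof zero);  /* mov [esp + ecx], eax */
        offset -= 4;                               /* sub ecx, byte 4 */
    } while (offset >= 0);                         /* jns .my_memset */
}
```

The counter has to be signed: the loop ends only when the offset drops below zero, which is what "jns" tests.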

How would gcc handle it? Depends on version - and switches - I'm sure. Here's *one* way gcc does it:

(suppose I'd better post the source... this is just "junk" that I added the "={0}" to...)

#include <stdio.h>
#include <unistd.h>

int main()
{
    char name[80] = {0};
    int name_len;
    printf ("Please tell me your name? ");
    /* fflush(stdout); */
    name_len = read (0, name, 79);
    name[name_len - 1] = 0; /* overwrite the trailing newline; assumes read() returned >= 1 */
    /* gets (name); */ /* this *does* flush stdout! */

    /* we know better than to use gets(), right? */

    printf ("Hello, %s! Welcome to Linux Assembly!\n", name);
    return 0;
}

Here's what "objdump -d" thinks of "main":


080483d0 <main>:
80483d0: 8d 4c 24 04 lea 0x4(%esp),%ecx
80483d4: 83 e4 f0 and $0xfffffff0,%esp
80483d7: ff 71 fc pushl -0x4(%ecx)

Hmmm... Align the stack and "re-push" the return address?

80483da: 55 push %ebp
80483db: 89 e5 mov %esp,%ebp
80483dd: 83 ec 68 sub $0x68,%esp

Note that this is more than the 0x50 bytes in our buffer.

80483e0: 89 5d fc mov %ebx,-0x4(%ebp)

Save caller's reg? Instead of "push ebx"?


80483e3: 8d 5d a8 lea -0x58(%ebp),%ebx

Address of our buffer?

80483e6: 89 4d f8 mov %ecx,-0x8(%ebp)

Lemme see... ecx was initial esp + 4... address of "argc"??? WTF???

80483e9: 89 1c 24 mov %ebx,(%esp)
80483ec: c7 44 24 08 50 00 00 movl $0x50,0x8(%esp)
80483f3: 00
80483f4: c7 44 24 04 00 00 00 movl $0x0,0x4(%esp)
80483fb: 00

Here, I believe we're "pushing without push" the parameters for memset onto the stack.

80483fc: e8 c7 fe ff ff call 80482c8 <memset@plt>

... and that's how we zero the buffer...
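In C terms, the three stores plus the call amount to memset(name, 0, 0x50). A small sketch of the equivalence, with made-up names ("init_matches_memset", "by_init", "by_call"); it only shows that the two spellings leave the same bytes behind, not that every gcc version or switch setting emits a memset call:

```c
#include <string.h>

/* Hypothetical helper: returns 1 if an "= {0}" buffer and a memset() buffer
   hold identical bytes, 0 otherwise. */
static int init_matches_memset(void)
{
    char by_init[80] = {0};              /* what the C source says */
    char by_call[80];

    memset(by_call, 0, sizeof by_call);  /* what this build called: memset(buf, 0, 0x50) */
    return memcmp(by_init, by_call, sizeof by_init) == 0;
}
```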

8048401: c7 04 24 54 85 04 08 movl $0x8048554,(%esp)
8048408: e8 eb fe ff ff call 80482f8 <printf@plt>

"not-push" the (static) address of "please tell me..." and print it

804840d: 89 5c 24 04 mov %ebx,0x4(%esp)
8048411: c7 44 24 08 4f 00 00 movl $0x4f,0x8(%esp)
8048418: 00
8048419: c7 04 24 00 00 00 00 movl $0x0,(%esp)
8048420: e8 c3 fe ff ff call 80482e8 <read@plt>

"not-push" our buffer address (in ebx), length (-1), and "stdin", call read...

8048425: c6 44 05 a7 00 movb $0x0,-0x59(%ebp,%eax,1)

Zero-terminate the string.

804842a: 89 5c 24 04 mov %ebx,0x4(%esp)
804842e: c7 04 24 70 85 04 08 movl $0x8048570,(%esp)
8048435: e8 be fe ff ff call 80482f8 <printf@plt>

"not-push" our buffer, format string, call printf...

804843a: 8b 4d f8 mov -0x8(%ebp),%ecx

Get that "address of argc" back into ecx...

804843d: 31 c0 xor %eax,%eax

"return 0".

804843f: 8b 5d fc mov -0x4(%ebp),%ebx

Restore caller's reg?

8048442: 89 ec mov %ebp,%esp

Get back our "aligned stack"...

8048444: 5d pop %ebp

Restore caller's reg.

8048445: 8d 61 fc lea -0x4(%ecx),%esp

Restore our "original stack".

8048448: c3 ret

Whew!

Why? 'Cause that's the way the compiler writer wanted it, I guess. Maybe I should have looked at something with *just* that buffer in it...

Best,
Frank
