Re: Feedback please (Linux, NASM code)
- From: "Shay" <spamtrap@xxxxxxxxxx>
- Date: Wed, 22 Nov 2006 15:51:04 -0700
----- Original Message ----- From: "Frank Kotler" <spamtrap@xxxxxxxxxx>
Newsgroups: comp.lang.asm.x86
Sent: Wednesday, November 22, 2006 1:03 PM
Subject: Re: Feedback please (Linux, NASM code)
Shay wrote:I was wondering if anyone would look at the code below and give me some feedback on it?
Pretty good job. Couple minor nits, and one fairly major glitch, that I can see...
section .data
msg: db 'Hello, world',10,0
msglen: equ $-msg
You don't really need the terminating zero, the way you're using this. You might want it there anyway, "for compatibility". If so, you probably don't want to include it in the length. Does no harm, but it's not "right".
So basically, this is something forced upon me by C? I've never really looked at how glibc implements printf() to see if they ever output the \0 or not. I assume the terminal doesn't care if it's there or not.
%define stdout 1
%define SYSCODE_exit 1
%define SYSCODE_write 4
%define EXIT_SUCCESS 0
%define EXIT_FAILURE 1
; works in this code... I wouldn't trust it elsewhere
For a more "sophisticated" sys_??? (and other) macros, have a look at the "asmutils" macros from http://www.linuxassembly.org - aka http://asm.sf.net Personally, I find Nasm's macro syntax highly unreadable ("uglier'n a bulldog's hind end"), but you can do some amazing things with it.
I looked at libASM and there are numerous macros in there. Many for sys_ calls, but I didn't really grasp how they worked so stayed away for now. I will have to look at the link you gave me (or are they the same library?).
%macro SYS_write 1-3
mov ebx, %1
%if %0 > 2
mov ecx, %2
mov edx, %3
%endif
mov eax, SYSCODE_write
int 80h
%endmacro
section .text
global _start
_start:
SYS_write stdout, msg, msglen
cmp eax, 0
jge .GOOD
; This indicates we received an error (negative number)
mov ebx, EXIT_FAILURE
jmp .RET
.GOOD:
I suppose you've tested this with other numbers besides the string length... checking the entire 4G range is a bit tedious...
I tested it with maybe 2 dozen numbers of various sizes and signedness. I was going to write a small loop to test it from -X to X simply by incrementing [eax] and them jmp back to .GOOD label.
call itoa
mov ecx, eax
mov edx, ebx
SYS_write stdout
mov ebx, EXIT_SUCCESS
.RET:
mov eax, SYSCODE_exit
int 80h
;;
;; This function squashes [ebx] and uses it
;; to return the length of the stringified
;; number
This is contrary to the "Intel ABI". This is a somewhat misleading name, IMHO, since the hardware doesn't care, but it's a common convention that you might want to follow. (functions must return ebx, esi, edi, ebp - of course esp - as they found them)
Since you're going to want the length in edx, maybe return it there? (not strictly "Intel ABI", but closer)
Ah yeah.... I remember reading that now. Easy enough change. Of course, I thought I remember reading that C programs want ebx, ecx, edx, esi, edi, ebp not messed with; something about using the first three as 'register' variables. Maybe I just can't return 2 items outside of an ASM -> ASM call (i.e. C -> ASM)... might need to just accept that I have to call strlen() afterward.
;;
itoa:
; Set up stack and local variables
push ebp
mov ebp, esp
sub esp, 48
Good for a "local"/"automatic" variable. Not so great for something you're going to "return" as if it had been "malloc"ed. This memory can be easily overwritten once the function has returned. Since you print it immediately without overwriting it, it'll work okay. In real mode, interrupts use "our" stack - in protected mode, interrupts get their own stack, so it's less of an issue, but it's still not "right".
I think a regular "itoa" takes the address of a buffer to put the result as a parameter. (why the hell can't I find "itoa" in the man pages???) You might want to consider subtracting something from esp in the calling function, and passing that address...
Hmm... so are you saying have the calling function expand the stack to hold 44 bytes as if it passed me a char*? If this is correct, I assume the stack would look something like:
...whatever ...
------------------
44 bytes
------------------
Caller return addr
------------------
as it comes into itoa()? and then I refer to it like [ebp+48] to get the beginning of the 44-byte sequence? i.e.
push ebp
mov ebp, esp
; push other registers here
lea esi, [ebp+48]
..... ret
and then the calling function does:
add esp, 44
Hope I understand it correctly.
push edx
push esi
; Save value passed in for later use
mov dword [ebp - 44], eax
; Get address to end of return ptr
lea esi, [ebp - 43]
Here's the major problem! You start near the bottom of the area you've allocated, and then work down. This will overwrite your local variable and your saved caller's edx and esi (takes a big number to hit esi). I'm not sure why the part that prints '-' or not works, since it's first to be overwritten (but it does???). Since your code doesn't depend on edx or esi, it isn't obvious that this is happening, but it's definitely not right. You want to start nearer the top of your reserved stack area. (if you use this method at all)
Isn't this how it needs to be done since I have no idea what the length of the number coming in is? i.e.:
-100
0
1
1000000
189
when I start calculating the text version of the number, I am always getting the rightmost digit, which needs to go at the end of my reserved storage space. Maybe I am misunderstanding how it all works?
My equivalent C code looks like:
char* itoa(long n){
char str[44];
str[43] = 0;
char* s = str + sizeof(str) - 1; // move to end of str[] -- is this not the equiv of --> lea esi, [ebp - 43] ?
// not sure why I just don't do: char *s = str+43?
unsigned long m;
if (n == LONG_MIN){
m = LONG_MAX + 1UL;
}
else if (n < 0){
m = -n;
}
else {
m = n;
}
do {
*--s = m%10 + '0';
}
while ((m /= 10) > 0);
if (n < 0){
*--s = '-';
}
// code ommitted for brevity
}
mov [esi], byte 0
Again, you don't really need a zero-terminated string...
; Do we have a positive number?
Always doing "signed", eh? Okay. You might want to implement this by doing an unsigned routine which could be called directly for unsigned, and the "signed" routine could just check for negative numbers, "neg eax" and print the '-' if required, then call the unsigned routine to do the rest of it.
cmp eax, 0
jge .ASSIGN
; If not, make it so
neg eax
.ASSIGN:
; Set up divisor
mov ecx, 10
; Counter for return length. This includes
; the trailing '\0'
Yes. Should it?
mov ebx, 1
.DO.WHILE:
mov edx, 0
; [eax] is always being changed to what we
; need so no need to reassign it here
div ecx
; Store remainder in esi
add edx, 48
; Add next number to the sequence
dec esi
mov [esi], dl
; Increase length counter
inc ebx
; Do we need to loop?
cmp eax, 10
jge .DO.WHILE
The "usual" way I do it is to keep looping until eax==0, handling the last digit as a "remainder" as usual. This way may be faster - "div" is quite slow. For discussion of the *fast* way to do "itoa", see the "time this code, please" thread in this NG.
Good point since if eax < 10 at this point it can only be 0.
; Account for a zero being passed in so
; we don't end up with "00"
cmp eax, 0
je .CHECK.NEG
; Store final digit
add eax, 48
dec esi
mov [esi], al
inc ebx
.CHECK.NEG:
cmp dword [ebp - 44], 0
jge .RET
; Add the sign for negative numbers
dec esi
mov byte [esi], '-'
inc ebx
.RET:
; Return value
mov eax, esi
As noted above, there's some question if this is really the way you want to do it.
; Restore stack and kill local variables
pop esi
pop edx
mov esp, ebp
pop ebp
ret
Well, it works. Outside of starting "low" on the stack and working down, everything else is "programmer's choice", but that's a pretty bad bug. Maybe sharper eyes can pick out things I've missed... If it wuz easy, everybody'd be doin' it! :)
Best,
Frank
Thanks a lot Frank. The comments helped out a lot as usual. Still a few areas I am "buggy" on that I hope to clear up soon :)
Shay
.
- Follow-Ups:
- Re: Feedback please (Linux, NASM code)
- From: Frank Kotler
- Re: Feedback please (Linux, NASM code)
- References:
- Feedback please (Linux, NASM code)
- From: Shay
- Re: Feedback please (Linux, NASM code)
- From: Frank Kotler
- Feedback please (Linux, NASM code)
- Prev by Date: Re: Feedback please (Linux, NASM code)
- Next by Date: Re: [Clax86list] Feedback please (Linux, NASM code)
- Previous by thread: Re: Feedback please (Linux, NASM code)
- Next by thread: Re: Feedback please (Linux, NASM code)
- Index(es):
Relevant Pages
|
|