Re: which book to start with...?
- From: naunetr <wildgoosechase@xxxxxxxxxxxxxxxxxxxxx>
- Date: Fri, 07 Dec 2007 01:35:59 +0530
Frank Kotler wrote:
naunetr wrote:
...will Herbert's elf header still work when linking several object files to form the executable?
No. We could, using Nasm's binary output format, generate a "linkable format" header suitable for static linking, I suppose... but Nasm already has that capability. We *can* link to a dynamic library (.so - "shared object" files). Stephen Pelc has expanded on Brian Raiter's "Teensy" work, with an example of calling printf without a linker:
http://www.mpeforth.com/arena/cwgtLinux.txt
Put that on the back burner with the "Teensy" tutorial. This isn't strictly about "Assembly Language", but how to interface with the OS... which we *do* have to do, but we don't really need to know the details to do some "Assembly Language".
okay thanks. luckily linux API is simple to use and resembles the c standard lib routines somewhat. the man page descriptions are written for c programming. but its easy to translate to assembler so far. but what about syscalls that take more than 5 parameters? are the others pushed onto stack like in c (but not in reverse order)?
but it works great. amazing only 248 bytes. the c prog i wrote to duplicate this was ~4kb!
Yup. Look how much more "productive" C was! :)
when syscall macro is used instead of fread size of c executable is down to 2928 bytes, 4 times bigger than asm version. :)
Yeah... You can probably improve on that. Long ago, someone - Jerry Coffin, I think - showed how to make a "small" hello world in C, by writing a custom "crt.o" instead of the default "startup code" - *just* "call main", as I recall - the default code does quite a bit more. Still smaller in asm. (some question whether this is a "big deal" or not, but the opposite "bloat doesn't matter" gives... well, what we see!)
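For comparison, here is a minimal C sketch of the "syscall instead of stdio" idea mentioned above. It assumes Linux with glibc, where syscall() and the SYS_read/SYS_write numbers are available; the helper names raw_read_byte and raw_write are made up for illustration and are not from the original programs.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Read one byte from fd with the raw read system call, bypassing stdio.
   Returns 1 on success, 0 at end of file, -1 on error (errno is set). */
long raw_read_byte(int fd, unsigned char *out)
{
    return syscall(SYS_read, fd, out, 1);
}

/* Write len bytes from buf to fd with the raw write system call.
   Returns the number of bytes written, or -1 on error. */
long raw_write(int fd, const void *buf, size_t len)
{
    return syscall(SYS_write, fd, buf, len);
}
```

Nothing from stdio is referenced here, so the buffering code behind fread and printf does not need to be pulled in; how much that saves depends on the toolchain and link options.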
yes its amazing. adobe reader takes up ~150Mb to display a 2Mb pdf file. even its physical memory use becomes ~50Mb. i guess this is OO mentality taken to the extreme... i have 1Gb RAM but the system starts swapping after i have opened about 5 apps like adobe reader. other heavy weights are firefox, openoffice, limewire and all java apps.
...And that's just the ELF header! Herbert feels that every byte that appears in the executable ought to be accounted for in the source code.
i guess thats possible only in machine code?
Well... it's "assembly language". Not "executable code", but "just bytes". Results in "just bytes", that is. Some of the values are the result of calculating expressions at assemble-time, not at run-time. so it isn't really "machine code" in that sense.
The CPU *would* execute it, if we jumped into it. A dos .com file will actually execute an ELF header without any harm - once we get to a "don't care" part of the header, we can insert a "jmp my_16bit_code", and create an executable that will run on dos *and* Linux. A "trick" - not useful for anything I'm aware of.
neat! never knew dos could do that. so executing data in elf header doesnt cause nasty things in dos? i'll try this out in DosEmu...
In general, I agree, but I'm willing to let ld add the "OS cruft", so long as it doesn't touch my actual executable code. I like Herbert's method - it's smaller than what I can get out of ld, and I like "small".
still when c library is called instead of read syscall the size becomes many times more.
For an interesting experiment, try a hello world with printf and linked statically. Around 400k, IIRC! :)
yes. and it used to be much worse in red hat 7 which i had. static linking would produce a 2 Mb hello world app!
Any use of the shared libraries involves adding some "dynamic linker" code to your executable. It's mostly a "one time" cost, I think, so the difference isn't going to be as many "times" larger in a more extensive program (still...!)
and my java app for the same thing is only 447 bytes but at runtime it takes 200Mb of memory!!
That's impressive - on both ends! I don't know anything about Java. Tried to help my daughter with homework when she was taking a course in Java... Learned just enough to determine that it's not "my cup of tea". Interesting concept, though.
i too dont like it much. but unfortunately i have to study for a course. my jdk6 install takes 582Mb on disk and each instance of the jvm takes ~200Mb of virtual memory. a hello world app with an infinite loop indicates ~10Mb of physical memory use in top. and if you run another hello world app, another ~200Mb of v.mem and ~10Mb of p.mem are allocated...
but also java has convenient routines for all often used tasks. asm is the other end... you have to write them all yourself. but its fun...
...(Nasm users who are unaware of it will probably be appalled to find that Nasm puts its name in a .comment section...).
i saw that when i looked at the executable in vim's hex viewer. i think the 0.98.39 version didn't do this?
I think it's done it from "day one". Not really sure. Possibly already been removed from the binary you were looking at? (or maybe made with "-f bin"?).
okay it was a guess. but maybe i didn't look at the object file in a hex viewer. anyway now my 0.98.39 install was overwritten by 2.0 compilation. i thought it would go in /usr/local and said $sudo make install_everything but it installed in /usr/bin. also it wrote the docs to /usr/doc/nasm but 0.98.39's docs are still left in /usr/share/doc/nasm. but everything works...
is there any chance that strip can damage executable or object files or is it always safe to use even with s option?
I don't know. I assume strip has been around, and been "massaged" enough to be "safe". I've never had a problem with it. Other tools (objdump?) seem to do less well with stripped executables. Some of the "bloat" we're eliminating is "useful", even though not necessary for the thing to execute.
the reason i asked was because of the --strip-unneeded option which deletes unneeded symbols. so i thought --strip-all would delete some "needed" symbols also and possibly cause error... but like you said i havent noticed any errors so far and i always use --strip-all.
also noticed that even strip -s doesnt remove the nasm comment from readc. so nasm creates a separate .comment section? what if users code already has a .comment section? nasm adds to it? ;)
I imagine so. (tries it...) Nasm complains about "redefined section name". I don't know how to add to the .comment section - if we can. Adding an arbitrarily named "section foo" with "my comment" in it happens to be contiguous with "The Netwide Assembler..."
can we define the standard sections (.text, .data and .bss) several times? will they be merged when object file is created? also is there any chance that .data can be put in read-only memory? because i use it to define buffers also... or should i be safe and put them in .bss as nasm advises?
...test eax, eax ; this just sets the flags
okay i have to look this instruction up. i only use cmp for now.
"cmp eax, 0" would do the same.
nasm manual says test does a bitwise and of its operands and affects the "flags". so this would do an and with identical values.
Right. "cmp" is an "imaginary sub" - flags are set as if a subtraction were done, but the result isn't stored in the "destination" register. "test" is an "imaginary and" (bitwise) - the result isn't stored, but the flags are set.
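The "imaginary sub" and "imaginary and" can be modelled in a few lines of C. This is only a sketch: it tracks just the zero and sign flags (the real instructions also affect carry, overflow, parity and so on), and the names flags_of, after_test and after_cmp are invented for illustration.

```c
#include <stdint.h>

/* Only the two flags discussed here; the CPU sets several more. */
struct flags {
    int zf;   /* zero flag: result was 0 */
    int sf;   /* sign flag: most significant bit of the result */
};

static struct flags flags_of(uint32_t result)
{
    struct flags f;
    f.zf = (result == 0);
    f.sf = (int)(result >> 31);   /* bit 31 of a 32-bit result */
    return f;
}

/* "test a, b": flags from a AND b, result thrown away. */
struct flags after_test(uint32_t a, uint32_t b)
{
    return flags_of(a & b);
}

/* "cmp a, b": flags from a - b, result thrown away. */
struct flags after_cmp(uint32_t a, uint32_t b)
{
    return flags_of(a - b);
}
```

Since x AND x is x, and x - 0 is also x, after_test(x, x) and after_cmp(x, 0) give the same zero and sign flags for every x, which is why the two instructions are interchangeable in this spot.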
js getc_error ; if negative, something bad happened
nasm manual says this jumps to operand when sign flag is set. but i dont understand how the previous test will set the sign flag.
The sign flag is set if the most significant bit in the result is set. If the value is being interpreted as "signed", this would mean it's negative. (it *doesn't* tell us if the value is signed or unsigned, and is meaningless if it's unsigned) "How does the CPU know if a number is signed or unsigned?" It doesn't - *we* have to provide appropriate instructions, when it matters. For example:
so if we have compared two signed values with cmp we should probably use jumps like je, jg, jl, js etc. and for unsigned compares jz, ja, jb etc. am i on the right track? but if we are sure that the result of a cmp between two signed values will give only a positive result then we can use ja, jb and similar?
mov al, 0FFh ; could mean 255 or -1
mov bl, 1
cmp al, bl
ja someplace ; taken: 255 is greater than 1 (unsigned)
jg someplace ; not taken: -1 is not greater than 1 (signed)
so both these jumps are "correct" depending on what type of comparison you want. is that right?
never mind, i guess PGU explains this sort of stuff...
Yes.
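The same byte-level example can be mirrored in C, where the choice between "ja" and "jg" is made by the type of the operands rather than by the jump instruction. A small sketch, with made-up function names; the cast to int8_t assumes the usual two's complement conversion that gcc and clang perform:

```c
#include <stdint.h>

/* Unsigned comparison of two bytes: what "ja" would test after cmp. */
int above_unsigned(uint8_t a, uint8_t b)
{
    return a > b;
}

/* Signed comparison of the same bit patterns: what "jg" would test.
   Assumption: the cast maps 0FFh to -1 (two's complement). */
int greater_signed(uint8_t a, uint8_t b)
{
    return (int8_t)a > (int8_t)b;
}
```

With a = 0xFF and b = 1, above_unsigned returns 1 (255 is above 1) and greater_signed returns 0 (-1 is not greater than 1), matching the two comments in the asm fragment.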
...nothing_read:
or eax, byte -1 ; same as "mov eax, -1", but shorter
but my c book says that end of file is normal
Yes. Or we'd need bigger hard drives! :)
so shouldnt we return zero instead of -1?
Good point. We could, here, since we aren't likely to encounter a 0 byte in a text file. In a more "generic" filter, which might be suitable to process a binary file, we might get 0 as a valid byte, and we couldn't tell from the 0 whether it was a valid byte or EOF. The "movzx" clears the upper part of eax... so anything 0..0FFh is a valid byte, *all* of eax being set to 1s is EOF...
even for binary files i think read will return 0 when the end of file is reached.
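The two points fit together: read itself returns 0 (a count of bytes) at end of file, while a getc-style routine returns the byte value, so it needs an out-of-range value for EOF. A rough C sketch of that convention follows. The name getc_fd is hypothetical, and returning -2 on error is an assumption made here for illustration; the asm version jumps to an error exit instead.

```c
#include <unistd.h>

/* Return the next byte from fd as 0..255, -1 at end of file,
   or -2 on a read error (the -2 is this sketch's own choice). */
int getc_fd(int fd)
{
    unsigned char byte;
    ssize_t n = read(fd, &byte, 1);

    if (n == 1)
        return byte;   /* zero-extended, like movzx: always 0..255 */
    if (n == 0)
        return -1;     /* read returned 0 bytes: end of file */
    return -2;         /* read returned a negative value: error */
}
```

A 0 byte in the input comes back as 0 and stays distinct from the -1 used for end of file, which is the same reason C's getc returns an int rather than a char.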
also shouldn't popad be just before the ret under nothing_read?
It needs to be "before", but not necessarily "just" before. You'll notice that the "popad" is done before *either* of the returns, but isn't done for the "error exit" (since we don't care). Might be clearer to have a separate "popad" just before each "ret", but it's "right" the way it is.
...okay i think i can generally understand the other portions. in c stdin, stdout and stderr are preopened when app starts. is this the same even when only using kernel?
Yeah.
when writing readc i had thought that i should do an open with 0
I'm not sure how you did that.
before reading but i tried it without and it works, so i guess stdin, stdout and stderr are opened already by kernel when app starts?
Yeah. Seems to be...
basically i thought preopening stdin, stdout and stderr is only c behaviour and since we are not using the c library at all, we should do an open for them. in windows we have to do GetStdHandle before we can write to console. but stdin, stdout and stderr have no pathnames associated with them so open is not possible.
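One way to check this from a program is to ask the kernel whether descriptors 0, 1 and 2 are already valid, without calling open at all. A small sketch in C; fd_is_open and count_open_std_fds are made-up helper names, and fcntl with F_GETFD is used only as a cheap "is this descriptor valid?" probe:

```c
#include <fcntl.h>

/* Return 1 if fd refers to an open descriptor, 0 otherwise.
   fcntl(fd, F_GETFD) fails with -1 when fd is not open. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1;
}

/* Count how many of descriptors 0, 1 and 2 are open right now. */
int count_open_std_fds(void)
{
    int count = 0;
    for (int fd = 0; fd <= 2; fd++)
        count += fd_is_open(fd);
    return count;
}
```

When started normally from a shell, a process inherits descriptors 0, 1 and 2 from its parent, so the count would normally be 3 whether or not the C library is linked in. The parent can close or redirect them before starting the program, though, so it is not guaranteed.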
in nasm html manual the x86 instruction reference is not linked in contents page. i had to load nasmdocb.html directly.
I don't know what version you've got - in latest versions, the instruction set reference has been removed entirely (no volunteers to keep it current). Perhaps you've got a new manual with "nasmdocb.html" left over from an earlier install? Not a bad idea - I find Nasm's version of the instruction set reference useful, even though it isn't up-to-date (and has errors).
this is the docs installed by nasm v2 installation from source. it installed in /usr/doc/nasm and the html contents file (nasmdoc0.html) doesnt have a link to the instruction reference but i have to manually load the nasmdocb.html file in browser.
There are a number of instruction set references on the web (besides Intel/AMD manuals). This is just the first one I came across...
http://home.comcast.net/~fbui/intel.html
So...I think it's "too bad" that the section was removed from the Nasm manual, but it isn't a total loss...
okay i'll use that then. i used nasm's one because intel's manuals are too big and detailed for me yet :) also i hate pdf files...
thanks a lot.