Re: MAKEINTRESOURCE in win32asm
From: Beth (BethStone21_at_hotmail.NOSPICEDHAM.com)
Date: 12/24/03
Date: Wed, 24 Dec 2003 08:03:11 -0000
Betov wrote:
> But usually, in real win32 situations these are of no
> practical use, as there is no reason, in Asm Programming
> to have an ID in a Word, instead of a dWord. I suppose,
> this is a left over of the 16 bits days of Win.
Sorry, I don't agree...there is a perfectly good reason for this that
has nothing to do with "16-bit days", which is totally applicable to
any language, including assembly language...
If you actually look it up, then you'll note that these "IDs" are
allowed to be either 16-bit integers _or_ 32-bit ASCIIZ string
addresses...by reserving the first 64KB for system purposes (so that
there couldn't be any valid string addresses with the high word as
zero :), this makes the distinction between the two instant, allowing
the 32-bit value to be either an integer or an address at the same
time...in this case, it's to allow an "alternative" way of specifying
ID values but the same basic mechanism can be useful in all manner of
situations...
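
To make the mechanism concrete, here is a minimal sketch in portable C. The names res_id_t, make_int_id, make_name_id and is_int_id are hypothetical stand-ins invented for this illustration (the real Win32 headers provide the MAKEINTRESOURCE and IS_INTRESOURCE macros for the same job), and the sketch assumes, as described above, that no valid string address has all of its bits above the low word clear.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in type: one pointer-sized slot that holds either
   a 16-bit integer ID or the address of a zero-terminated string. */
typedef uintptr_t res_id_t;

/* Pack a 16-bit integer ID into the slot; everything above the low
   word stays zero. */
static res_id_t make_int_id(unsigned id)
{
    assert(id <= 0xFFFFu);      /* only 16-bit IDs fit the integer range */
    return (res_id_t)id;
}

/* Store a string address in the slot. Assumption: the address is at or
   above 64KB, which holds when the first 64KB is reserved. */
static res_id_t make_name_id(const char *name)
{
    return (res_id_t)name;
}

/* A single shift tells the two apart: nothing set above the low word
   means "integer ID", anything else means "string address". */
static int is_int_id(res_id_t v)
{
    return (v >> 16) == 0;
}
```

The receiver of such a value needs no separate "type" argument: it calls is_int_id() once and then treats the value either as a number or as a pointer.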
For example, when I was talking about my "type time" idea in the other
thread about accepting user input and processing it as it is being
typed (exploiting "parallelism" and slow user typing to get
"(apparently) instantaneous searching"), then I mentioned how to format
a table that simply contains addresses...there would, in fact, be two
"types" of address (a leaf node address points to a procedure, the other
addresses simply point to more tables :)...but I suggested that rather
than waste memory with an extra "type" field to specify which is which
(which would also require a second read from memory to complete), the
address field itself can act as both "type" and "address" at the same
time...simply store all your tables at the lower addresses or at the
higher addresses, then a simple, single comparison of the address to
these memory ranges gives us the "type" automatically (and, note, we
also only need _one_ memory read of the address into a register and
the comparison can be made on that value itself...so, we're using less
memory _and_ we're accessing it less often, which is another useful
thing in a PC system where the CPU speed greatly exceeds the memory
speed...and we're less likely to "blow" caches as much as having the
bigger table with a "type" field, etc....an all-round more
CPU-friendly way :)...
And you've used a variation on this theme _all the time_, even if you
haven't noticed it...you must have had some address pointer, which has
the "special value" of NULL (zero)...this is exactly the same
trick...there is a memory address of zero that is accessible to the
CPU, you know...BUT, it's simply more useful to sacrifice that lone
byte right at the start of the address space in order to have a "this
points nowhere" value...well, that's only a single byte, as opposed to
a whole 64KB "range" but it's basically the same trick...
The general idea is to split up a value into different "ranges" which
mean different things...in the case of a typical "C" pointer, the two
ranges are pretty simple: NULL (zero) is one "range" of a single byte,
which has a "special meaning" in representing "nowhere" as the
destination of a pointer...the other "range" is everything else which
is considered a valid address...
In the case of these "MAKEINTRESOURCE" IDs, the first "range" is 0 <=
x < 64K and represents a simple integer value...the other range (x >=
64K) then represents a valid string address...basically, this is the
same "trick" as using a NULL pointer, except that we've expanded the
"special meaning" range to provide 64K integer values (but,
correspondingly, where you'd only lose one byte with a NULL value,
this "integer" range loses you 64KB of valid addresses for your
pointer...this is considered "okay" in Windows because it deliberately
always reserves that first 64KB of memory for its own purposes,
anyway...so you'd be losing those addresses as valid pointer values in
your program, anyway, because Windows is sitting there and you
shouldn't be writing anything to this area or you'd be overwriting
system data :)...
My suggestion of squeezing all the tables together so that the
"address" field can serve _two_ purposes at the same time (to be a
"pointer to procedure" _AND_ a "pointer to another address table" :)
is the same idea once more...but it has a _flexible_ dividing line
between the two address "ranges", defined by how big your tables
are...the "cut-off point" between "pointer to another table" and
"pointer to a procedure" is simply made by seeing if the address is
within the range of memory where all the tables are scrunched up
together in memory (hence, if any pointer is pointing to this area of
memory, then it _must_ be a "pointer to another table" because that's
all this area of memory stores: a whole bunch of tables :)...if it
does, it's another table...if it doesn't, then it can't be one of the
tables (it's outside the valid "table range of memory" :), so it's
assumed to be a "pointer to a procedure" instead...defining the
"cut-off point" in these flexible terms means that there's no "lost
range" of valid addresses at all...I mean, it's basic logic...if a
pointer points into the range of memory that's got all the tables in
it, then - unless we have a bug - it _must_ be a "pointer to another
table", right? And if it doesn't point there then we can just assume
that it's got to be the other purpose that these "addresses" could be,
namely pointing to a "leaf node procedure"...
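
The table scheme above could be sketched roughly as follows. Everything named here (table_pool, points_to_table, follow, leaf_hello, the table sizes) is hypothetical and chosen only for illustration. The sketch assumes that all tables are allocated from one contiguous array and that a procedure address can be stored in a uintptr_t, which is implementation-defined in C but holds on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOTS_PER_TABLE 4
#define TABLE_COUNT     8

typedef void (*leaf_fn)(void);

/* Every table lives inside this one array, so "is this slot a pointer
   to another table?" becomes a plain address range check. */
static uintptr_t table_pool[TABLE_COUNT][SLOTS_PER_TABLE];

static int points_to_table(uintptr_t slot)
{
    uintptr_t lo = (uintptr_t)&table_pool[0][0];
    uintptr_t hi = lo + sizeof table_pool;      /* one past the end */
    return slot >= lo && slot < hi;
}

/* Example leaf procedure: just counts how often it was reached. */
static int leaf_calls;
static void leaf_hello(void) { leaf_calls++; }

/* Starting at `table`, take slot path[0], then path[1], and so on,
   until the slot read is not a table address; then call it as a leaf. */
static void follow(const uintptr_t *table, const size_t *path)
{
    uintptr_t slot = table[*path++];
    while (points_to_table(slot)) {
        table = (const uintptr_t *)slot;
        slot = table[*path++];
    }
    assert(slot != 0);          /* an empty slot would be a table bug */
    ((leaf_fn)slot)();
}
```

Note that each step costs one memory read and one range comparison on the value just loaded; there is no separate "type" field to fetch.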
Also, these examples merely split the range in two...you can
conceivably split it up into, basically, 2^n ranges (where n is the
number of bits...so, a byte can be split 256 ways, a word 65536 ways,
a dword 2^32 ways...we'll keep the exponential notation on that last
one because 2^32 is a rather big number I can't be bothered to type
out here ;)...
At this "every possible value is its own range" extreme, in fact, what
you've finally discovered is another common data type...the so-called
"enumeration"...this is the exact same "range" idea but "taken to the
extreme" where every single individual value is its own "special
value"...in fact, very often, the numeric value chosen to represent a
particular enumeration member has _NO_ significance whatsoever...so,
for instance, we could have a bunch of "enumeration members" that are,
ooh, "messages"...the value zero can mean "window created", the value
one can mean "window needs drawing", the value two can mean "the
system has changed the default colours", etc....and in this particular
case, the actual order and value of these messages signifies
_nothing_...that is, you could shuffle those messages and their values
around and that would be just as equally good an implementation as any
other...there's no "numeric" meaning to the values chosen for each
message, other than they all need to be different and consistently
defined so that we can look at a value and know what message that's
supposed to be...
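
As a small illustration of the "every value is its own range" case, the three example messages can be written as a C enumeration. The names below are hypothetical and are not the real Windows message constants; the point of the sketch is only that the code switches on identity and never does arithmetic on the values, so renumbering the members would change nothing.

```c
#include <assert.h>

/* Hypothetical message set: the values only need to be distinct and
   consistently defined; their order and magnitude mean nothing. */
enum message {
    MSG_WINDOW_CREATED  = 0,
    MSG_NEEDS_DRAWING   = 1,
    MSG_COLOURS_CHANGED = 2,
    MSG_COUNT
};

/* Dispatch purely on which member it is. */
static const char *describe(enum message m)
{
    assert(m < MSG_COUNT);
    switch (m) {
    case MSG_WINDOW_CREATED:  return "window created";
    case MSG_NEEDS_DRAWING:   return "window needs drawing";
    case MSG_COLOURS_CHANGED: return "system colours changed";
    default:                  return "unknown";
    }
}
```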
So, yes, you _do_ also know this trick in another context besides
"NULL pointers" too...the "window messages" in Windows are basically
one big "enumeration" (or "enum") type...Microsoft only really avoid
using a literal massive "enum" statement because there are gaps in the
range and also because it's more flexible (i.e. they can define one
set of "messages" in one include file and another set in another
include file and so on...if using the strict "enum" statement in C /
C++ itself, then you'd be forced to put all messages into one massive
file...well, that's inconvenient to how Microsoft need this to work -
for example, "winuser.h" defines just the USER messages, "commctrl.h"
defines the extra messages that common controls have, etc. (and
Microsoft would like it to work that way so that programs can "pick
and choose" the include files more selectively, you can include just
"commctrl.h" but completely ignore the "ole.h" stuff as you're not
using OLE components in your program at all ;) - so they simply used
"#define" instead to define each "enumeration member" one by one :)...
None of this so far is related to any programming language
whatsoever...this is a logical programming "trick" that could be used
by any programming language for multiple purposes...unfortunately,
this often immensely useful possibility is missed by some people
because they aren't on the right "wavelength" to pick it up...
Basically, it should always be remembered that a 32-bit value is
simply a string of 32 bits...people often think "it's all numbers!"
but that's actually not at all accurate (just as "it's all binary!"
isn't quite 100% accurate either...binary is a _positional_ notation
system _for numeric values_ but something like the (E)FLAGS register
is a bunch of bitfields...its overall integer interpretation isn't
important, just whether bits are on or off in certain positions :)...
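
A short sketch of the bitfield view, written in C rather than assembly so it can be compiled anywhere. The bit positions are those of the x86 carry, zero, sign and overflow flags; the helper name flags_set and the FLAG_ macro names are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Positions of a few x86 status flags inside (E)FLAGS. */
#define FLAG_CF (1u << 0)   /* carry    */
#define FLAG_ZF (1u << 6)   /* zero     */
#define FLAG_SF (1u << 7)   /* sign     */
#define FLAG_OF (1u << 11)  /* overflow */

/* Nonzero when every flag in `mask` is on. The value of `eflags` as
   an integer never matters, only which bit positions are set. */
static int flags_set(uint32_t eflags, uint32_t mask)
{
    assert(mask != 0);
    return (eflags & mask) == mask;
}
```

For instance, a flags image of 0x246 has the zero flag on and the carry flag off, and that is all a caller cares about; nobody asks whether 0x246 is "bigger" than some other flags value.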
Now, clearly, when you look in a hex editor at a memory dump or you
want to talk to someone else about some values, then we all default to
"integer numbers" because that's the supremely convenient way to
express a certain set of bits on and bits off in a short, concise,
unambiguous way...but the temptation here is to always think of memory
being "full of numbers"...and that's where you might lose the right
"wavelength" to realise that things are far more versatile and
flexible than that, if you choose it to be...we can split up a "range"
into loads of completely different ranges...how about: NULL means
"transparent", 1-16 means one of the "standard" 16 colours, 17-255 are
the "symbolic" colours such as "default scrollbar colour" or "default
window background colour", then every value above 255 that's a
positive signed value (the high bit is off...so, that's a 255-2G range
for these addresses ;) is considered to be an _address_ to a RGB
triplet in memory and then all the "negative" signed values are "error
codes", which are enumerated: -1 means "colour out of range", -2 means
"red not permitted by this API function", -3 means "blue and green
socks don't match", etc., etc....in that single 32-bit value, we can
represent a _massive_ amount more than simply "a big integer
number"...but to see this, you have to get onto the right "wavelength"
to remember that when you see "FF A8 8C 35" or whatever in your hex
memory dump then they are only being _displayed_ as "numbers"...what's
really in memory is 8 * 4GB of _BITS_ that may be "0 or 1", "on or
off", "Adam or Eve", "Blur or Oasis", "Lakers or Bulls" or anything
you choose...for convenience, as "numbers" _ARE_ a very useful "type"
to have, the CPU has instructions which treat these "bit strings" as
being numbers, providing "ADD", "SUB" and other mathematical
operations that treat them as "numbers"...if you like, when you choose
between "ja" or "jg" (unsigned or signed comparison) then you define
your "integer" to be signed or unsigned by the choice of your
instructions...well, similarly, if you choose the "ADD" instruction,
then you're usually also defining the "type" of your operand to be
"numeric integer"...if all you use on a register are the "bitwise"
operators - "AND", "OR", "TEST", "BT", etc. - then this turns the
interpretation of the data into "bitfields"...
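
The multi-range colour value described above might be decoded along these lines. This is only a sketch: colour_kind, classify_colour and error_code are hypothetical names, the range boundaries are the ones from the example, and the code only classifies the 32-bit value without ever dereferencing the "address" case.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    COLOUR_TRANSPARENT,   /* 0 (NULL)                          */
    COLOUR_STANDARD,      /* 1..16                             */
    COLOUR_SYMBOLIC,      /* 17..255                           */
    COLOUR_RGB_ADDRESS,   /* above 255 with the high bit clear */
    COLOUR_ERROR          /* high bit set, i.e. "negative"     */
} colour_kind;

/* Decide which range the 32-bit value falls into. */
static colour_kind classify_colour(uint32_t v)
{
    if (v & 0x80000000u) return COLOUR_ERROR;
    if (v == 0)          return COLOUR_TRANSPARENT;
    if (v <= 16)         return COLOUR_STANDARD;
    if (v <= 255)        return COLOUR_SYMBOLIC;
    return COLOUR_RGB_ADDRESS;
}

/* Read an error value as the negative number it stands for
   (0xFFFFFFFF is -1, 0xFFFFFFFE is -2, and so on). */
static int error_code(uint32_t v)
{
    assert(classify_colour(v) == COLOUR_ERROR);
    return (int)((int64_t)v - INT64_C(4294967296));
}
```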
This is a fundamental principle of "data" and "types": "information" =
"data" + "structure"..."data" has no "type" in and of itself...there's
no "markers" in memory saying "this memory address contains a signed
8-bit integer", "this memory address contains a 24-bit pixel in a
bitmap" or any such thing...all "data" is _typeless_ by
definition...something that a "real" assembly programmer such as yourself
should know like the back of your hand...what gives "data" its "type"
is the "structure" (machine instructions provide this "structure"
implicitly from their semantics and operation ;)...and, strictly, if
we're using all the right terminology, then it's no longer "data",
it's now become _information_...it _tells you something_, it has
_meaning_..."data" in and of itself, though, is meaningless...
Hence, a value and the "ranges" values can take is _completely_ in
your hands as a coder in the way you choose to "interpret" that data
with machine instructions (which turns it into "useful information"
for performing your task :)...
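
The same point can be shown in C: two comparison helpers that receive identical bits and disagree only because they interpret them differently, much as choosing "ja" or "jg" after a CMP does. The function names are invented for this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Unsigned interpretation of the bits: the comparison "ja" expresses. */
static int above_unsigned(uint32_t a, uint32_t b)
{
    return a > b;
}

/* Signed interpretation of the very same bits: the comparison "jg"
   expresses. memcpy reinterprets the bit pattern without converting. */
static int greater_signed(uint32_t a, uint32_t b)
{
    int32_t sa, sb;
    assert(sizeof sa == sizeof a);
    memcpy(&sa, &a, sizeof sa);
    memcpy(&sb, &b, sizeof sb);
    return sa > sb;
}
```

With the bit pattern FF FF FF FF on one side and 1 on the other, the unsigned helper says "above" while the signed helper says "not greater"; the data did not change, only the structure applied to it.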
[ By the way, if you think the above is an argument _against_ "data
typing" in a programming language, then you're not reading
correctly..."data typing" mechanisms in languages - specifying that a
memory address is a "DWORD" or "HICON" - is a _separate_
matter...that's basically just a simple "sanity check" that some tools
offer...they make you specify the data type and then just double-check
that the instructions you've chosen really do fit in with that chosen
data type...that's just a cheap and cheerful means of trying to
prevent some of the simple, common mistakes and bugs that could arise
in a program...it's hardly foolproof (something to remind the HLL
people) and it's hardly "evil" to offer this "sanity check" (something
to remind "real" coders who insist that it shouldn't be available,
just because they hated using it in Pascal :)...I really like HLA's
"half-way house" on this, though, in having "data typing" available
but if you specify a "typeless" type like "byte", "word" and "dword"
then you effectively "switch it off" and it doesn't care what you do
with those values thereafter...and there's something similar with
procedures in MASM / TASM / HLA that you can _ask_ them to double
check "prototypes" but if all that's too annoying and pointless, you
can just push those parameters yourself and make a "CALL" and
completely by-pass it when it's not at all needed ;)... ]
Being a "real" _practical_ coder and "knowing all the theory" is NOT
at all mutually exclusive...BOTH have lessons - very important
lessons - to teach each other...you'd be wise - be you some "academic
theoretician who never gets their hands dirty with actual code" or
some "practical L33T hacker who's convinced that they can learn
everything from actually just 'doing'" - to always _listen_...note
that "listen" does not necessarily mean "do"...if all that "data
typing" is, in fact, complete nonsense for what you want to do then
switch on the "practical hacker" in you to say "oh, go away! That's
just not needed here at all!"...on the other hand, you ain't likely to
ever get to "optimal" with a search routine if you wave off some
"professor" who's talking about it being an "O(n)" problem according to
all the "theory"...because if they are right then there's the
potential to "borrow" some of that theory to exploit some "pattern" in
your data to crank everything "up to the max"...in other words, "Make
good code, not war!!", people ;)
Beth :)