Re: Unicode Support
- From: randyhyde@xxxxxxxxxxxxx
- Date: 20 Apr 2005 12:23:52 -0700
Beth wrote:
>
> In fact, my little "test" which demonstrates that NASM _already_ deals with
> UTF-8 comments and strings, proves the point a different way...some tools
> _already_ are "accidentally" supporting UTF-8 to that "basic level",
> without even knowing it...and the fact that no-one has actually realised
> this (I didn't either until I thought it would be interesting to see what
> NASM would actually do ;), shows how great the "demand" is...if people were
> regularly wanting UTF-8 source files passed through NASM, then your post
> should have had Frank shouting "NASM already does it!!"...but, I bet, not
> even the NASM developers have actually realised that it does already work
> to this "basic level"...indeed, they could cheekily add it to the "features
> list": "NASM has basic UTF-8 support!"...as if they actually "intended" it
> or something...shhh! Don't tell anyone! ;)...
Not knowing much about UTF-8 (my Unicode knowledge extends as far as
UTF-16 and that's about it), I would say that HLA v2.0 would handle
literal strings of this form, as long as the character code for the
quote can never appear inside an MBCS (multibyte character sequence).
HLA v1.x, however, would not be happy with such characters, as Flex
rejects all character codes in the range $80..$FF out of hand.
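For what it's worth, UTF-8 does meet that condition: every byte of a multibyte UTF-8 sequence has its high bit set (lead and continuation bytes all fall in $80..$FF), so the quote byte ($22) can only ever be a real quote. The sketch below is only an illustration of that point in Python, not how HLA or NASM scans literals; `scan_string_literal` and `has_ascii_byte_in_multibyte` are hypothetical names. Other multibyte encodings would need checking separately.

```python
def scan_string_literal(source: bytes, open_quote: int) -> bytes:
    """Return the bytes between the quote at open_quote and the next quote byte.

    Hypothetical helper: a byte-at-a-time scan that knows nothing about UTF-8.
    """
    QUOTE = 0x22
    if source[open_quote] != QUOTE:
        raise ValueError("open_quote does not point at a quote byte")
    pos = open_quote + 1
    while pos < len(source) and source[pos] != QUOTE:
        pos += 1
    if pos == len(source):
        raise ValueError("unterminated string literal")
    return source[open_quote + 1:pos]


def has_ascii_byte_in_multibyte(text: str) -> bool:
    """True if any multibyte UTF-8 sequence in text contains a byte below 0x80."""
    for ch in text:
        encoded = ch.encode("utf-8")
        if len(encoded) > 1 and any(b < 0x80 for b in encoded):
            return True
    return False


# A source line containing a UTF-8 string literal (made-up example).
line = 'msg db "héllo, wörld €", 0'.encode("utf-8")
literal = scan_string_literal(line, line.index(b'"'))
```

The scanner never decodes anything, yet `literal` comes back as the intact UTF-8 bytes of the string, which is the "accidental" support Beth describes.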
There is, however, another issue that gets you into trouble with MBCS.
When you start adding sophisticated compile-time language facilities,
such as string functions, handling all the different character sets
becomes a nightmare. Then, in HLA's case, there is also the issue that
you need to provide standard library equivalents of the string
functions for UTF-8 strings (you think zero-terminated strings are
painful to compute the length of? Try UTF-8!).
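To make the length remark concrete: with UTF-8 you have to walk the bytes *and* skip continuation bytes (those of the form 10xxxxxx) to count characters. A minimal sketch in Python, assuming the input is valid, zero-terminated UTF-8 and that "length" means code points; `utf8_length` is a hypothetical name, not an HLA Standard Library routine.

```python
def utf8_length(data: bytes) -> int:
    """Count code points in a zero-terminated UTF-8 byte string.

    Assumes the input is valid UTF-8; no validation is attempted.
    """
    count = 0
    for b in data:
        if b == 0:
            # Zero terminator: stop, as a C-style strlen would.
            break
        if (b & 0xC0) != 0x80:
            # Not a continuation byte, so it starts a new code point.
            count += 1
    return count


sample = "héllo €".encode("utf-8") + b"\x00"
chars = utf8_length(sample)   # code points before the terminator
raw = sample.index(0)         # bytes before the terminator
```

Here `chars` and `raw` differ (7 versus 10), so every routine that takes a length or an index has to say which of the two it means.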
An assembler like NASM, that doesn't provide much in the way of
compile-time string handling, might actually get away with "accidental"
UTF-8 support. But when you've got a sophisticated macro system and
compile-time language, supporting MBCS turns out to be a *lot* of work.
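One example of where the work comes from: a compile-time substring function that indexes by byte can cut a character in half. The following is a rough Python sketch only; `substr_bytes` and `substr_chars` are hypothetical stand-ins for such functions, not anything from HLA's compile-time language.

```python
def substr_bytes(data: bytes, start: int, length: int) -> bytes:
    """Substring by byte offset, the way a byte-oriented tool would do it."""
    return data[start:start + length]


def substr_chars(data: bytes, start: int, length: int) -> bytes:
    """Substring by character offset; decodes first, so it needs valid UTF-8."""
    return data.decode("utf-8")[start:start + length].encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """True if data decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


word = "naïve".encode("utf-8")        # 5 characters, 6 bytes
by_bytes = substr_bytes(word, 0, 3)   # ends in the middle of the 2-byte "ï"
by_chars = substr_chars(word, 0, 3)   # the first three characters, intact
```

`by_bytes` is no longer valid UTF-8, while `by_chars` is. Every string function in the macro system needs the second kind of treatment.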
Cheers,
Randy Hyde