Re: Defacto standard string library
- From: richard@xxxxxxxxxxxxxxx (Richard Tobin)
- Date: 3 Jan 2009 18:46:56 GMT
In article <ANN7l.14634$Sp5.13522@xxxxxxxxxxxxxxxxxxxxxxxxx>,
Bartc <bartc@xxxxxxxxxx> wrote:
> strcmp() will only work on UTF-8 if you make use of the result as either 0
> or not 0.
No, it will give Unicode code point ordering.
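
To illustrate the ordering claim, here is a minimal sketch. It relies on strcmp() comparing bytes as unsigned char, which the C standard specifies, and on the inputs being valid UTF-8. The helper names sign_of_strcmp and check_utf8_ordering are made up for this example. Note that code point order is not the same thing as locale-aware collation.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: reduce strcmp's result to -1, 0, or 1. */
int sign_of_strcmp(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}

/* Hypothetical check: byte-wise comparison of valid UTF-8 strings
 * agrees with the order of the code points they encode. */
void check_utf8_ordering(void)
{
    const char *z       = "z";                 /* U+007A, 1 byte  */
    const char *e_acute = "\xC3\xA9";          /* U+00E9, 2 bytes */
    const char *euro    = "\xE2\x82\xAC";      /* U+20AC, 3 bytes */
    const char *grin    = "\xF0\x9F\x98\x80";  /* U+1F600, 4 bytes */

    assert(sign_of_strcmp(z, e_acute) < 0);
    assert(sign_of_strcmp(e_acute, euro) < 0);
    assert(sign_of_strcmp(euro, grin) < 0);
    assert(sign_of_strcmp(grin, z) > 0);
    assert(sign_of_strcmp(euro, euro) == 0);
}
```

Calling check_utf8_ordering() from main() should pass silently if the assumption holds on your implementation.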
> And if you use strcmp() on mixed UTF-8 and ordinary strings, then
> the result might be meaningless (a string containing a single encoded
> Unicode Character could match a string of several ordinary chars).
If you use strcmp() between strings in different encodings, of course
the result is likely to be meaningless. However, UTF-8 has the advantage
that it can be compared against ASCII, since ASCII is a subset of UTF-8.
> What I'm saying is that I think it's a bad idea to use C string functions on
> strings known to contain UTF-8.
It's a bad idea to use functions that interpret the characters in the
string, and functions that expect the characters to be one byte. But
most of the str* functions don't have those problems.
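
As a rough sketch of that distinction: strlen() counts bytes rather than characters, while strchr() with an ASCII character and strstr() with a UTF-8 substring still find the right positions as byte offsets. The helper utf8_count below is hypothetical (not a standard function) and assumes its input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count code points in a valid UTF-8 string
 * by skipping continuation bytes, which have the form 10xxxxxx. */
size_t utf8_count(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

void check_str_functions(void)
{
    /* "naive cafe" with i-diaeresis and e-acute, each 2 bytes in UTF-8 */
    const char *s = "na\xC3\xAFve caf\xC3\xA9";

    assert(strlen(s) == 12);      /* bytes, not characters */
    assert(utf8_count(s) == 10);  /* code points */

    /* Searching for an ASCII byte is safe: it never appears
     * inside a multi-byte UTF-8 sequence. */
    assert(strchr(s, ' ') - s == 6);

    /* Substring search works byte-wise; offsets are byte offsets. */
    assert(strstr(s, "caf\xC3\xA9") - s == 7);
}
```

The results from strchr() and strstr() are byte offsets, so they are fine for slicing the string but should not be read as character indices.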
-- Richard
--
Please remember to mention me / in tapes you leave behind.