Re: DOCTYPE
- From: Dikkie Dik <dikkie@xxxxxxxxxx>
- Date: Fri, 29 Feb 2008 11:50:13 +0100
I took a brief tour of the web now and see that some Chinese sites have their own charset (which is included in FF), but Korea and Thailand did not. Also, Denmark did not. They all used UTF-8. Is there any overriding principle that determines whether they use UTF-8 or not? Or is it just case by case?
I think this is just case-by-case. It could be dependent on what character sets are installed for Unicode on most systems (I heard that even Klingon characters have their place), but frankly I don't know.
Also, UTF-8 allows you to mix characters from different scripts, so you can render Korean and Danish in one sentence if you wish. It could be that the Chinese sites were not that internationally oriented.
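To make the mixing point concrete, here is a minimal sketch in Python (chosen purely for illustration). The sample sentence and the variable names are made up for this example. It only shows that one UTF-8 string can hold Korean and Danish together, while a single-script legacy encoding such as Latin-1 cannot represent the Korean part.

```python
# Hypothetical sample sentence: Korean and Danish side by side.
sentence = "안녕하세요 og velkommen til København"

# UTF-8 can encode every character in the sentence and round-trips cleanly.
utf8_bytes = sentence.encode("utf-8")
round_trip = utf8_bytes.decode("utf-8")

# Latin-1 (ISO-8859-1) covers the Danish letters but has no Hangul,
# so encoding the same sentence fails.
try:
    sentence.encode("latin-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False

print(round_trip == sentence, latin1_ok)
```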
I did some international and foreign sites in UTF-8, and learned the hard way. The main problem here is that there is a difference between a text and a string. A string is just a sequence of bytes, whereas a text is something that you can read. The difference, of course, is the encoding it is rendered in, and the problem is that texts are stored as strings. So an encoding is usually passed along with the string, and is more "metadata" than part of the value itself. That metadata is usually sent through separate channels, so it is easily separated from the string and lost. You must often do some work to find out the encoding, and even in modern web applications this information can be missing.
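The string-versus-text distinction can be sketched like this, again in Python and again only as an illustration. The helper `decode_text` and its fallback to Latin-1 are assumptions of mine, not something any particular framework does. The point is that the bytes never change; only the encoding "metadata" decides what text you get, and losing it silently produces garbage rather than an error.

```python
from typing import Optional

# The "string": just bytes, here produced by encoding a Danish word as UTF-8.
raw = "København".encode("utf-8")

# Same bytes, two different "texts", depending on the encoding we assume.
as_utf8 = raw.decode("utf-8")      # correct metadata: readable text
as_latin1 = raw.decode("latin-1")  # wrong metadata: garbled, but no error raised


def decode_text(data: bytes, declared: Optional[str]) -> str:
    """Hypothetical helper: decode bytes using the declared encoding.

    If the metadata was lost (declared is None), fall back to Latin-1.
    That fallback is an arbitrary assumption; it never fails, which is
    exactly why a lost encoding tends to go unnoticed.
    """
    return data.decode(declared if declared is not None else "latin-1")


print(as_utf8)
print(as_latin1)
print(decode_text(raw, None) == as_latin1)
```

Note that the wrongly decoded version is longer than the correct one, because each multi-byte UTF-8 sequence is read as several separate Latin-1 characters.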
By the way, if you want a nice introduction on the matter, here's a good start:
http://www.joelonsoftware.com/articles/Unicode.html
.... and beware of onions ;) There is one point on which I strongly disagree with the above site: the remark that character encodings are easy. They are not, especially once you take some quirky behaviours of Windows and MySQL into account.
Best regards.