Re: Confusion between UTF-8 and Unicode

From: Edwin Martin (e.j.martin_at_chello.nl)
Date: 03/17/05


Date: Wed, 16 Mar 2005 23:31:47 GMT

Celia wrote:
> I've looked up UTF-8 and Unicode in the Wikipedia, and at Dictionary.com,
> but I'm not grokking it yet.

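Unicode is the character set: it assigns a number (a code point) to
each character. UTF-8 is one encoding of those code points into bytes,
designed so that a plain ASCII file is also a valid UTF-8 file (with
the same contents, of course).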
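
To make the ASCII compatibility concrete, here is a small Python sketch.
The sample strings and variable names are my own, picked only for
illustration. It checks that ASCII text gives the same bytes in both
encodings, and that a character outside ASCII needs several bytes in
UTF-8.

```python
# Illustration only: the sample text below is made up for this example.
ascii_text = "Hello, Celia"

# Encode the same text once as ASCII and once as UTF-8.
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")

# Pure ASCII text produces identical bytes under both encodings.
same_bytes = ascii_bytes == utf8_bytes

# A character outside ASCII is one code point but several UTF-8 bytes.
euro = "\u20ac"                    # EURO SIGN, code point U+20AC
euro_bytes = euro.encode("utf-8")  # b'\xe2\x82\xac'

print(same_bytes)                  # True
print(len(euro), len(euro_bytes))  # 1 3
```

So the euro sign is one Unicode character but three bytes on disk when
stored as UTF-8, while the ASCII text is byte-for-byte unchanged.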

See also:

The Absolute Minimum Every Software Developer Absolutely, Positively
Must Know About Unicode and Character Sets (No Excuses!)

http://www.joelonsoftware.com/articles/Unicode.html

Edwin Martin

-- 
http://www.bitstorm.org/edwin/en/

