Re: F<utf8.pm> is evil (was: XML::LibXML UTF-8 toString() -vs- nodeValue())
- From: Ben Morrow <ben@xxxxxxxxxxxx>
- Date: Fri, 24 Apr 2009 03:06:30 +0100
Quoth Eric Pozharski <whynot@xxxxxxxxxxxxxx>:
On 2009-04-22, Peter J. Holzer <hjp-usenet2@xxxxxx> wrote:
*and* whatever else may be affected. You have to remember one thing
only: Your source code consists of Unicode characters encoded in UTF-8
(or UTF-EBCDIC). Period. Nothing else. Clean and simple.
My point wasn't about what to remember; it's about "doing one thing". I think
that neither F<utf8.pm> nor F<encoding.pm> does one thing.
Peter has just pointed out that utf8.pm does exactly one thing: it
pushes a :utf8 (*not* :encoding(utf8): this is important, as it means
your source mustn't contain invalid sequences) layer on the DATA
filehandle at the point it is called. Everything else that happens is
simply a natural consequence of that.
You are contradicting yourself. First you say that English is the only
language that fits nicely into US-ASCII, then you say that US-ASCII is
called US-ASCII by accident. It isn't. US-ASCII was developed by an
American institute to write English texts. It is no accident at all that
it only contains characters which are frequently used in English
(technical) texts. And it is no accident that it is called ASCII -
"American Standard Code for Information Interchange". The US- in front
is somewhat redundant, but there were a lot of variants of ASCII (e.g., the
ISO-646 encodings), so that serves as a reminder that this is indeed the
original American version of the American code.
(maybe I wasn't verbose enough this time) English fits in a 7-bit
encoding, whatever encoding it would have been. It could be any other
encoding (I did some reading about ASCII history (yes, I know wikipedia
is a vague source)). It could not be any other language.
Admittedly I don't speak it, but I'm fairly sure modern Greek would fit
into 7 bits (if one didn't need to also encode Roman characters).
If you can write your programs in English, please do. Especially if you
That "if" (the latter one) is somewhat offending.
plan to make it open source. Almost every programmer in the world has at
That "open" is somewhat offending.
How so? Those planning to release as open source need to be more
careful to make their code comprehensible to people they've never met.
Someone writing internal proprietary code for a company where $Language
is spoken can reasonably assume any maintenance programmer will speak
$Language; this doesn't apply to open-source code.
least a basic grasp of English. But if for some reason you have to write
"Quotation needed (tm)". Or define "programmer".
Name two languages whose primary documentation isn't in English. (Two
because I can name one: Ruby. Even so, I would wager most Ruby code is
written in English.)
Ben