Re: F<utf8.pm> is evil (was: XML::LibXML UTF-8 toString() -vs- nodeValue())




Quoth Eric Pozharski <whynot@xxxxxxxxxxxxxx>:
On 2009-04-22, Peter J. Holzer <hjp-usenet2@xxxxxx> wrote:

*and* whatever else may be affected. You have to remember one thing
only: Your source code consists of Unicode characters encoded in UTF-8
(or UTF-EBCDIC). Period. Nothing else. Clean and simple.

My point wasn't about what to remember. It's about "doing one thing". I
think that neither F<utf8.pm> nor F<encoding.pm> does one thing.

Peter has just pointed out that utf8.pm does exactly one thing: it
pushes a :utf8 (*not* :encoding(utf8): this is important, as it means
your source mustn't contain invalid sequences) layer on the DATA
filehandle at the point it is called. Everything else that happens is
simply a natural consequence of that.
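
To make the distinction concrete, here is a minimal sketch in Python, used
only as an analogy: Python's codecs stand in for Perl's I/O layers, and the
sample line and the helper name decodes_cleanly are made up for
illustration. It shows the same source line seen as raw bytes and as decoded
UTF-8 characters, and what a validating decoder does with a malformed
sequence. That validation is the check which, as described above, a plain
:utf8 layer does not perform for you.

```python
# Analogy only: this is not how perl implements the pragma.
# A hypothetical line of source containing one non-ASCII character.
source_bytes = 'my $name = "é";'.encode("utf-8")

# Read without decoding, "é" is two separate bytes.
byte_len = len(source_bytes)

# Decoded as UTF-8, "é" is a single character.
char_len = len(source_bytes.decode("utf-8"))


def decodes_cleanly(raw: bytes) -> bool:
    """Return True if raw is well-formed UTF-8 (strict decoding)."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The same line saved as Latin-1: the byte 0xE9 starts a multi-byte
# sequence in UTF-8 but is followed by '"', so it is malformed.
invalid = b'my $name = "\xe9";'

print(byte_len, char_len)
print(decodes_cleanly(source_bytes), decodes_cleanly(invalid))
```

A validating decoder refuses the second line outright. With a layer that
does not validate, keeping such bytes out of the file is the author's job.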

You are contradicting yourself. First you say that English is the only
language that fits nicely into US-ASCII, then you say that US-ASCII is
called US-ASCII by accident. It isn't. US-ASCII was developed by an
American institute to write English texts. It is no accident at all that
it only contains characters which are frequently used in English
(technical) texts. And it is no accident that it is called ASCII -
"American Standard Code for Information Interchange". The US- in front
is somewhat redundant, but there were a lot of variants of ASCII (e.g., the
ISO-646 encodings), so that serves as a reminder that this is indeed the
original American version of the American code.

(Maybe I wasn't verbose enough this time.) English fits in a 7-bit
encoding, whatever encoding that would have been. It could have been any
other encoding (I did some reading about ASCII history (yes, I know
wikipedia is a vague source)). It could not have been any other language.

Admittedly I don't speak it, but I'm fairly sure modern Greek would fit
into 7 bits (if one didn't need to also encode Roman characters).
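
A rough count backs this up. The sketch below is a back-of-the-envelope
check, not a description of any real 7-bit Greek code: it assumes the
non-letter positions of ASCII are kept as they are, and that accents are
encoded as separate marks rather than as precomposed letters. The letter
ranges are taken from the Unicode Greek block.

```python
CODE_POINTS_7BIT = 2 ** 7  # 128 positions in any 7-bit code

# ASCII spends 52 positions on Latin letters; everything else
# (control codes, digits, punctuation, space) takes the rest.
ascii_letters = 26 * 2
ascii_non_letters = CODE_POINTS_7BIT - ascii_letters

# Capitals U+0391..U+03A9 (U+03A2 is unassigned) and
# small letters U+03B1..U+03C9 (includes final sigma).
greek_upper = [chr(cp) for cp in range(0x391, 0x3AA) if cp != 0x3A2]
greek_lower = [chr(cp) for cp in range(0x3B1, 0x3CA)]
greek_letters = len(greek_upper) + len(greek_lower)

# Assumption: swap the Latin letters for Greek ones, keep the rest.
needed = ascii_non_letters + greek_letters
spare = CODE_POINTS_7BIT - needed

print(greek_letters, needed, spare)
```

Under those assumptions the basic alphabet fits with a few positions left
over for accent marks. Precomposed accented letters would need more room.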

If you can write your programs in English, please do. Especially if you

That "if" (the latter one) is somewhat offensive.

plan to make it open source. Almost every programmer in the world has at

That "open" is somewhat offensive.

How so? Those planning to release as open source need to be more
careful to make their code comprehensible to people they've never met.
Someone writing internal proprietary code for a company where $Language
is spoken can reasonably assume any maintenance programmer will speak
$Language; this doesn't apply to open-source code.

least a basic grasp of English. But if for some reason you have to write

"Quotation needed (tm)". Or define "programmer".

Name two languages whose primary documentation isn't in English. (Two
because I can name one: Ruby. Even so, I would wager most Ruby code is
written in English.)

Ben
