Re: how is the string encoded



On Jan 3, 10:25 am, Ben Morrow wrote:

That perl (5.8.8) is very nearly six years old. You should
upgrade to at least 5.12.


I wonder whether you realize how difficult, sometimes impossible, an
upgrade can be. Say I am on a 3-month contract. The employer has been
managing for years with 5.8.8 and is unlikely to upgrade in such a
case. I was once stuck with a MySQL server that was many years old, and
my boss was more concerned with preserving his own job than with asking
his BOSS to spend time and money on upgrading. Not that the suggestion
to upgrade is wrong or anything.


If you don't 'use utf8', perl assumes your source is in ISO8859-1. If
you do, it assumes your source is in UTF-8. (In theory you can use other
encodings with the 'use encoding' pragma, but AIUI this doesn't work
reliably.)
...
What did you expect to happen? perldoc utf8 quite clearly says
    Do not use this pragma for anything else than telling Perl that your
    script is written in UTF-8.
so if you 'use utf8' and your source isn't, in fact, *in* UTF-8, you
must expect warnings and misbehaviour.


It is very useful to know that perl assumes the source to be
ISO8859-1. Arguably, 'use utf8' works counter-intuitively. Since my
code is ASCII, and all ASCII is automatically valid UTF-8, I tend to
wonder why I would ever write non-ASCII code. Reaching for the pragma
may not be the logical thing to do, but I daresay it is the instinctive
one: if I want to dabble in utf8 or databases, what do I do? I think of
'use utf8' or 'use DataBaseInterface DBI'.
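
For anyone else who reads the pragma name the way I did, here is a
minimal sketch of what 'use utf8' actually changes, namely how string
literals in the source file are read. It assumes the script itself is
saved as UTF-8; the variable names are only for illustration.

```perl
use strict;
use warnings;

# Assumption: this file is saved as UTF-8, so the literal below is
# stored on disk as the two bytes 0xC3 0xA9.

# Without the pragma, perl reads each byte as one ISO8859-1 character.
my $without = do { no utf8; "é" };

# With the pragma, perl decodes the two bytes into one character (U+00E9).
my $with = do { use utf8; "é" };

printf "without 'use utf8': length %d\n", length $without;
printf "with 'use utf8':    length %d\n", length $with;
```

Pure ASCII source reads the same either way, so the pragma makes no
difference to a script like mine.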

What I needed was 'use Encode' which is what I am doing now.
Thanks for all the responses.
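
In case it helps someone searching the archive, this is roughly the
shape of what I ended up with. It is only a sketch: the byte string is
a made-up stand-in for data arriving from a file or database, and I am
assuming that data is UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical input: the UTF-8 bytes for "é" (U+00E9) as they might
# arrive from outside the program.
my $bytes = "\xC3\xA9";

# Bytes -> Perl character string.
my $chars = decode('UTF-8', $bytes);

# Character string -> bytes again, ready for output.
my $out = encode('UTF-8', $chars);

printf "bytes in:   %d\n", length $bytes;
printf "characters: %d (U+%04X)\n", length $chars, ord $chars;
printf "bytes out:  %d\n", length $out;
```

The point is that decoding and encoding happen on the data at the
program's edges, and none of it needs 'use utf8' because the source
stays ASCII.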
