Re: Print Spanish characters in Perl?
- From: Ben Bullock <benkasminbullock@xxxxxxxxx>
- Date: Sat, 21 Jun 2008 00:12:53 +0000 (UTC)
On Fri, 20 Jun 2008 04:37:34 +0000, Jürgen Exner wrote:
> DanB <dbxxxxxxx@xxxxxxxxx> wrote:
>> I am trying to build a set of Spanish flash cards using TK, and I
>> need to be able to display the accented characters. I know that I
>> need to specify them in some unicode besides utf-8,
> Actually, you don't. Just put them into your code in your favourite
> editor and treat them like any ASCII character.
I suggest *not* doing this. Instead, put
use utf8;
at the top of your script and ensure that the file is saved as UTF-8.
> Problems arise only if your editor saves the file in a different
> encoding than your display device expects.
You are just playing Russian Roulette with the encodings. Playing
Russian Roulette might be a safe hobby. After all there was no bullet
there last time you pulled the trigger, so this time you'll be safe
too. In the same way, you can cross your fingers and hope the
encodings match.
> Typical examples are
> saving as UTF-8, then including the text in an HTML page but
> forgetting to specify UTF-8 as charset.
To avoid this kind of problem, make sure that Perl decodes the string
literals in your source into its internal character format with
use utf8;
and always specify the output encoding you want, for example one of
binmode STDOUT, ":encoding(UTF-8)";   # UTF-8 terminal
binmode STDOUT, ":encoding(cp850)";   # Western European DOS window
or
open my $file, ">:encoding(iso-8859-1)", "filename" or die $!;
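As a rough illustration of why the output layer matters, the sketch below
writes the same one-character string through three encoding layers and
prints the resulting bytes in hex. The in-memory filehandles and the
%bytes_for hash are only stand-ins chosen for this example, and the script
assumes that it is itself saved as UTF-8.

```perl
use strict;
use warnings;
use utf8;    # assumption: this script is saved as UTF-8

my $text = "ñ";    # one character, U+00F1

# Write the same character through three different encoding layers.
# In-memory filehandles stand in for STDOUT or a real file.
my %bytes_for;
for my $encoding ('UTF-8', 'cp850', 'iso-8859-1') {
    my $buffer = '';
    open my $out, ">:encoding($encoding)", \$buffer
        or die "Cannot open in-memory file: $!";
    print {$out} $text;
    close $out or die "Cannot close in-memory file: $!";

    # Record the bytes actually written, as hex, for comparison.
    $bytes_for{$encoding} =
        join ' ', map { sprintf '%02x', ord($_) } split //, $buffer;
}

print "$_: $bytes_for{$_}\n" for sort keys %bytes_for;
```

The same character should come out as c3 b1 in UTF-8, a4 in CP850 and f1
in ISO-8859-1, which is why a mismatch between the layer and the display
shows up as garbage.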
If you need to pass a string to a module which expects bytes in some
particular encoding rather than Perl character strings (there are lots
of these), then you can encode it into whatever the module wants with
use Encode qw(encode decode);
my $bytes = encode("cp850", $string);
Similarly, there is
my $string = decode("cp850", $bytes);
to go the other way.
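A minimal round-trip sketch, assuming the script is saved as UTF-8; the
variable names are made up for this example. It encodes a character
string into CP850 and UTF-8 bytes and decodes the CP850 bytes back again.

```perl
use strict;
use warnings;
use utf8;    # assumption: this script is saved as UTF-8
use Encode qw(encode decode);

my $string      = "ñÑÜü";                      # four characters
my $cp850_bytes = encode('cp850', $string);    # one byte per character
my $utf8_bytes  = encode('UTF-8', $string);    # two bytes per character
my $round_trip  = decode('cp850', $cp850_bytes);

printf "characters: %d, cp850 bytes: %d, utf-8 bytes: %d, round trip %s\n",
    length($string), length($cp850_bytes), length($utf8_bytes),
    ($round_trip eq $string ? 'ok' : 'FAILED');
```

The byte strings are what you would hand to a module that wants CP850 or
UTF-8 input; the decoded string is what the rest of your own code should
work with.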
I recommend that you keep all the Perl code which is under your
control in UTF-8, and not use anything else. Always put
use utf8;
at the top of the script. If your editor accidentally saves the file
in a non-UTF-8 format, then when you try to compile your Perl script
you'll get lots of messages like
Malformed UTF-8 character (unexpected non-continuation byte 0xd1,
immediately after start byte 0xf1) at ./encodings.pl line 8.
Malformed UTF-8 character (unexpected non-continuation byte 0xdc,
immediately after start byte 0xd1) at ./encodings.pl line 8.
Malformed UTF-8 character (unexpected non-continuation byte 0xfc,
immediately after start byte 0xdc) at ./encodings.pl line 8.
Malformed UTF-8 character (5 bytes, need 6, after start byte 0xfc) at
./encodings.pl line 8.
So
use utf8;
means that Perl can protect you against accidents with text editors.
Here is a small sample script to test it out with:
#!/usr/local/bin/perl
use utf8;
binmode STDOUT,":encoding(cp850)";
my $spanish = "ñÑÜü¿¡«»\n";
print $spanish;
You don't even need to "use warnings;" to get the error messages.
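To see what "use utf8;" changes when the file really is saved as UTF-8,
here is a small sketch (the variable names are only illustrative). The
pragma is lexically scoped, so the same literal can be compiled once with
it and once without it.

```perl
use strict;
use warnings;

my ($with, $without);
{
    use utf8;     # literals in this block are decoded from UTF-8
    $with = length "ñ";
}
{
    no utf8;      # literals in this block stay as raw bytes
    $without = length "ñ";
}

# Assumption: this script is saved as UTF-8.
print "with use utf8: $with, without: $without\n";
```

With the pragma the literal is one character; without it Perl sees the
two bytes of the UTF-8 encoding, and any output layer would then encode
each of those bytes separately.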
> In this case the browser
> defaults to ISO-Latin-1 and the non-ASCII characters will be messed
> up, of course. Or saving the file as Windows-1252 (or ISO-Latin-1)
> and then viewing the output in a DOS Window which for western
> languages uses OEM CP 850.
The attitude of a lot of people towards encodings seems to be to just
ignore the problem and hope it will go away. That's OK as long as your luck
holds. If you are lucky, the encoding you used for the script may happen
to be the same one as the web browser or whatever expects. If you are
unlucky then they won't be the same. Then you'll get strange bugs, and
you won't know why.
So I suggest that unless you only use ASCII, you should get encodings
under close control. Specify the encoding of your Perl script with
use utf8;
and then specify exactly how you want to encode input and output at each
point. Then there won't be so many unexpected things waiting for you next
time something goes wrong with your code.
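As a sketch of specifying the encoding at both ends, the following reads
a line through a CP850 input layer and writes it back out through a UTF-8
output layer. The in-memory filehandles and the sample input bytes are
assumptions made for the example; in a real script they would be files,
STDIN or STDOUT.

```perl
use strict;
use warnings;

# Hypothetical input: the bytes a CP850 source would send for "ñ\n".
my $input_bytes = "\xa4\n";

# Decode on the way in ...
open my $in, '<:encoding(cp850)', \$input_bytes
    or die "Cannot open in-memory input: $!";
my $line = <$in>;
close $in or die "Cannot close in-memory input: $!";
chomp $line;

# ... and encode on the way out.
my $output_bytes = '';
open my $out, '>:encoding(UTF-8)', \$output_bytes
    or die "Cannot open in-memory output: $!";
print {$out} $line;
close $out or die "Cannot close in-memory output: $!";

printf "read %d character (U+%04X), wrote %d bytes\n",
    length($line), ord($line), length($output_bytes);
```

Between the two layers the program only ever handles character strings,
so nothing in the middle has to care which encodings are in use.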