Re: create a utf8 file



On 8/31/07, Digger <digger@xxxxxxxxxxx> wrote:
Hello,

I want to create a utf8 file and write some content (like HTML code) to it.
How do I do it? Thanks.
snip

Modern Perl handles utf8 natively. You can do things like:

#!/usr/bin/perl

use strict;
use warnings;

open my $fh, ">", "outfile"
    or die "Could not open outfile for writing: $!\n";

my $poem =
"ᚠᛇᚻ᛫ᛒᛦᚦ᛫ᚠᚱᚩᚠᚢᚱ᛫ᚠᛁᚱᚪ᛫ᚷᛖᚻᚹᛦᛚᚳᚢᛗ\n" .
"ᛋᚳᛖᚪᛚ᛫ᚦᛖᚪᚻ᛫ᛗᚪᚾᚾᚪ᛫ᚷᛖᚻᚹᛦᛚᚳ᛫ᛗᛁᚳᛚᚢᚾ᛫ᚻᛦᛏ᛫ᛞᚫᛚᚪᚾ\n" .
"ᚷᛁᚠ᛫ᚻᛖ᛫ᚹᛁᛚᛖ᛫ᚠᚩᚱ᛫ᛞᚱᛁᚻᛏᚾᛖ᛫ᛞᚩᛗᛖᛋ᛫ᚻᛚᛇᛏᚪᚾ᛬\n";

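print $fh $poem;

# Checking close catches write errors that only show up when the buffer is flushed.
close $fh
    or die "Could not close outfile: $!\n";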

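without saying anything special and it just works: as long as the
script itself is saved as UTF-8, Perl writes the bytes of the literal
to the file unchanged. If you have HTML entities that you want to
translate into utf8 characters (or vice versa), there are modules on
CPAN that can help you (such as HTML::Entities*).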
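
For the entity case, here is a minimal sketch, assuming HTML::Entities
is installed from CPAN (it is not a core module). decode_entities and
encode_entities are functions the module exports; the sample strings
are made up for illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Assumption: HTML::Entities (part of the HTML-Parser distribution) is installed.
use HTML::Entities qw(decode_entities encode_entities);

# Entities -> characters: &eacute; becomes U+00E9, &amp; becomes "&".
my $decoded = decode_entities('caf&eacute; &amp; cr&egrave;me');

# Characters -> entities: by default, high-bit characters and <, >, &
# (plus quotes and control characters) are replaced.
my $encoded = encode_entities("caf\x{E9} <b>");

# Encode characters as UTF-8 on the way out to the terminal.
binmode STDOUT, ':encoding(UTF-8)';
print "$decoded\n";
print "$encoded\n";
```

The decoded string holds characters, so it still needs an encoding
layer (as with binmode here) when you print it to a file or terminal.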

From perldoc perluniintro:
Starting from Perl 5.8.0, the use of "use utf8" is no longer necessary.
In earlier releases the "utf8" pragma was used to declare that
operations in the current block or file would be Unicode-aware. This
model was found to be wrong, or at least clumsy: the "Unicodeness" is
now carried with the data, instead of being attached to the operations.
Only one case remains where an explicit "use utf8" is needed: if your
Perl script itself is encoded in UTF-8, you can use UTF-8 in your
identifier names, and in string and regular expression literals, by
saying "use utf8". This is not the default because scripts with legacy
8-bit data in them would break. See utf8.
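
As a sketch of that remaining case: assuming the script is saved as
UTF-8, "use utf8" makes the literal a string of characters, and an
explicit :encoding(UTF-8) layer on the filehandle encodes it on output.
The helper name write_utf8_file and the file name are made up for
illustration.

```perl
#!/usr/bin/perl

use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: write a character string to $path as UTF-8.
sub write_utf8_file {
    my ($path, $text) = @_;

    # The layer encodes characters to UTF-8 bytes as they are written.
    open my $fh, '>:encoding(UTF-8)', $path
        or die "Could not open $path for writing: $!\n";
    print {$fh} $text;
    close $fh
        or die "Could not close $path: $!\n";
    return;
}

my $line = "ᚠᛇᚻ᛫ᛒᛦᚦ\n";
write_utf8_file('outfile', $line);

# Under "use utf8", length counts characters, not bytes.
printf "%d characters\n", length $line;
```

The file on disk ends up the same as with the byte-oriented version,
but inside the program length, substr and regular expressions now see
whole characters instead of individual bytes.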

* http://search.cpan.org/~gaas/HTML-Parser-3.56/lib/HTML/Entities.pm

