Opening Unicode files?
- From: Ilya Zakharevich <nospam-abuse@xxxxxxxxx>
- Date: Sun, 25 Dec 2011 01:52:10 +0000 (UTC)
Does Perl ship with a simple method of opening Unicode files? E.g., I
would like to have something like
open my $fh, '< :BOM0or(utf8)', $filename
where BOM0or does what Perl itself does for Perl files: it looks at the
first 4 bytes; given that a Perl file starts in ASCII, one can detect
BOMs, can detect UTF-16LE, UTF-16BE, UTF-32LE, UTF-32BE, or see that it
is none of the above (then the argument in parens explains what to do;
e.g., Perl itself does BOM0or(latin1)).
Likewise, if one does not know that the file starts in ASCII, one can
still detect a BOM (whose byte sequence rarely starts a file in the
encodings I know), so one could do :BOMor(utf8). I do not recollect seeing
such support for files open()ed by Perl programs; is there any?
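
To make the request concrete, here is roughly what I mean, done by hand. This is only a sketch under assumptions: open_bom_or and its three arguments are names made up for illustration (nothing Perl ships), the file is assumed to be a regular, seekable file, and the 4-byte patterns are the usual BOMs plus the zero-byte positions an ASCII first character would leave.

```perl
use strict;
use warnings;

# Hypothetical helper, not a core Perl function: open $filename for reading
# and pick the encoding from a BOM; when $ascii_first is true and there is
# no BOM, guess from the zero bytes among the first four bytes; otherwise
# fall back to $default (e.g. 'UTF-8' or 'latin1').
sub open_bom_or {
    my ($filename, $default, $ascii_first) = @_;
    open my $fh, '<:raw', $filename or die "Cannot open $filename: $!";
    my $got = read($fh, my $head, 4);
    die "Cannot read $filename: $!" unless defined $got;

    my ($enc, $skip) = ($default, 0);
    # UTF-32LE must be tested before UTF-16LE: both BOMs begin with FF FE.
    if    ($head =~ /\A\xEF\xBB\xBF/)     { ($enc, $skip) = ('UTF-8',    3) }
    elsif ($head =~ /\A\xFF\xFE\x00\x00/) { ($enc, $skip) = ('UTF-32LE', 4) }
    elsif ($head =~ /\A\x00\x00\xFE\xFF/) { ($enc, $skip) = ('UTF-32BE', 4) }
    elsif ($head =~ /\A\xFF\xFE/)         { ($enc, $skip) = ('UTF-16LE', 2) }
    elsif ($head =~ /\A\xFE\xFF/)         { ($enc, $skip) = ('UTF-16BE', 2) }
    elsif ($ascii_first) {
        # No BOM: an ASCII first character puts zero bytes in fixed places.
        if    ($head =~ /\A\x00\x00\x00[^\x00]/) { $enc = 'UTF-32BE' }
        elsif ($head =~ /\A[^\x00]\x00\x00\x00/) { $enc = 'UTF-32LE' }
        elsif ($head =~ /\A\x00[^\x00]/)         { $enc = 'UTF-16BE' }
        elsif ($head =~ /\A[^\x00]\x00/)         { $enc = 'UTF-16LE' }
    }

    # Rewind to just past the BOM (if any), then decode from there on.
    seek($fh, $skip, 0) or die "Cannot seek in $filename: $!";
    binmode($fh, ":encoding($enc)") or die "Cannot apply encoding $enc: $!";
    return $fh;
}
```

With this, open_bom_or($filename, 'utf8', 1) would play the role of BOM0or(utf8), and open_bom_or($filename, 'utf8', 0) the role of BOMor(utf8). The guess is not airtight: a UTF-16LE file whose BOM is followed by U+0000 looks the same as a UTF-32LE BOM.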