character encodings

From: Jaap Karssenberg (j.g.karssenberg_at_student.utwente.nl)
Date: 11/30/03


Date: Sun, 30 Nov 2003 17:03:02 +0100

I have a script that should read UTF-8 encoded files, so I used
binmode(FILE, ':utf8'). It now appears that some users have Latin-2
encoded files, which cause some regexes to throw warnings about
malformed UTF-8 characters. Is there a way to detect the character
encoding and DWIM? I would hate to have to tell my users to convert
everything to UTF-8 first.
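
To make the desired behaviour concrete, here is a minimal sketch of one
possible fallback, not a definitive solution. It assumes the only two
encodings in play are UTF-8 and ISO-8859-2 (Latin-2), and that the file is
read as raw bytes rather than through the ':utf8' layer. The helper name
decode_guess is hypothetical; decode() and FB_CROAK come from the core
Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: try strict UTF-8 first, and fall back to
# ISO-8859-2 (Latin-2) when the bytes are not well-formed UTF-8.
# Assumption: these are the only two encodings that can occur.
sub decode_guess {
    my ($bytes) = @_;

    # decode() may modify its input when a CHECK value is given,
    # so work on a copy and keep the original bytes for the fallback.
    my $copy = $bytes;

    # FB_CROAK makes decode() die on malformed input instead of
    # silently substituting replacement characters.
    my $text = eval { decode('UTF-8', $copy, Encode::FB_CROAK) };
    return ($text, 'UTF-8') if defined $text;

    # Every byte sequence is valid in a single-byte encoding,
    # so this decode cannot fail.
    return (decode('iso-8859-2', $bytes), 'iso-8859-2');
}
```

It could be used by opening the file with the ':raw' layer, slurping the
bytes, and calling my ($text, $enc) = decode_guess($bytes). This is only a
heuristic: plain ASCII is valid in both encodings, and the check cannot
tell Latin-2 apart from other single-byte encodings such as Latin-1.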

-- 
   )   (     Jaap Karssenberg || Pardus [Larus]                | |0| |
   :   :     http://pardus-larus.student.utwente.nl/~pardus    | | |0|
 )  \ /  (                                                     |0|0|0|
 ",.*'*.,"   Proud owner of "Perl6 Essentials" 1st edition :)         

