Re: Converting codepages to UTF8



Dr.Ruud wrote:
P wrote:

Is there a Perl module that converts text from a code page
(such as the one reported by "chcp" in a command prompt) to
UTF8? Something that lets me specify, for example, code page
437 and then convert the text to UTF8. I've looked through
the documentation for the Encode module, but it doesn't seem
to deal with code pages at all.


chcp is a command to change the parameters of the display.

C:\>chcp /?
Displays or sets the active code page number.

CHCP [nnn]

nnn Specifies a code page number.

Type CHCP without a parameter to display the active code
page number.


Yes, if you call chcp without a parameter you can find out
the active code page. I need that information to know which
encoding I'm converting from.
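
To illustrate what I mean, here is a minimal sketch of turning the
text printed by chcp into an encoding name. It assumes that Encode
knows the code page under a name of the form "cp" plus the number
(Encode::Byte lists cp437, cp850 and others). The helper name
codepage_to_encoding is my own invention, not part of any module.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: map the text printed by chcp to an Encode
# name such as "cp437". Returns nothing if no usable number is found.
sub codepage_to_encoding {
    my ($chcp_output) = @_;
    # Take the last run of digits; the wording before it is localised.
    my ($number) = $chcp_output =~ /(\d+)\D*\z/
        or return;
    # find_encoding returns undef when Encode has no such encoding.
    my $enc = Encode::find_encoding("cp$number")
        or return;
    return $enc->name;
}

# On Windows the input would come from something like: my $out = `chcp`;
# A fixed sample string is used here so the sketch runs anywhere.
print codepage_to_encoding("Active code page: 437"), "\n";
```

Whether the console code page reported by chcp is also the encoding
of the file names that readdir returns is a separate question, and
I'm treating it as an assumption here.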


What do you want to do? If you want to convert a file from
one encoding to another, look for 'iconv'.


That's not exactly what I want to do. I have a file, encoded
in UTF8, that contains a set of strings. I want to determine
whether any of those strings matches a file name in a given
directory. Since there can be special characters in the file
names (and in the strings in the UTF8 file), I sometimes get
false negatives: a simple eq between a string from the UTF8
file and a file name from the directory fails because the
two are in different encodings. So I want to normalise the
directory listing first (this has to depend on the code
page, because different users might be using different code
pages) and then compare the resulting list to the list in
the UTF8 file. Does that make sense? :)
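
Roughly, the comparison I have in mind could be sketched like this.
It assumes the encoding of the file names is already known and that
Encode can decode it. The name matching_names and the sample data
are made up for illustration. Both sides are decoded to Perl
character strings before eq-style matching, instead of comparing raw
octets.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: return the wanted strings that match a file name.
#   $wanted_octets: array ref of UTF-8 encoded byte strings
#   $name_octets:   array ref of raw file names, e.g. from readdir
#   $codepage:      Encode name for the file names (assumed known)
sub matching_names {
    my ($wanted_octets, $name_octets, $codepage) = @_;
    # Decode the file names to character strings and index them.
    my %seen = map { decode($codepage, $_) => 1 } @$name_octets;
    # Decode the wanted strings and keep those present in the index.
    return grep { $seen{$_} } map { decode('UTF-8', $_) } @$wanted_octets;
}

# Sample data: the same name with an e-acute, in two encodings.
my @wanted = ("caf\xC3\xA9.txt", "readme.txt");   # UTF-8 octets
my @names  = ("caf\x82.txt", "notes.txt");        # cp437 octets

binmode STDOUT, ':encoding(UTF-8)';
print "$_\n" for matching_names(\@wanted, \@names, 'cp437');
```

If the two sources might differ in composed versus decomposed
characters, applying NFC from Unicode::Normalize to both sides before
comparing may also be needed, but I haven't checked whether that
applies in my case.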


Thank you for your input.


--
Best regards,
Angela Druss
