Re: [newbie] Problems with character output.

From: Alan J. Flavell (flavell_at_ph.gla.ac.uk)
Date: 10/09/04

    Date: Sat, 9 Oct 2004 13:10:28 +0100
    
    

    On Sat, 9 Oct 2004, Reven wrote:

    > I've installed ActivePerl on win XP and I'm having some problems.
    > I've tried documentation at activestate but found nothing on this
    > topic.

    Your problem comes from using a command window, which by default
    effectively provides an MS-DOS environment.

    [...]
    > $u = "á"; # The value of this string is an "a" acute
    [...]
    > gives no result. Instead of an "a tilde"

    I think you mean "a-acute" (in iso-8859-1 or windows-1252 coding, that
    would be 0xE1).

    > I get a Greek Beta.

    Here's a clue. Visit
    http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/
    and inspect the CP437 (USA national DOS codepage) or CP850
    (multinational DOS codepage) tables, where you will find that 0xE1
    represents "small letter sharp-s" (the German double-s character),
    which has fooled you by looking rather like a Greek beta.
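
    To see the mismatch without reading the mapping tables by hand, here
    is a minimal sketch using the core Encode module. It decodes the
    single byte 0xE1 under each of the four encodings mentioned in this
    thread and reports the resulting Unicode character. The helper name
    codepoint_for is made up for illustration; it is not part of Encode.

```perl
use strict;
use warnings;
use Encode qw(decode);
use charnames ();

# Hypothetical helper (not part of Encode): decode one byte under the
# named encoding and return the Unicode code point it maps to.
sub codepoint_for {
    my ($encoding, $byte) = @_;
    my $octets = $byte;                  # work on a copy of the input
    return ord decode($encoding, $octets);
}

for my $encoding (qw(iso-8859-1 cp1252 cp437 cp850)) {
    my $cp = codepoint_for($encoding, "\xE1");
    # Print the character's name rather than the character itself, so
    # the output does not depend on the console's own code page.
    printf "%-10s 0xE1 -> U+%04X %s\n",
        $encoding, $cp, charnames::viacode($cp);
}
```

    The two Windows/ISO encodings should report U+00E1 (LATIN SMALL
    LETTER A WITH ACUTE), and the two DOS code pages U+00DF (LATIN SMALL
    LETTER SHARP S).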

    > I've also tried "iso-8859-1" and "windows-1252", but to the same
    > effect. I'm quite lost. This is just a wild guess: could there be
    > any problem with the console itself?

    In a sense, yes. This is a longstanding misunderstanding, which
    Microsoft have not put much effort into documenting for the end user:
    right from the start of MS Windows, the DOS command window has
    implemented the MS-DOS character "code pages", which pre-date current
    8-bit character coding conventions (such as iso-8859-x and
    windows-125y for various x and y).
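
    One possible workaround that follows from this, sketched here under
    assumptions rather than offered as a tested fix for XP: convert the
    output to whatever code page the console is actually using. The
    example assumes that is cp850 or cp437 (running "chcp" in the command
    window reports the active one), and console_bytes is a hypothetical
    helper name.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper: turn a Perl character string into the bytes
# that a console using $codepage expects to receive.
sub console_bytes {
    my ($text, $codepage) = @_;
    return encode($codepage, $text);
}

my $a_acute = "\x{E1}";    # U+00E1, LATIN SMALL LETTER A WITH ACUTE

for my $codepage (qw(cp437 cp850 cp1252)) {
    my $bytes = console_bytes($a_acute, $codepage);
    my $hex   = join ' ', map { sprintf '0x%02X', ord } split //, $bytes;
    printf "%-7s a-acute is sent as: %s\n", $codepage, $hex;
}
```

    Under both DOS code pages the a-acute is sent as 0xA0 rather than
    0xE1. In a real script, binmode(STDOUT, ':encoding(cp850)') would
    apply the same conversion to everything printed, assuming cp850 is
    the code page in effect.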

    > Could I be *so* lucky to find a bug?

    At best it could be described as "documented as broken", but the
    documentation is very hard to find if you don't know what you're
    looking for. I haven't studied this issue specifically in XP, but I
    first met it in Win95, and again later (and somewhat differently) in
    Win/NT4.

    You may be able to fool it by changing your DOS window font from its
    initial setting (does yours say "Raster Fonts", as my Win2000 system
    does?) to, e.g., "Lucida Console". However, doing that globally
    might have some unpleasant effects on any software which was actually
    designed to run under DOS (for example, DOS box-drawing characters
    will come out funky).

    There may be some useful terms in this posting that you can use to
    Google for other answers related to this issue. Good luck.

