Re: Creating UNICODE filenames with PERL 5.8
From: Allan Yates (allan_at_yates.ca)
Date: 11/17/03
- Next message: John J. Trammell: "Re: What to use when rename fails?"
- Previous message: James E Keenan: "Re: regexp on HTML"
- In reply to: Alan J. Flavell: "Re: Creating UNICODE filenames with PERL 5.8"
- Next in thread: Ben Liddicott: "Re: Creating UNICODE filenames with PERL 5.8"
- Reply: Ben Liddicott: "Re: Creating UNICODE filenames with PERL 5.8"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: 17 Nov 2003 13:01:08 -0800
The key was the missing "-C". I didn't clue in from the documentation
that this was important. Once I added that command line parameter, the
file was created with the correct name.
My next step was to read the file name from the directory. However, I
thought I read in some documentation somewhere that 'readdir' is not
UNICODE aware. I seemed to prove this by reading the directory
containing the file I just created. It comes back with a two character
file name whose characters 'ord' to 0xd8 and 0xb6, as you indicated.
Do you know of a method of reading directories to get the UNICODE file
names?
Thanks,
Allan.
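
A note for readers following the thread: the two characters that 'ord' to 0xd8 and 0xb6 are the UTF-8 octets of U+0636, so one possible workaround is to decode whatever 'readdir' hands back. The following is only a minimal sketch, assuming the directory entries really do arrive as UTF-8 octets (as observed in this post; that is not guaranteed on every Perl build or file system). The helper name decode_entry is hypothetical; the decoding itself uses the core Encode module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: treat a name returned by readdir as UTF-8 octets
# and turn it into a Perl character string.
sub decode_entry {
    my ($octets) = @_;
    return decode('UTF-8', $octets);
}

# The two octets reported in the post above.
my $raw  = "\xd8\xb6";
my $name = decode_entry($raw);

# Report how many octets went in and how many characters came out.
printf "octets: %d, characters: %d, code point: U+%04X\n",
    length($raw), length($name), ord($name);
```

In a real directory scan the same helper could be applied to each entry, for example with map { decode_entry($_) } readdir($dh), again assuming the entries are UTF-8 octets.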
"Alan J. Flavell" <flavell@ph.gla.ac.uk> wrote in message news:<Pine.LNX.4.53.0311171438540.22311@ppepc56.ph.gla.ac.uk>...
> On Mon, 17 Nov 2003, Allan Yates wrote:
>
> > I have been having distinct trouble creating file names in PERL
> > containing UNICODE characters. I am running ActiveState PERL 5.8 on
> > Windows 2000.
>
> N.B. I have limited expertise in this specific area, but some of the
> locals around here seem to look to me to answer Unicode questions of
> any kind, so I'll give it a try, as long as you take the answers with
> the necessary grains of salt...
>
> First important question is - have you set the option for wide
> character API in system calls?
>
> > For a simple test, I picked a UNICODE character that could be
> > displayed by Windows Explorer. I can select the character(U+0636) from
>
> that'd be Arabic letter DAD, right?
>
> Its utf-8 representation will be two octets: 0xd8, 0xb6.
>
> > 'charmap' and cut/paste into a filename on Windows Explorer and the
> > character displays the same as it does in 'charmap'. This proves that
> > I have the font available.
>
> (I think that's the least of your worries at the moment...)
>
> > When I attempt to create the same filename with PERL, I end up with a
> > filename two characters long: Ø¶
>
> Those look like 0xd8 and 0xb6 to me...
>
> At a quick glance, I suspect we are seeing the pair of octets that
> represent the character in utf-8 (Perl's internal representation)
> rather than as what Win32 would use, which AIUI is utf-16LE (which in
> this case would come out as 0x3606, IINM). However, I'm not sure that
> (other than for diagnostic purposes) you should ever need to tangle
> with it in that form, since Perl ought to know what to do in a (wide)
> system call.
>
> The system call is evidently treating them as two one-byte characters,
> hence my question about wide system calls. Look for the reference to
> wide system calls in the perlrun page, and the other references to
> which it links.
>
> > If somebody could point me in the correct direction, I would very much
> > appreciate it. I have read the UNICODE documents included with PERL as
>
> OK, but there are also some Win32-specific documents/web-pages that
> come with the ActivePerl distribution. In some situations they might
> be just what you need.
>
> > well as searching the newsgroups and the web, and everything appears to
> > indicate this should work.
>
> If the above is not the answer, then maybe Win32API::File has
> something for you - but I've never been there myself, so don't pay too
> much attention to that.
>
> > Perl program:
>
> But did you start it with the -C option, or set the wide system calls
> thingy? I think that may prove to be the key.
>
> Good luck, and please report your findings.
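
As an illustration of the octet values discussed in the quoted reply, here is a small sketch that prints the UTF-8 and UTF-16LE octets of U+0636 using the core Encode module. The helper name hex_octets is hypothetical, and the sketch only shows the encodings themselves; it says nothing about what a particular Perl build passes to the Win32 file APIs.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $char = "\x{0636}";    # ARABIC LETTER DAD

# Hypothetical helper: encode $string and return its octets as hex values.
sub hex_octets {
    my ($encoding, $string) = @_;
    my $octets = encode($encoding, $string);
    return join ' ', map { sprintf '0x%02x', ord($_) } split //, $octets;
}

print "UTF-8:    ", hex_octets('UTF-8',    $char), "\n";
print "UTF-16LE: ", hex_octets('UTF-16LE', $char), "\n";
```

The UTF-8 line shows the pair 0xd8 0xb6 seen in the bad file name, while the UTF-16LE line shows the low octet first (0x36 0x06), which is the form the wide Win32 calls would expect.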