Re: Filenames in Ada
- From: Martin Krischik <krischik@xxxxxxxxxxxxxxxxxxxxx>
- Date: Sun, 27 Nov 2005 11:21:19 +0100
Björn Persson wrote:
> Let's see if I understand the problem. Windows has two functions for
> each file operation, one -A version that expects or returns a file name
> in some 8-bit encoding like Windows-1252, and one -W version that
> expects or returns a file name in UTF-16 or maybe UCS-2?
Well, the Windows APIs in question were designed at a time when UTF-16 and
UCS-2 were still the same - that is, Unicode had no codes defined above the
65535 border. At that time programmers did not care about - or understand -
the difference between the two.
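To make the difference concrete, here is a minimal sketch in Python (used only for illustration, it has nothing to do with the Windows API itself; the helper names `utf16_code_units` and `fits_in_ucs2` are made up for this example). A character below the 65535 border takes one 16-bit code unit in either encoding, while a character above it needs a surrogate pair in UTF-16 and cannot be expressed in UCS-2 at all:

```python
def utf16_code_units(text: str) -> int:
    """Return how many 16-bit code units UTF-16 needs for text."""
    # "utf-16-le" writes no byte order mark, so every 2 bytes is one code unit.
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character lies within the 16-bit range UCS-2 can hold."""
    return all(ord(ch) <= 0xFFFF for ch in text)


below_border = "\u00e4"       # LATIN SMALL LETTER A WITH DIAERESIS
above_border = "\U0001d11e"   # MUSICAL SYMBOL G CLEF, beyond 65535

print(utf16_code_units(below_border), fits_in_ucs2(below_border))
print(utf16_code_units(above_border), fits_in_ucs2(above_border))
```

For any name that passes `fits_in_ucs2`, the two encodings produce identical bytes, which is why the distinction did not matter when the API was designed.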
VFAT-32 is most likely a UCS-2 filesystem (anyone from China to confirm
that?). I remember an article about the "new" VFAT technology wasting an
"enormous" amount of storage by using UCS-2 for character encoding.
Obviously the article came from a Latin-1 based country ;-) .
> And all the
> file operations in the Ada library take and return file names as String,
> that is, Latin-1? And Gnat's implementation pretends that Latin-1 is
> identical to whatever 8-bit encoding Windows is using, and passes these
> Strings to Windows' -A functions, leaving you with no way to handle
> filenames that can't be expressed in said 8-bit encoding? Is that right?
Yes indeed. But I take it that on a Russian system the Windows-1251 code
page is active and all filenames are expressed using that and not Latin-1.
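A small sketch of what that mismatch looks like, again in Python purely for illustration (the file name is invented, and this only models the byte-level effect, not what Gnat actually calls). Every byte Windows-1251 uses is also a valid Latin-1 byte, so the name survives the trip through a String unharmed, but read as Latin-1 it shows different characters:

```python
name = "Привет.txt"  # hypothetical file name on a Russian system

# Bytes as an 8-bit (-A style) interface would carry them under Windows-1251.
raw = name.encode("cp1251")

# The same bytes interpreted as Latin-1, i.e. what an Ada String would "mean".
# as_latin1 is now "Ïðèâåò.txt".
as_latin1 = raw.decode("latin-1")

# The mapping is lossless: reinterpreting the bytes gives the name back.
round_trip = as_latin1.encode("latin-1").decode("cp1251")

print(raw)
print(round_trip == name)
```

So the bytes pass through intact as long as nobody treats the String as real Latin-1 text; the trouble starts with names that the active code page cannot express at all.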
> It is my intention to add an encoding-aware interface to Ada.Directories
> under EAstrings.OS. For that to work reasonably on Windows, this problem
> needs to be solved. I suppose I also need to fix this in EAstrings.IO. I
> will need help from a Windows programmer to do this. (Of course I also
> need to get transcoding implemented on Windows before EAstrings will be
> of any use there.)
It is sad that XML/Ada has no UCS-2 and UCS-4 conversion available - but
AdaCL already has that - so no problem for you really.
> It seems that the right thing to do would be to tap into the Gnat
> library and make UTF-16 (or UCS-2) versions of the file operations. It
> could be as easy as changing the parameter type and replacing calls to
> the Windows functions with their -W equivalents, or it could be very
> hairy.
I had that idea as well and did take a look. Lots of "pragma Import" there.
> We'll need to determine whether it is UTF-16 or UCS-2. This page lists
> code page numbers for a whole lot of encodings, but UTF-16 is missing:
>
> http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp
>
> I take that as a hint that UTF-16 is Windows' idea of wide strings, and
> that all the others are considered "multi-byte character sets" or
> whatever the term is.
Well, there seems to be a better article:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/fileio/fs/naming_a_file.asp
I wonder about that \\?\ stuff and what it really means.
Martin
--
mailto://krischik@xxxxxxxxxxxxxxxxxxxxx
Ada programming at: http://ada.krischik.com