Re: Filenames in Ada



Randy Brukardt wrote:

> "Martin Krischik" <krischik@xxxxxxxxxxxxxxxxxxxxx> wrote in message
> news:1653090.31FM62oI6I@xxxxxxxxxxxxxxxxxxxxxx
> ...
>> How does one deal with modern (utf-8) filenames in Ada?
>
> We talked about this briefly at a recent meeting. We decided it was too
> late to design a proper solution, and virtually all operating systems have
> a way to deal with this anyway. (That is, they take UTF-8 names.)
> There's no conceptual problem with putting UTF-8 into a value of type
> String, and as someone else showed, that's typically supported by Ada
> implementations.

Only Windows wants UTF-16 and extra APIs instead. And that is indeed the
problem I have.

Just a question: I get a NAME_ERROR exception in
Ada.Directories.More_Entries. Is that actually correct? Should the
exception not be raised in Ada.Directories.Simple_Name instead?

Only when I convert the name to a String does it become relevant that it
can't be represented in Latin-1.

See source in

http://cvs.sourceforge.net/viewcvs.py/wikibook-ada/demos/Source/make_m3u.adb?only_with_tag=HEAD&view=markup
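
A small sketch of the Latin-1 point above: whether a name fits into Latin-1
only matters at the moment it is converted. This is Python, not Ada, and it
does not show what any Ada runtime actually does inside
Ada.Directories; the helper name and the file names are made up for
illustration.

```python
def fits_latin1(name: str) -> bool:
    """Return True if every character of name has a Latin-1 code point."""
    try:
        name.encode("latin-1")
        return True
    except UnicodeEncodeError:
        # At least one character lies outside U+0000 .. U+00FF.
        return False


# Hypothetical file names, chosen only for illustration.
names = ["Für Elise.mp3", "Dvořák - Humoresque.mp3"]
results = {name: fits_latin1(name) for name in names}
```

The first name converts cleanly because "ü" is a Latin-1 character; the
second does not because "ř" is not. Merely enumerating the entries would not
need that conversion at all.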

> A bigger problem is that Ada (any flavor) has no built-in support for
> UTF-8. I think this is a mistake, but I didn't find much support for
> adding any packages.

Well, that can be rectified with external packages. Text_IO is not as easy
to replace.

> It's insane to use Wide_Wide_String to store any
> significant amount of text, as it would waste nearly 3/4s of the space in
> typical use, so you'd have to use something else (like UTF-8) for storage
> anyway.

Tell the C99 designers. They thought that wchar_t should be large enough to
hold all possible values - without defining what "all possible values"
actually are. As it is, GNU C defines it as 32 bits.
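
The "nearly 3/4" figure quoted above can be checked with a few lines. This is
a sketch in Python rather than Ada, assuming that a 32-bit character type
costs the same as UTF-32 and that the text is plain ASCII (the typical case
the quote refers to); the function name and the sample string are only
illustrative.

```python
def encoded_sizes(text: str) -> dict:
    """Return the byte count of text in three Unicode encodings (no BOM)."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


# Pure ASCII sample: one byte per character in UTF-8, four in UTF-32.
sample = "How does one deal with modern filenames in Ada?"
sizes = encoded_sizes(sample)

# Fraction of the UTF-32 storage that UTF-8 would not have needed.
wasted = 1 - sizes["utf-8"] / sizes["utf-32"]
```

For ASCII-only text the wasted fraction is exactly 3/4; text with many
non-ASCII characters takes more than one byte per character in UTF-8, so the
gap shrinks.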

> Since filenames are implementation-defined anyway, there isn't a whole lot
> of value to standardizing how they're written. So, it's just up to your
> implementer to do something appropriate.

As long as you don't "push them with the nose on it" as we say in Germany:

"An implementation may support Wide_String or Wide_Wide_String variants for
passing filenames if supported by the platform"

they won't do anything.

Martin
--
mailto://krischik@xxxxxxxxxxxxxxxxxxxxx
Ada programming at: http://ada.krischik.com