Re: CLisp case sensitivity
From: Adam Warner (usenet_at_consulting.net.nz)
Date: 12/16/04
- Next message: Adam Warner: "Re: CLisp case sensitivity"
- Previous message: Pascal Bourguignon: "Re: CLisp case sensitivity"
- In reply to: Duane Rettig: "Re: CLisp case sensitivity"
- Next in thread: Duane Rettig: "Re: CLisp case sensitivity"
- Reply: Duane Rettig: "Re: CLisp case sensitivity"
Date: Fri, 17 Dec 2004 00:30:51 +1300
Hi Duane Rettig,
Many thanks for the thoughtful reply.
>> It just so happens to have a length of 4 octets in
>> my current locale, UTF-8, a locale that a Unicode Lisp should understand.
>
> No, Ansi Common Lisp does not define "locale", and makes no requirement
> that any conforming implementation support it.
>
>> To the Lisp user you should be presenting a single character with
>> CHAR-CODE of #x10000. If you can not there is no consistency between
>> Unicode 3.1+ implementations of ANSI Common Lisp.
>
> Ansi Common Lisp makes no requirement that char-code-limit be any larger
> than 96. Thus, it explicitly allows conforming lisps not to support
> Unicode.
Of course. But we're discussing the parameters of what makes a conforming
Lisp implementation _also_ conforming with Unicode. If you're not claiming
that Allegro fully supports Unicode 3.1+ then there's no live issue
(because differing semantics are expected). However, I don't think you're
claiming this. See below for what I suspect may describe your position.
[big snip]
>> If (char-code (char [ ... ] 0)) is not 65536 then there will be
> ======================^^^^^^^ <== My editing
>> inconsistent results between ANSI Common Lisp Unicode string
>> implementations.
>
> For 7 bit lisps, char-code-limit is likely to be 128. For 8-bit
> lisps, that limit is likely 256. For 16-bit lisps, the limit is
> likely to be 65536. For any of these lisps, if you try (code-char N)
> where N > char-code-limit, then you are writing nonportable code.
[...]
>> Will you also please confirm that SETF
>> CHAR correctly handles the destructive modification of a supplementary
>> character with a Basic Multilingual Plane character (and vice versa)
>> within your internal string representation.
>
> You're barking up the wrong tree. I will confirm this:
>
> CL-USER(1): char-code-limit
> 65536
> CL-USER(2): (code-char 65536)
> NIL
> CL-USER(3):
>
> which is correct behavior. See
> http://www.franz.com/support/documentation/7.0/ansicl/dictentr/code-cha.htm
> and
> http://www.franz.com/support/documentation/7.0/ansicl/dictentr/char-cod.htm
OK, you've demonstrated that "no such character [with code 65536] exists
and one cannot be created, [so] nil is returned." You therefore don't have
an ANSI defined Common Lisp _character_ interface to Unicode supplementary
code points. You can encode them in strings. You just can't represent them
as characters (and therefore one can't, for example, LOOP ACROSS a string
and expect to have a supplementary character of-type CHARACTER returned).
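To make that point concrete, here is a minimal sketch of how portable code could test whether a given code point exists as a character object. The name REPRESENTABLE-CODE-POINT-P is hypothetical (it is not part of ANSI Common Lisp or of any vendor's API); it relies only on CHAR-CODE-LIMIT and CODE-CHAR as the standard describes them.

```lisp
;; Hypothetical helper, not part of ANSI CL or of Allegro CL.
;; Returns true only if CODE can be turned into a character object
;; in the running implementation.
(defun representable-code-point-p (code)
  "True if CODE is below CHAR-CODE-LIMIT and CODE-CHAR yields a character."
  (and (integerp code)
       (<= 0 code)
       (< code char-code-limit)      ; outside this range is not a character code
       (not (null (code-char code)))))

;; Example: on an implementation whose CHAR-CODE-LIMIT is 65536, this
;; returns NIL for #x10000; on one with a limit of at least #x110000
;; it would be expected to return T.
;; (representable-code-point-p #x10000)
```

The explicit range check comes first because calling CODE-CHAR with a value at or above CHAR-CODE-LIMIT is outside what the standard promises.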
But this doesn't necessarily mean Allegro doesn't fully support the latest
Unicode standard because fully supporting Unicode is an extension to ANSI
Common Lisp. According to this interpretation an implementation is free to
choose any character code limit so long as internally strings can encode
Unicode code points and extensions are provided to, e.g., access those
code points.
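As an illustration of what such an internal encoding might look like, the sketch below assumes a 16-bit implementation stores supplementary code points as UTF-16 surrogate pairs. That is my assumption for the example, not something confirmed about Allegro in this thread, and both function names are hypothetical.

```lisp
;; Hypothetical helpers showing UTF-16 surrogate arithmetic.
;; Assumption: supplementary code points are stored as two 16-bit units.
(defun code-point-to-surrogates (code-point)
  "Return the high and low surrogate code units for CODE-POINT as two values."
  (assert (<= #x10000 code-point #x10FFFF))
  (let ((offset (- code-point #x10000)))        ; 20-bit offset into the supplementary planes
    (values (+ #xD800 (ldb (byte 10 10) offset))   ; top 10 bits
            (+ #xDC00 (ldb (byte 10 0) offset))))) ; bottom 10 bits

(defun surrogates-to-code-point (high low)
  "Recombine a high/low surrogate pair into a single code point."
  (+ #x10000
     (ash (- high #xD800) 10)
     (- low #xDC00)))

;; (code-point-to-surrogates #x10000) returns #xD800 and #xDC00,
;; i.e. two code units where the user sees one character.
```

Under this assumption a string holding U+10000 has two elements as far as CHAR and LENGTH are concerned, which is why a vendor extension would be needed to read the code point back as a single unit.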
Unfortunately, this interpretation makes Unicode support vendor-specific
and potentially subject to vendor lock-in (I know this is furthest from
your mind and you've already raised the issue of making a "32-bit" version
of Allegro CL available, subject to customer demand).
It's unlikely to be in the interests of users to have fragmented Unicode
support when the ANSI standard defines a way to support all Unicode code
points via #\ notation, CODE-CHAR, CHAR-CODE, CHAR, CHAR-CODE-LIMIT, etc.
But so much else is already non-standard in Common Lisp that it would be
just another pity.
I hope we've reached a mutually acceptable understanding.
Regards,
Adam