Re: lsort -dictionary




Donal Fellows wrote:
>Darren New wrote:
>> I don't think there is any standard meaning of "dictionary", and
>> certainly not one that spans languages.

>It's been well known for a long time that string collation is
>locale-specific. For example, German and Swedish handle umlauts
>entirely differently, and the rules for Turkish, though very logical in
>their own terms, are really very odd by comparison with most European
>languages.

True. Indeed, there are languages that do not have a single
ordering system. Japanese, for example, has two entirely different
orders for kana, between which one may choose.

I've never thought of the rules for Turkish as all that odd, though.
The alphabetical order is basically the same as for English,
with letters with diacritics sorting after their diacritic-less
counterparts. What are you thinking of?
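To make the Turkish case concrete, here is a minimal sketch in Python that sorts by position in the conventional 29-letter Turkish alphabet. The helper names (turkish_lower, turkish_key) are made up for illustration, and placing characters from outside the alphabet last is an assumption of the sketch, not a rule of Turkish. Note that in the conventional alphabet dotless ı comes before dotted i.

```python
# Conventional order of the 29-letter Turkish alphabet (lowercase).
TURKISH_ALPHABET = "abcçdefgğhıijklmnoöprsştuüvyz"
RANK = {ch: i for i, ch in enumerate(TURKISH_ALPHABET)}

def turkish_lower(word):
    # Turkish case mapping pairs I with ı and İ with i, so handle those
    # two letters explicitly before falling back to str.lower().
    return word.replace("I", "ı").replace("İ", "i").lower()

def turkish_key(word):
    # Hypothetical helper: map each letter to its rank in the alphabet.
    # Characters outside the alphabet sort last (an assumption of this sketch).
    return [RANK.get(ch, len(RANK)) for ch in turkish_lower(word)]

words = ["şeker", "su", "çay", "cam", "iyi", "ılık", "zar", "göz", "gün"]

by_alphabet = sorted(words, key=turkish_key)
by_code_point = sorted(words)  # plain code-point order, for comparison

print(by_alphabet)
print(by_code_point)
```

Sorting by code point pushes ç, ı and ş to the end of the list, while the alphabet-based key keeps each of them next to its plain counterpart.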

However, the fact that the order of the primitive elements varies from
writing system to writing system, and that some other principles vary
as well, doesn't mean that there aren't more abstract generalizations
about ordering. For example, there is no writing system in which collation
is normally done from end to beginning.

In any case, with "dictionary" ordering, the observed variation
is not a function of locale. People working in English in the
United States have written sort routines in which "dictionary"
ordering means different things.
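As an illustration, here is a rough Python sketch of two conventions that each could reasonably be called "dictionary" order. The first ignores case and compares embedded numbers as integers, loosely in the spirit of what the Tcl documentation describes for lsort -dictionary (it is not a reimplementation of it, and it omits the case tie-breaker). The second is the letter-by-letter convention of many printed reference works, which ignores spaces and punctuation. Both function names are hypothetical.

```python
import re

def number_aware_key(s):
    # Split into alternating text and digit runs; re.split with a capturing
    # group always yields text at even indices and digits at odd indices,
    # so keys compare like with like.
    parts = re.split(r"(\d+)", s)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

def letter_by_letter_key(s):
    # Ignore case, spaces and punctuation; compare what is left
    # character by character.
    return "".join(ch for ch in s.lower() if ch.isalnum())

items = ["x10", "x9", "Newark", "New York"]

number_aware = sorted(items, key=number_aware_key)
letter_by_letter = sorted(items, key=letter_by_letter_key)

print(number_aware)      # ['New York', 'Newark', 'x9', 'x10']
print(letter_by_letter)  # ['Newark', 'New York', 'x10', 'x9']
```

The two keys disagree on both pairs in this list, with no change of locale involved.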

--
Bill Poser, Linguistics, University of Pennsylvania
http://www.ling.upenn.edu/~wjposer/ billposer@xxxxxxxxxxxx