Re: sorting slovak utf

From: Martin v. Löwis (martin_at_v.loewis.de)
Date: 12/08/03


Date: 08 Dec 2003 18:40:44 +0100

Stano Paska <paska@kios.sk> writes:

> import locale
> locale.setlocale(locale.LC_CTYPE, 'sk_SK.utf-8')
> and
> locale.setlocale(locale.LC_CTYPE, ('sk_SK', 'utf-8'))
> but i got "unsupported locale" error
>
> What I must do to get correct sorting result?

You don't need to operate in a UTF-8 locale. Instead, any Slovak
locale will do, provided your system supports locale.strcoll for Unicode
objects (try locale.strcoll(u"", u""); if the call succeeds, it does).

In this case, you can convert all strings to Unicode, and then collate
using locale.strcoll.
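
A minimal sketch of this first approach, assuming Python 3 (where every
str is already Unicode, so no conversion step is needed). The candidate
locale names and both helper functions are hypothetical; the names your
system accepts may differ. Note that it sets LC_COLLATE, the category
that governs strcoll, rather than LC_CTYPE as in the quoted attempt.

```python
import functools
import locale

# Hypothetical candidate names; locale names differ between platforms,
# so check what the system actually provides (e.g. `locale -a` on Unix).
CANDIDATE_LOCALES = ("sk_SK.UTF-8", "sk_SK.ISO8859-2", "sk_SK")


def set_slovak_collation(candidates=CANDIDATE_LOCALES):
    """Set LC_COLLATE to the first candidate the system accepts.

    Returns the accepted name, or None if none of them is available.
    """
    for name in candidates:
        try:
            locale.setlocale(locale.LC_COLLATE, name)
            return name
        except locale.Error:
            continue  # "unsupported locale setting": try the next name
    return None


def sort_with_locale(words):
    """Sort text strings with locale.strcoll under the current LC_COLLATE."""
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


if __name__ == "__main__":
    if set_slovak_collation() is None:
        print("no Slovak locale found; using the current collation")
    print(sort_with_locale(["dom", "čaj", "cukor", "chata"]))
```

The resulting order depends on the collation tables the system ships
for the locale that was accepted, so it is worth checking the output
against a word list whose correct Slovak order you already know.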

Alternatively, you could set the locale to any Slovak locale, and use
locale.getpreferredencoding() to find the locale's encoding. Then you
could convert all input strings to that encoding, and use
locale.strcoll to collate them as byte strings.
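
In Python 3, locale.strcoll accepts only text strings, so collating
byte strings as described here applies to the Python 2 of this post.
The sketch below is an adaptation, not the method above: it keeps the
locale.getpreferredencoding() step, but uses it only to check that
each word can be represented in the locale's encoding, and then sorts
the text with locale.strxfrm. The function name is hypothetical, and
the reported encoding follows the current LC_CTYPE setting.

```python
import locale


def sort_via_locale_encoding(words, encoding=None):
    """Sort words after checking that they fit the locale's encoding.

    `encoding` defaults to the encoding reported for the current locale.
    """
    if encoding is None:
        encoding = locale.getpreferredencoding(False)
    # Round-trip each word through the encoding; a character the locale
    # cannot represent raises UnicodeEncodeError here.
    checked = [w.encode(encoding).decode(encoding) for w in words]
    # strxfrm maps a string to a key whose ordinary comparison matches
    # strcoll under the current LC_COLLATE setting.
    return sorted(checked, key=locale.strxfrm)
```

Passing the encoding explicitly, for example "iso8859-2", makes the
check independent of how the process was started.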

Regards,
Martin


