Re: How to find number of characters in a unicode string?
- From: "Leo Kislov" <Leo.Kislov@xxxxxxxxx>
- Date: 10 Oct 2006 22:50:21 -0700
Lawrence D'Oliveiro wrote:
> In message <pan.2006.09.18.20.29.20.510034@xxxxxxx>, Marc 'BlackJack'
> Rintsch wrote:
>> In <20060918221814.08625ea2.randhol+valid_for_reply_from_news@xxxxxxx>,
>> Preben Randhol wrote:
>>> Is there a way to calculate in characters
>>> and not in bytes to represent the characters.
>> Decode the byte string and use `len()` on the unicode string.
> Hmmm, for some reason
> len(u"C\u0327")
> returns 2.
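
For what it's worth, the 2 is a count of code points, not of what a reader would call characters: the string is LATIN CAPITAL LETTER C followed by COMBINING CEDILLA. A minimal sketch, assuming Python 3 (where plain string literals are already Unicode) and only the standard `unicodedata` module. NFC normalization happens to collapse this particular pair into one precomposed code point, but that is not a general fix, since many combining sequences have no precomposed form:

```python
import unicodedata

# LATIN CAPITAL LETTER C followed by COMBINING CEDILLA: two code points.
s = "C\u0327"
print(len(s))

# NFC composes the pair into the single precomposed code point U+00C7.
composed = unicodedata.normalize("NFC", s)
print(len(composed))
```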
If Python ever provides this functionality, I guess it would be something like
u"C\u0327".width() == 1. But it's not clear when unicode.org will
provide recommended character widths for fixed-width fonts for *all*
characters. I recently stumbled upon the Tamil language, where for example
u'\u0b95\u0bcd', u'\u0b95\u0bbe', u'\u0b95\u0bca', u'\u0b95\u0bcc'
look like they have widths of 1, 2, 3 and 4 columns. To add insult to injury,
these 4 symbols are all considered *single* letter symbols :) If your
email reader is able to show them, here they are in all their glory:
க், கா, கொ, கௌ.
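
One rough way to count "letters" rather than code points in the examples above is to skip code points whose Unicode general category is a mark (Mn, Mc, Me). This is only a sketch, assuming Python 3; the helper name `count_base_characters` is made up for illustration. It is not full grapheme cluster segmentation, and it says nothing about how many columns a font will use to draw each letter:

```python
import unicodedata

def count_base_characters(text):
    """Count code points that are not combining marks (categories Mn, Mc, Me).

    Hypothetical helper: an approximation, not a grapheme cluster algorithm.
    """
    return sum(1 for ch in text if not unicodedata.category(ch).startswith("M"))

# The C + cedilla example and the four Tamil letters from this thread.
samples = ["C\u0327", "\u0b95\u0bcd", "\u0b95\u0bbe", "\u0b95\u0bca", "\u0b95\u0bcc"]

for s in samples:
    # len() counts code points; the helper counts only the base characters.
    print(len(s), count_base_characters(s))
```

Each sample has two code points but a single base character, which matches the "single letter" reading, while the on-screen width still depends entirely on the font and renderer.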