Re: split() and string.whitespace



On Nov 4, 8:00 pm, bearophileH...@xxxxxxxxx wrote:
MRAB:

It's interesting, if you think about it, that here we have someone who
wants to split on a set of characters but 'split' splits on a string,
and others sometimes want to strip off a string but 'strip' strips on
a set of characters (passed as a string).

That can be seen as a little inconsistency in the language. But with
some practice you learn it.
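
For anyone who has not run into this yet, here is a small sketch of the difference. The sample strings and variable names are made up for illustration; only str.split, str.strip, str.startswith and re.split are real calls.

```python
import re

text = "a, b,c"

# str.split treats its argument as one whole separator string.
by_string = text.split(", ")          # ['a', 'b,c']

# str.strip treats its argument as a set of characters to remove
# from both ends, in any order.
by_chars = "xyhelloyx".strip("yx")    # 'hello'

# Splitting on any character from a set needs something else,
# for example a regex character class built from the characters.
seps = ", "
by_set = re.split("[" + re.escape(seps) + "]+", text)   # ['a', 'b', 'c']

# Stripping a whole leading string also needs something else.
prefix = "ab"
s = "ababc"
without_prefix = s[len(prefix):] if s.startswith(prefix) else s   # 'abc'
as_char_set = s.strip(prefix)                                     # 'c'
```

The last two lines show the usual surprise: strip("ab") removes every leading 'a' and 'b', not one occurrence of the string "ab".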

You could imagine that if
Python had had (character) sets from the start then 'split' and
'strip' could have accepted a string or a set depending on whether you
wanted to split on or strip off a string or a set.

Too bad you didn't suggest this when they were designing Python
3 :-)
This may be suggested for Python 3.1.
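
To make the string-or-set idea concrete, here is a rough sketch written as two helper functions. Both split_on and strip_off are hypothetical names invented for this post, not anything in Python, and the choice to remove only one leading and one trailing occurrence when stripping a string is my own assumption.

```python
import re


def split_on(text, sep):
    """Hypothetical helper: a str splits on the whole string,
    a set splits on any single character in it."""
    if isinstance(sep, (set, frozenset)):
        if not sep:
            raise ValueError("empty separator set")
        # Assumes every member of the set is a single character.
        pattern = "[" + "".join(re.escape(c) for c in sep) + "]"
        return re.split(pattern, text)
    return text.split(sep)


def strip_off(text, what):
    """Hypothetical helper: a set strips characters (like str.strip),
    a str strips one leading and one trailing occurrence of that string."""
    if isinstance(what, (set, frozenset)):
        return text.strip("".join(what))
    if what and text.startswith(what):
        text = text[len(what):]
    if what and text.endswith(what):
        text = text[:-len(what)]
    return text


print(split_on("a,b;c", {",", ";"}))         # split on either character
print(split_on("a,b;c", ",;"))               # split on the string ",;"
print(strip_off("xyhelloyx", {"x", "y"}))    # strip a set of characters
print(strip_off("xyhelloxy", "xy"))          # strip the string "xy"
```

The type of the argument picks the behaviour, which is the point of the suggestion: the caller states whether a string or a set of characters is meant.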

I might also add that str.startswith can accept a tuple of strings;
shouldn't that have been a set? :-)
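
A quick illustration of the startswith behaviour, with a made-up file name and prefixes. A tuple is accepted, a set is rejected with TypeError, and converting the set to a tuple is the usual workaround.

```python
name = "report.csv"

# A tuple of candidate prefixes is accepted.
matches_tuple = name.startswith(("rep", "sum"))

# A set of prefixes is not accepted, even though order is irrelevant here.
prefix_set = {"rep", "sum"}
try:
    name.startswith(prefix_set)
    set_accepted = True
except TypeError:
    set_accepted = False

# Workaround: convert the set to a tuple at the call site.
matches_converted = name.startswith(tuple(prefix_set))
```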

I also had the thought that the backtick (`), which is not used in
Python 3, could be used to form character set literals (`aeiou` =>
set("aeiou")), although that might only be worthwhile if character
sets were introduced as a specialised form of set.