Re: Quote-aware string splitting



> "J. W. McCall" <jmccall@xxxxxxxxxxxxxx> writes:
> >
> > I need to split a string as per string.split(), but with a
> > modification: I want it to recognize quoted strings and return them
> > as one list item, regardless of any whitespace within the quoted
> > string.
> >
> > For example, given the string:
> >
> > 'spam "the life of brian" 42'
> >
> > I'd want it to return:
> >
> > ['spam', 'the life of brian', '42']
> >
> > I see no standard library function to do this, so what would be the
> > most simple way to achieve this? This should be simple, but I must
> > be tired as I'm not currently able to think of an elegant way to do
> > this.
> >
> > Any ideas?
>
> How about the csv module? It seems like it might be overkill, but it
> does already handle that sort of quoting
>
> >>> import csv
> >>> csv.reader(['spam "the life of brian" 42'], delimiter=' ').next()
> ['spam', 'the life of brian', '42']
>
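
For anyone trying the csv suggestion on a newer Python, here is a minimal sketch of the same call using the built-in next() instead of the .next() method. The helper name split_with_csv is made up for illustration; it is not part of the csv module.

```python
import csv

def split_with_csv(text):
    # Hypothetical helper: treat the string as a one-line "file" whose
    # delimiter is a single space, and return the first (only) row.
    return next(csv.reader([text], delimiter=' '))

print(split_with_csv('spam "the life of brian" 42'))
```

One thing to watch: every single space counts as a delimiter, so a run of two spaces yields an empty field in the result.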


I don't know if this is as good as CSV's splitter, but it works
reasonably well for me:

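import re

# The quoted alternatives are tried first, so whitespace inside quotes
# does not end a token. Matched tokens keep their surrounding quotes.
regex = re.compile(r'''
    '.*?' |   # single quoted substring
    ".*?" |   # double quoted substring
    \S+       # all the rest: a run of non-whitespace
    ''', re.VERBOSE)

# print(...) with a single argument works on both Python 2 and 3.
print(regex.findall('''
    This is 'single "quoted" string'
    followed by a "double 'quoted' string"
    '''))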
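
The pattern above returns quoted items with their quotes still attached, while the original question asked for 'the life of brian' without them. A rough sketch of one way to drop the quotes is to put a capturing group around each alternative and keep whichever group matched. The names TOKEN and split_quoted are mine, and this variant makes no attempt to handle escaped quotes inside a quoted string.

```python
import re

TOKEN = re.compile(r'''
    '(.*?)' |   # group 1: body of a single quoted substring
    "(.*?)" |   # group 2: body of a double quoted substring
    (\S+)       # group 3: any other run of non-whitespace
    ''', re.VERBOSE)

def split_quoted(text):
    tokens = []
    for match in TOKEN.finditer(text):
        # Exactly one group takes part in each match; the others are None.
        tokens.append(next(g for g in match.groups() if g is not None))
    return tokens

print(split_quoted('spam "the life of brian" 42'))
```

Comparing against None rather than testing truthiness means an empty quoted string ("") still comes back as an empty list item.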

George
