Re: re sub help



On 4 Nov 2005 22:49:03 -0800, s99999999s2003@xxxxxxxxx wrote:

>hi
>
>i have a string :
>a =
>"this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>
>inside the string, there are "\n". I don't want to substitute the '\n'
>in between
>the [startdelim] and [enddelim] to ''. I only want to get rid of the
>'\n' everywhere else.
>
>i have read the tutorial and came across negative/positive lookahead
>and i think it can solve the problem.but am confused on how to use it.
>anyone can give me some advice? or is there better way other than
>lookaheads ...thanks..
>

Sometimes splitting and processing the pieces selectively can be a solution. E.g.,
if the delimiters are properly paired, splitting on them (with parens in the pattern
so the matched delimiters are kept) should give you a list that repeats with period 4:
<"everywhere else" as you said><first delim><between><second delim> ...

>>> a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
>>> import re
>>> splitter = re.compile(r'(?s)(\[startdelim\]|\[enddelim\])')
>>> sp = splitter.split(a)
>>> sp
['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n']
>>> ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
'thisisasentence[startdelim]this\nis\nanother[enddelim]thisis'
>>> print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
thisisasentence[startdelim]this
is
another[enddelim]thisis

I haven't checked for corner cases, but HTH
Maybe I'll try two pairs of delimiters:

>>> a += "2222\n33\n4\n55555555[startdelim]6666\n77\n8888888[enddelim]9999\n00\n"
>>> sp = splitter.split(a)
>>> print ''.join([(lambda s:s, lambda s:s.replace('\n',''))[not i%4](s) for i,s in enumerate(sp)])
thisisasentence[startdelim]this
is
another[enddelim]thisis222233455555555[startdelim]6666
77
8888888[enddelim]999900

which came from
>>> sp
['this\nis\na\nsentence', '[startdelim]', 'this\nis\nanother', '[enddelim]', 'this\nis\n2222\n33\n4\n55555555', '[startdelim]', '6666\n77\n8888888', '[enddelim]', '9999\n00\n']

The replacing was done only on the pieces where not i%4 was True:

>>> for i,s in enumerate(sp): print '%6s: %r'%(not i%4,s)
...
  True: 'this\nis\na\nsentence'
 False: '[startdelim]'
 False: 'this\nis\nanother'
 False: '[enddelim]'
  True: 'this\nis\n2222\n33\n4\n55555555'
 False: '[startdelim]'
 False: '6666\n77\n8888888'
 False: '[enddelim]'
  True: '9999\n00\n'
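
For anyone reading this on Python 3, the same split-and-rejoin idea can be sketched as a small function. This is only a sketch under the same assumption as above (properly paired, non-nested delimiters). The name strip_newlines_outside is made up for illustration, and a conditional expression stands in for the tuple-of-lambdas indexing:

```python
import re

# Parens around the alternation make split() keep the delimiters in the result.
DELIM = re.compile(r'(\[startdelim\]|\[enddelim\])')

def strip_newlines_outside(text):
    # Hypothetical helper: removes newlines everywhere except between
    # [startdelim] and [enddelim]. Assumes the delimiters are properly paired.
    pieces = DELIM.split(text)
    # The pieces repeat as: outside, start delim, between, end delim, ...
    # so only indices divisible by 4 hold "everywhere else" text.
    return ''.join(
        piece.replace('\n', '') if i % 4 == 0 else piece
        for i, piece in enumerate(pieces)
    )

a = "this\nis\na\nsentence[startdelim]this\nis\nanother[enddelim]this\nis\n"
result = strip_newlines_outside(a)
print(repr(result))
```

A string that begins with a delimiter still fits the pattern, since split() then returns an empty string as the first piece. Unpaired or nested delimiters would throw the modulo-4 count off, so they would need separate handling.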

Regards,
Bengt Richter