Re: reusing parts of a string in RE matches?



John Salerno wrote:
So my question is, how can I find all occurrences of a pattern in a
string, including overlapping matches? I figure it has something to do
with look-ahead and look-behind, but I've only gotten this far:

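import re
string = 'abababababababab'
pattern = re.compile(r'ab(?=a)')  # 'ab' only where an 'a' follows; the 'a' is not consumed
m = pattern.findall(string)       # seven 'ab' strings, each without the trailing 'a'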

This matches every 'ab' that is followed by an 'a', but the match doesn't
include the 'a'. What I'd like to do is find all the 'aba' matches. A regular
findall() for 'aba' gives four results, but counting overlaps there are seven.

Is there a way to do this with just an RE pattern, or would I have to
manually add the 'a' to the end of the matches?

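Yes, and no extra for loops are needed! You can define groups inside
the lookahead assertion. The lookahead itself consumes no characters, so
the search can match at every position, and findall() returns the
contents of the group: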

>>> import re
>>> re.findall(r'(?=(aba))', 'abababababababab')
['aba', 'aba', 'aba', 'aba', 'aba', 'aba', 'aba']
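
If you also want to know where each overlapping match starts, the same
trick works with finditer(). This is only a sketch: the helper name
overlapping_matches is made up for illustration, and it assumes the
pattern you pass in is a valid regular expression on its own.

```python
import re

def overlapping_matches(pattern, text):
    """Return (start, matched_text) pairs for every match, overlaps included.

    Hypothetical helper, not part of the re module.
    """
    # Wrap the pattern in a capturing group inside a lookahead. The
    # lookahead consumes nothing, so the scan moves on one character at
    # a time and can find matches that overlap.
    wrapped = re.compile(r'(?=(' + pattern + r'))')
    # group(1) is the outer group added above, even if the pattern
    # contains groups of its own.
    return [(m.start(), m.group(1)) for m in wrapped.finditer(text)]

result = overlapping_matches('aba', 'abababababababab')
print(result)
```

For the string above, this gives seven pairs with start offsets
0, 2, 4, 6, 8, 10 and 12.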

--Ben
