Re: Regex repeating capture



On Jan 30, 12:13 pm, "Jay" <JaythePC...@xxxxxxxxx> wrote:
I'm trying to break an input string into multiple pieces using a
series of delimiters that start with an asterisk. Following the
asterisk is a multiple-character identifier immediately followed by a
data string of variable length. The input string may contain more than
one identifier anywhere in the string. In all, there are 50+
identifiers to search for and the asterisk is allowed to be part of the
data string as long as it isn't defined as an identifier (it would be
treated as another identifier at that point).

Here is a simple example:
*CZ1 2.3 4-56 *fuuuS24364 08 23 72

I'd like to break this into
CZ
1 2.3 4-56
fuuu
S24364 08 23 72

So CZ and fuuu are your delimiters, but only if preceded by an
asterisk, and you want those delimiters to also be in your results?

I have tried the pattern (?:\*(CZ|fuuu)(.*)),

What does that mean? How did you try it? In a list-context pattern
match? In a split? In a scalar-context pattern match with the /g
option? Please show your actual code, not a tiny piece of it.

which produces the following output:
CZ
1 2.3 4-56 *fuuuS24364 08 23 72

How can I force it to repeat the capturing?

Without knowing what you actually did, there's no way to tell you how
to modify it. I will say that the following seems to produce the
results you were looking for, for the data you gave:

perl -le'
my @fields = split /(\*(?:CZ|fuuu))/, q{*CZ1 2.3 4-56 *fuuuS24364 08 23 72};
s/^\*// for @fields;
print for grep { length } @fields;
'
CZ
1 2.3 4-56
fuuu
S24364 08 23 72
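
If you would rather keep a capturing match than switch to split, one
possibility is a list-context match with /g, with the data capture made
non-greedy and stopped by a lookahead at the next identifier or at the
end of the string. This is only a sketch: it guesses that your attempt
stopped after one pair because the greedy (.*) swallowed everything
after the first identifier, and the two alternatives in $ids are a
stand-in for your real list of 50+ identifiers.

```perl
use strict;
use warnings;

my $input = '*CZ1 2.3 4-56 *fuuuS24364 08 23 72';

# Stand-in for the real identifier list (only the two from the example).
my $ids = qr/CZ|fuuu/;

# (.*?) is non-greedy; the lookahead stops it just before the next
# "*identifier" (skipping any whitespace in front of it) or at the end
# of the string. An asterisk that is not followed by a known identifier
# simply stays in the data.
my @fields = $input =~ /\*($ids)(.*?)(?=\s*\*(?:$ids)|\z)/gs;

print "$_\n" for @fields;
```

For the sample line this prints the same four fields as the split
version. If one identifier is a prefix of another, put the longer one
first in the alternation, since Perl takes the first alternative that
matches.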

perldoc -f split
perldoc -f grep
perldoc perlretut

Paul Lalli
