Re: Help with split using multiple delimiters



In article <1122488128.477319.145360@xxxxxxxxxxxxxxxxxxxxxxxxxxxx>,
<simon.chao@xxxxxxx> wrote:
>
> don't use split--use a regex.
>
> ($a, $b, $c, $d, $e) = $string =~
> /(\S+)\s+(\S+)\s+(\S+)\s+"(.+)"\s+(\S+)/;

If you don't know in advance which fields will be quoted,
you can use this regex instead:

my ($a, $b, $c, $d, $e) = $string =~ /("[^"]*"|\S+)/g;
# but then you need to remove any quotes by saying:
s/^"([^"]*)"$/$1/ foreach $a, $b, $c, $d, $e;
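
As a minimal sketch of the two steps above, here is the same pair of statements run on a made-up input line. The sample string and the names $f1 to $f5 are assumptions for illustration; they are used instead of $a and $b only because those two names are also the variables sort uses.

```perl
use strict;
use warnings;

# Hypothetical sample line; the quoted field may appear in any position.
my $string = 'one "two and a bit" three four five';

# In list context with /g and a single capture group, the match returns
# every captured token: a whole "quoted run" or a run of non-whitespace.
my ($f1, $f2, $f3, $f4, $f5) = $string =~ /("[^"]*"|\S+)/g;

# foreach aliases $_ to each variable in turn, so the substitution
# strips the enclosing quotes from the variable itself.
s/^"([^"]*)"$/$1/ foreach $f1, $f2, $f3, $f4, $f5;

print join('|', $f1, $f2, $f3, $f4, $f5), "\n";
# prints: one|two and a bit|three|four|five
```

Note that this form assumes exactly five fields; any extra tokens are silently discarded by the list assignment.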

If you don't mind the fields all going in one array, you
could do it all in one go like this:

my @fields;
push @fields, $+ while $string =~ /"([^"]*)"|(\S+)/g;

Of course, nothing stops you then assigning the @fields
array to individual scalar variables:

my ($a, $b, $c, $d, $e) = @fields;
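
For anyone who wants to try the one-line loop, a small self-contained sketch follows. The wrapper name split_fields and the sample input are hypothetical, and it assumes quotes are never escaped or nested inside a field.

```perl
use strict;
use warnings;

# Split a line into fields, treating "double-quoted text" as one field.
# Assumption: quotes are never escaped or nested inside a field.
sub split_fields {
    my ($string) = @_;
    my @fields;
    # Only one of the two groups takes part in each match; $+ holds the
    # text captured by that group, so quoted fields arrive without quotes.
    push @fields, $+ while $string =~ /"([^"]*)"|(\S+)/g;
    return @fields;
}

my @fields = split_fields('one two "three and more" four five');
print scalar(@fields), " fields: ", join('|', @fields), "\n";
# prints: 5 fields: one|two|three and more|four|five
```

Because the results land in an array, this version copes with any number of fields, and no separate quote-stripping pass is needed.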

If a single-line while loop with a fairly simple regex seems too
easy or too efficient, you can always spend time reading up on
the various CPAN modules suggested by the FAQ (perldoc -q split),
work out how to set up the necessary object instances, how
to call the provided methods to get the result you require,
test that it does what you expect, pray that there are no
earlier versions of the module around that are buggy, pray
that no future versions will be buggy, load the whole module
at compile time and hope that this and the method call interface
don't hit performance too much, and then sit back and enjoy
the somewhat dubious pleasures of OPC (Other People's Code)
in the knowledge that at least you didn't have to do the
work yourself. (Irony intended.)

Even if you wanted to use a module, I note that the FAQ
entry "How can I split a [character] delimited string except
when inside [character]?" recommends the use of Text::CVS or
Text::CVS_XS (presumably meaning Text::CSV and Text::CSV_XS),
but I don't believe CVS is what's needed here. :-)

--
James Taylor, London, UK PGP key: 3FBE1BF9
To protect against spam, the address in the "From:" header is not valid.
In any case, you should reply to the group so that everyone can benefit.
If you must send me a private email, use james at oakseed demon co uk.
