Re: "negative" regexp
- From: Uri Guttman <uri@xxxxxxxxxxxxxxx>
- Date: Thu, 31 Jan 2008 16:11:33 GMT
"PV" == Petr Vileta <stoupa@xxxxxxxxxxxxx> writes:
PV> No, I mean it is not ideal for universal use. I have a concrete goal and I
PV> use as few resources as possible. For example, if I want to extract
PV> clickable email addresses from HTML source, I need to extract all
PV> /href=['"]*mailto:\s*(.+?)['"\s>/
PV> only.
besides the typo (no close ] on the right), that wouldn't always
work. it allows for an open ' and a closing " which is wrong. it doesn't
handle html comments which shouldn't be parsed for email
addresses. there are other problems with it that i can't get into. so
even a 'simple' thing like that is much harder to extract with a regex
than you think. use a module designed and tested to parse html and email
addresses. it is actually simpler coding from your point of view and
correct as well! and correct beats efficient every day.
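
As a rough illustration of the module-based approach recommended above, here is a minimal sketch using HTML::Parser's version 3 API. The helper name extract_mailto and the sample HTML are made up for this example and do not come from the thread. It only looks at href attributes of <a> tags and makes no attempt to validate the addresses themselves.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper (not from the thread): return the addresses found in
# href="mailto:..." attributes of <a> start tags.
sub extract_mailto {
    my ($html) = @_;
    my @found;

    my $parser = HTML::Parser->new(
        api_version => 3,
        # Called once per start tag. Comments are reported through a
        # separate handler, so markup inside <!-- ... --> is not seen here.
        start_h => [
            sub {
                my ($tag, $attr) = @_;
                return unless $tag eq 'a';
                my $href = $attr->{href};
                return unless defined $href;
                # Keep the address part, drop any ?subject=... query.
                push @found, $1 if $href =~ /^\s*mailto:\s*([^?\s]+)/i;
            },
            'tagname, attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;
    return @found;
}

# Sample input (an assumption for this sketch): one double-quoted link,
# one link hidden inside a comment, one single-quoted link with a query.
my $sample = <<'HTML';
<a href="mailto:alice@example.com">Alice</a>
<!-- <a href="mailto:hidden@example.com">old</a> -->
<a href='mailto: bob@example.com?subject=hi'>Bob</a>
HTML

print "$_\n" for extract_mailto($sample);
```

The parser takes care of quote matching and comment skipping, which are the two cases the regex above gets wrong. The address inside the comment should not show up in the output.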
uri
PV> HTML::Parser and WWW::Mechanize are good modules but in many cases these
PV> are "too big a gun" :-)
better a big accurate gun than a tiny pistol with no accuracy. you might
even shoot your eye out!
uri
--
Uri Guttman ------ uri@xxxxxxxxxxxxxxx -------- http://www.sysarch.com --
----- Perl Architecture, Development, Training, Support, Code Review ------
----------- Search or Offer Perl Jobs ----- http://jobs.perl.org ---------
--------- Gourmet Hot Cocoa Mix ---- http://bestfriendscocoa.com ---------