Re: Regular expression



From: "Manasi Bopardikar" <manasi_bopardikar@xxxxxxxxxxxxxxxx>
Does anyone know a regex for: <option value="ACE ">Athletic Coaching Education
(ACE )</option>

If it's HTML, use HTML::Parser or another HTML parsing module; if
it's XML, use XML::Rules, XML::Twig, XML::LibXML or one of the dozens
of other XML parsing modules. But do NOT attempt to parse either HTML
or XML with regexps. It's fragile and much more complex than it
looks at first glance.
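
To make the parser approach concrete, here is a minimal sketch of pulling the value and label out of <option> elements. It is written in Python with the standard-library html.parser, not with the Perl modules named above, purely so it runs without installing anything; HTML::Parser works on the same event-handler idea (start tag, text, end tag). The names OptionCollector and extract_options are made up for this illustration.

```python
from html.parser import HTMLParser


class OptionCollector(HTMLParser):
    """Collects (value, label) pairs from <option> elements."""

    def __init__(self):
        super().__init__()
        self.options = []    # finished (value, label) pairs
        self._value = None   # value attribute of the currently open <option>
        self._text = []      # text chunks seen inside it
        self._open = False   # True while inside an <option>

    def handle_starttag(self, tag, attrs):
        if tag == "option":
            # </option> is optional in HTML, so a new <option>
            # implicitly ends the previous one.
            self._finish()
            self._open = True
            self._value = dict(attrs).get("value")
            self._text = []

    def handle_data(self, data):
        if self._open:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag in ("option", "select"):
            self._finish()

    def _finish(self):
        if self._open:
            # Collapse line breaks and runs of spaces inside the label.
            label = " ".join("".join(self._text).split())
            self.options.append((self._value, label))
            self._open = False


def extract_options(html):
    """Return a list of (value, label) tuples for every <option> in html."""
    parser = OptionCollector()
    parser.feed(html)
    parser.close()     # flush any buffered text
    parser._finish()   # handle an <option> left open at end of input
    return parser.options


if __name__ == "__main__":
    sample = '<option value="ACE ">Athletic Coaching Education\n(ACE )</option>'
    print(extract_options(sample))
    # prints [('ACE ', 'Athletic Coaching Education (ACE )')]
```

Note that the line break inside the label and the missing </option> case are both handled without any special effort, which is exactly where a hand-written regex tends to break.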

Jenda
===== Jenda@xxxxxxxxxxx === http://Jenda.Krynicky.cz =====
When it comes to wine, women and song, wizards are allowed
to get drunk and croon as much as they like.
-- Terry Pratchett in Sourcery
