Re: Regular expression to find <tr> tags in 2nd level HTML tables
From: Tad McClellan (tadmc_at_augustmail.com)
Date: 01/23/04
- Next message: Walter Roberson: "Re: pattern matching and grabing value sub"
- Previous message: Tad McClellan: "Re: HELP: from XML to mySQL"
- In reply to: Shannon Jacobs: "Re: Regular expression to find <tr> tags in 2nd level HTML tables"
- Next in thread: John W. Kennedy: "Re: Regular expression to find <tr> tags in 2nd level HTML tables"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Thu, 22 Jan 2004 21:26:57 -0600
[
Non-existent newsgroup removed again.
comp.lang.javascript removed, no JavaScript content.
Followups set.
]
[ I couldn't figure out how to repair the top-posting, so
I just snipped the full-quote at the end.
]
Shannon Jacobs <shanen@my-deja.com> wrote:
[ context: comp.lang.perl was rmgroup'd long ago ]
> I'm so sorry to hear that the Google Groups system has been "neglected
> for many years", as you put it so thoughtfully.
Usenet was administered successfully for 15 years before
DejaNews (now Google Groups) came along.
Google is not a part of Usenet.
Google does not administer Usenet.
Google is an archive of everything that appears on Usenet, without
regard to whether it is _supposed_ to appear on Usenet or not.
There is a protocol (the "P" in "NNTP") for administering Usenet.
There was a proper protocol message removing the comp.lang.perl
newsgroup many years ago when the other Perl newsgroups were formed.
It appears that you do not have enough experience with Usenet's
operation and history to be acting in an authoritative role.
But NONE of that is of much importance when compared to this:
The clued do not read comp.lang.perl
Why would you want to ask questions where there are few people with clues?
Post your Perl question to comp.lang.perl and 20 newbies will read it.
Post it to comp.lang.perl.(misc|moderated) and hundreds of people who have used
Perl for years will read it.
You choose who you want answers from...
> It really is
> unfortunate that so many people regard Google as a useful information
> resource, isn't it?
I did not say anything at all about Google, so I don't know
what you are going on about there...
> Incidentally, when I finally had a bit of free time this morning, I
> rethought the technical problem and did come up with a trivial
> regex-based solution. It did exactly what I required on the first
> attempt, confirming that the technical problem was pretty much as
> trivial as I had thought it was.
Working once with one set of data is not what most people
would term "confirmation".
Perhaps you have just not tried it with a data set that exposes
its weakness (if any)?
> Why did the newsgroups fail to produce the technically trivial answer?
Because there was no technically trivial answer.
> I asked a simple technical question,
> and wound up being dragged into a religious war about proper ways to
> handle HTML.
It was not a religious war. It was a mathematical war.
> Not very useful.
Mathematics (Set Theory) is very useful for parsing grammars.
> I still regard regular expressions as useful and worthy of further
> study.
Me too. I did not say that they were not.
Regexes are great for tokenizing, but hopeless for the real parsing.
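To make that division of labor concrete, here is a minimal sketch in Python. It is an illustration only, not the regex from the original question: the names TAG and second_level_tr_offsets and the sample markup are made up, and it assumes tidy markup (balanced table tags, no comments, no '>' inside attribute values). The regex only picks out tags; a plain counter does the nesting, which is the part the regex cannot do.

```python
import re

# Tokenizer: the regex only recognizes <table>/<tr> open and close tags.
# Assumption: tidy markup, no comments, no '>' inside attribute values.
TAG = re.compile(r"<(/?)(table|tr)\b[^>]*>", re.IGNORECASE)


def second_level_tr_offsets(html):
    """Return start offsets of <tr> tags that sit inside a table nested
    exactly one level deep (table depth 2)."""
    depth = 0          # the "parser" state: how many tables are open
    offsets = []
    for m in TAG.finditer(html):
        closing = m.group(1) == "/"
        name = m.group(2).lower()
        if name == "table":
            depth += -1 if closing else 1
        elif not closing and depth == 2:
            offsets.append(m.start())
    return offsets


sample = (
    "<table><tr><td>"
    "<table><tr><td>inner</td></tr></table>"
    "</td></tr></table>"
)
print(second_level_tr_offsets(sample))
```

For real-world HTML a proper HTML parser is still the safer choice; the counter here merely stands in for the state a regex alone does not have.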
> I cannot say the same thing about most of the people who
> responded so religiously to my trivial question.
Your question was not as trivial as you seem to think.
(that was the point of many of the followups.)
It can be proven (in the mathematical sense) that a Regular
Expression cannot parse a Context Free Grammar (which HTML is).[1]
You can do it in special highly-restricted cases, but not in
the general case. It is impossible.
[1] but Perl's regular expressions are no longer mathematical
Regular Expressions, they've mutated.
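
As a small illustration of the kind of data that exposes the weakness, here is a sketch in Python (whose standard re module has no recursive patterns). The pattern NAIVE_TABLE and both sample strings are hypothetical, chosen only to show the effect: a pattern that reads "a table runs up to the next </table>" works on a flat table and silently stops early on a nested one.

```python
import re

# Hypothetical naive pattern: a table is everything up to the next </table>.
NAIVE_TABLE = re.compile(r"<table>.*?</table>", re.DOTALL)

flat = "<table><tr><td>a</td></tr></table>"
nested = (
    "<table><tr><td>"
    "<table><tr><td>b</td></tr></table>"
    "</td></tr></table>"
)


def naive_match_is_whole(html):
    """True if the naive pattern, anchored at the start, consumes all of html."""
    m = NAIVE_TABLE.match(html)
    return m is not None and m.end() == len(html)


print(naive_match_is_whole(flat))    # True: one level, the pattern works
print(naive_match_is_whole(nested))  # False: stops at the inner </table>
```

Making the quantifier greedy fixes this particular input but then swallows two sibling tables as one, so the pattern can be tuned for one restricted case at a time, not for nesting in general.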
--
Tad McClellan SGML consulting
tadmc@augustmail.com Perl programming
Fort Worth, Texas