Re: Regex help




"Jerry Stuckle" <jstucklex@xxxxxxxxxxxxx> wrote in message
news:K-qdnTSkY4NaoI7anZ2dnUVZ_j6dnZ2d@xxxxxxxxxxxxxx
Steve wrote:
"Jerry Stuckle" <jstucklex@xxxxxxxxxxxxx> wrote in message
news:KaadnQnnGt0WT4_anZ2dnUVZ_tajnZ2d@xxxxxxxxxxxxxx
OK, I give up here. I am DEFINITELY not a Regex expert, and have been
working on this for hours with no luck.

Basically I need to parse a page for certain information which will be
fed back into CURL to post to a site. I need to find four types of tags
on the page:

<input type=hidden name=a1 value=b1>
<input type=text name=a2>
<input type=submit name=a3 value=b3>
<select name=a4>

I don't need any other tags.

From the hidden and submit types, I need name and value. From the text
and select types, I just need the name.

I can assume the attributes will always show up in this order, but there
may be other things between the < and > delimiters. Additionally, the
actual type and name may have single or double quotes around them, or
neither.

Does anyone have some code for this? It doesn't have to be all one
regex.

alright, jer. let's see what we can do...

here's an eyeballed attempt:

<(select\s?[^>].*?)|(input\s[^t]*?type\s*?=\s?('|"|\s)(hidden|text|submit)\3[^>].*?)>

to keep it easier, i'd think about using that to get your general
matches. iterating through those, i'd apply another regex to break out
the name, type, and value. you could very well catch it all in the above,
however, it's not as straightforward and hence, not easily maintained. if
you need additional help on writing this, let me know. i'll pseudo-code
the whole enchilada if you want. this should be sufficient in getting
only those tags you listed above...which is a good start.

btw, make the search case-insensitive.
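
A minimal sketch of the two-pass idea described above: one pattern grabs the whole tag, a second breaks out the attributes. It is written in Python purely for illustration (the thread itself is about PHP), and the names `TAG_RE`, `ATTR_RE` and `extract_fields` are hypothetical, not anything posted in the thread. It assumes no attribute value contains a literal `>`.

```python
import re

# Pass 1: grab whole <input ...> and <select ...> tags, case-insensitively.
TAG_RE = re.compile(r"<(input|select)\b([^>]*)>", re.IGNORECASE)

# Pass 2: pull attr=value pairs out of the tag body. The value may be
# double-quoted, single-quoted, or bare (unquoted).
ATTR_RE = re.compile(
    r"""([a-z][\w-]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))""",
    re.IGNORECASE,
)

WANTED_INPUT_TYPES = {"hidden", "text", "submit"}


def extract_fields(html):
    """Return one dict per wanted tag (hypothetical helper, not from the thread)."""
    fields = []
    for tag, body in TAG_RE.findall(html):
        # Only one of the three value groups is non-empty for each attribute.
        attrs = {
            name.lower(): dq or sq or bare
            for name, dq, sq, bare in ATTR_RE.findall(body)
        }
        if tag.lower() == "select":
            fields.append({"tag": "select", "name": attrs.get("name")})
            continue
        kind = attrs.get("type", "").lower()
        if kind not in WANTED_INPUT_TYPES:
            continue  # e.g. checkbox or radio: not one of the requested types
        entry = {"tag": "input", "type": kind, "name": attrs.get("name")}
        if kind in ("hidden", "submit"):
            entry["value"] = attrs.get("value")  # text inputs only need the name
        fields.append(entry)
    return fields
```

Because the second pass reads attributes by name, this sketch does not rely on them appearing in a fixed order, and extra attributes inside the tag are simply ignored.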

Hi, Steve,

Yep, it's a start. Some problems (output below), but I think it will get
me a little farther.

And you're right, I already gave up on getting everything in one pass. I
was thinking of trying to just get everything for a single element type
(i.e. all <input type=text ...> elements), but this gives me another idea,
also.

And the output from the first try:

Array
(
    [0] => Array
        (
            [0] => <select n
            [1] => <select n
            [2] => <select n
        )

    [1] => Array
        (
            [0] => select n
            [1] => select n
            [2] => select n
        )

    [2] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )

    [3] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )

    [4] => Array
        (
            [0] =>
            [1] =>
            [2] =>
        )

)
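
The truncated `<select n` matches follow from how the posted pattern is grouped. The `|` sits at the top level, so the leading `<` belongs only to the select branch and the trailing `>` only to the input branch. In the select branch, `[^>]` consumes exactly one character and the lazy `.*?` has nothing after it forcing it to grow, so it matches nothing. The input branch also demands a quote or whitespace right after `=`, so an unquoted `type=hidden` never matches. The sketch below reproduces this in Python (used here only for illustration) and shows one possible regrouping; `REGROUPED` is an assumption for demonstration, not the fix posted later in the thread.

```python
import re

# The pattern exactly as posted in the thread.
ORIGINAL = re.compile(
    r"""<(select\s?[^>].*?)|(input\s[^t]*?type\s*?=\s?('|"|\s)(hidden|text|submit)\3[^>].*?)>""",
    re.IGNORECASE,
)

# One possible regrouping (hypothetical): < and > now wrap both alternatives,
# [^>]* replaces the lazy .*?, and the quote around the type is optional.
REGROUPED = re.compile(
    r"""<(select\b[^>]*|input\b[^>]*?\btype\s*=\s*(['"]?)(?:hidden|text|submit)\2(?=[\s>])[^>]*)>""",
    re.IGNORECASE,
)

sample = "<select name=a4><input type=hidden name=a1 value=b1>"

original_hits = [m.group(0) for m in ORIGINAL.finditer(sample)]
regrouped_hits = [m.group(0) for m in REGROUPED.finditer(sample)]

print(original_hits)   # ['<select n']
print(regrouped_hits)  # ['<select name=a4>', '<input type=hidden name=a1 value=b1>']
```

The regrouped pattern still only isolates whole tags; breaking out name and value is left to a second pass.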

well, that's not so good a start! i'll break out the old regex ide and fix
that...if you want.

