Re: HTTP Filtering and Threads...




Quoth Dan <danett18@xxxxxxxxxxxx>:
1) I have some Perl code which does an HTTP request, gets a response
and saves it in a variable, and I want to extract the value of one
specific field. My code is more or less like this:

next unless /^<input name/i;

You are trying to parse HTML with regular expressions. This is a very
bad idea. I would strongly recommend using HTML::Parser, or another
module capable of actually parsing HTML.

my ($name, $value) = $_ =~ /input name="(.*)Name" type=.*
value="(.*)">/i;

This will fail because the * regex operator is 'greedy': it always takes
as much text as it can. This is why you are always getting the last
value in your example below: the first .* matches everything from the
first 'name="' all the way to the middle of the last <input> tag.
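
To see the greediness in isolation, here is a minimal sketch. The sample string is made up and assumes two tags sit on the same line; it compares the greedy .* with the non-greedy .*? and with a negated character class:

```perl
use strict;
use warnings;

# Hypothetical sample: two <input> tags on a single line.
my $html = '<input name="Name" type="hidden" value="Silva">'
         . '<input name="mdName" type="hidden" value="Daniel">';

# Greedy: .* takes as much as it can, so the capture runs on to the
# last '">' in the string.
my ($greedy) = $html =~ /value="(.*)">/;

# Non-greedy: .*? stops at the first place the rest of the pattern fits.
my ($lazy) = $html =~ /value="(.*?)">/;

# A negated character class says "anything but a quote" directly.
my ($class) = $html =~ /value="([^"]*)"/;

print "greedy: $greedy\n";
print "lazy:   $lazy\n";
print "class:  $class\n";
```

Only the last two capture just Silva. This still isn't a substitute for a real parser; it only shows why the original pattern overshoots.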

if ((length($value)) > 1){
    $MiddleName = $value;
    #Some Stuff Code...
    print "$MiddleName";<br><br>
                        ^^^^^^^^
This is not Perl. Please post the *actual* code you ran. It makes things
simpler :).

}

However, the HTTP request returns HTML that is more or less like
this:

#Some non-relevant HTML stuff...
<input name="$mdName" type="hidden" value="Silva">
#Some non-relevant HTML stuff...
<input name="Name" type="hidden" value="Silva">
<input name="mdName" type="hidden" value="Daniel">
#Some non-relevant HTML stuff...

The problem is that my code is getting the value of "mdName", which is
"Daniel", and I want it to get the value of "$mdName", which is "Silva".
If that is missing (blank), I want to get the value of "Name", which in
the example is also "Silva". But I never want to get the value of
"mdName", which is "Daniel", and that is what always happens. :(

Obs.: I also tried (without success) using:

* my ($name, $value) = $_ =~ input name="\"\$mdName\" type=.*
value="(.*)">/i;

* my ($name, $value) = $_ =~ m/input name=\"\$mdName\" type=.*
value="(.*)">/i;

* my ($name, $value) = $_ =~ input name="\/$mdName\/" type=.*
value="(.*)">/i;

* my ($name, $value) = $_ =~ m/input name="\/$mdName\/" type=.*
value="(.*)">/i;

Uh, why? Don't just randomly try things hoping one will work; instead,
understand what is going wrong and fix it.
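
For what it's worth, here is a rough sketch of the HTML::Parser approach applied to the fragment quoted above. It assumes HTML::Parser is installed from CPAN; the sample page, the %field hash and $middle_name are names made up for illustration, not anything from your program:

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical sample page, modelled on the fragment quoted above.
# The single-quoted heredoc keeps '$mdName' as literal text.
my $html = <<'HTML';
<p>Some unrelated markup</p>
<input name="$mdName" type="hidden" value="Silva">
<input name="Name" type="hidden" value="Silva">
<input name="mdName" type="hidden" value="Daniel">
HTML

my %field;    # input name => value

my $parser = HTML::Parser->new(
    api_version => 3,
    # Called for every start tag with the tag name and a hashref of
    # its attributes.
    start_h => [
        sub {
            my ($tag, $attr) = @_;
            return unless $tag eq 'input' && defined $attr->{name};
            $field{ $attr->{name} } = $attr->{value};
        },
        'tagname, attr',
    ],
);
$parser->parse($html);
$parser->eof;

# Prefer the field literally named '$mdName'; fall back to 'Name' if it
# is missing or blank. The plain 'mdName' field is never consulted.
my ($middle_name) =
    grep { defined($_) && length($_) } @field{ '$mdName', 'Name' };

print "$middle_name\n";
```

Because the names are compared as whole strings, "mdName" can no longer be mistaken for "$mdName" or "Name".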

2) In the same program I have a piece of code which lists all users and
loops, calling the function which gets detailed information on each
user (the code in question 1 is part of this function). The snippet is
like this:

# Some irrelevant code stuff...

(my $ruid, @userIDs) = &GetUserList($start, $end);

Don't call subs with &. It was a Perl 4 practice, and has some strange
side-effects in Perl 5.

if ($userIDs[0] == -1) { exit(0); }

foreach $userID (@userIDs) {
&GetUserData($name, $middlename, $lname, $bdate);

Your sub GetUserData seems to be directly updating the variables passed
to it. This is a bad idea as it is not what someone reading the code
will expect. It would be better to return a list and call like

my ($name, $middlename, $lname, $bdate) = GetUserData;

Also, it seems to be getting the value of the user ID from a global
variable: again, it would be better to pass it to the function.
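
Put together, the shape I have in mind is roughly the following. This GetUserData is only a stand-in with made-up return values, since I haven't seen the real one, and the IDs are invented:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real GetUserData: it takes the user ID
# as an argument and returns the fields as a list, touching no globals.
sub GetUserData {
    my ($userID) = @_;
    # ... fetch and parse the page for $userID here ...
    return ("Name$userID", "Middle$userID", "Last$userID", '1970-01-01');
}

my @userIDs = (1, 2);    # made-up IDs

foreach my $userID (@userIDs) {
    my ($name, $middlename, $lname, $bdate) = GetUserData($userID);
    print "$userID\t: $name, $middlename, $lname, $bdate\n";
}
```

Everything the sub needs comes in through its arguments and everything it produces comes back through its return value, so the loop reads the same whatever the sub does internally.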


print "$userID\t: $name, $middlename, $lname, $bdate";

# Some irrelevant code stuff...

}

# Some irrelevant code stuff...

The function GetUserData() is really slow: it does an HTTP request and
parses some HTML, and the number of users is big. So I would like to add
thread support to it, so that I could have, for example, 8 instances of
this code running in parallel. :)

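Note that this may well not make it run faster. If the time goes on CPU
work (parsing, say), then unless you have 8 processors (lucky you ;) ),
it will just make things slower; threads only pay off for the part
spent waiting on the network.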

One thing that may be slowing things down is if you are fetching and
parsing the same page many times. You may want to look at the Memoize
module as an easy way of avoiding that.
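
A minimal sketch of what that looks like, assuming the fetching lives in a sub of its own; fetch_page and the URL here are made up, and the counter is only there to show the cache working:

```perl
use strict;
use warnings;
use Memoize;

my $fetches = 0;

# Hypothetical stand-in for a slow routine that fetches a page over HTTP.
sub fetch_page {
    my ($url) = @_;
    $fetches++;
    return "<html>content of $url</html>";
}

# From here on, repeated calls with the same argument return the cached
# result instead of running the sub body again.
memoize('fetch_page');

for (1 .. 3) {
    my $page = fetch_page('http://example.com/users');
}

print "fetched $fetches time(s)\n";
```

This only helps if the same page really is requested more than once, and only if a stale copy is acceptable for the life of the program.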

I have looked at http://perldoc.perl.org/threads.html, but it didn't
help much. I believe I should add the thread support in such a way
that it works directly with the foreach loop and GetUserData(),
right?

The simplest way to multi-thread the above is something like

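use threads;

my @threads;
foreach my $userID (@userIDs) {
    push @threads, async {
        my ($name, $middlename, $lname, $bdate) =
            GetUserData($userID);

        print "$userID\t: $name, $middlename, $lname, $bdate";

        # Some irrelevant code stuff...
    };
}

# Wait for every thread to finish before the program exits.
$_->join for @threads;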

This will run each request in a new thread; but as you have identified,
the output will come out any which way. If you really want to use
threads, you want to use something like Thread::Queue to pass the
results back to the parent thread, which can then deal with printing
them.
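
A rough sketch of that arrangement, assuming a perl built with thread support. GetUserData is again a made-up stand-in, the IDs are invented, and the pool of 8 workers is just your example figure:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the slow GetUserData.
sub GetUserData {
    my ($userID) = @_;
    return ("Name$userID", "Middle$userID", "Last$userID", '1970-01-01');
}

my @userIDs = (1 .. 20);    # made-up IDs
my $workers = 8;

my $todo    = Thread::Queue->new(@userIDs);    # work waiting to be done
my $results = Thread::Queue->new;              # finished lines

my @threads = map {
    threads->create(sub {
        # dequeue_nb returns undef once the queue is empty.
        while (defined(my $userID = $todo->dequeue_nb)) {
            my @data = GetUserData($userID);
            $results->enqueue(join "\t", $userID, @data);
        }
    });
} 1 .. $workers;

$_->join for @threads;

# Only the parent prints, so lines are never interleaved.
my @lines;
while (defined(my $line = $results->dequeue_nb)) {
    push @lines, $line;
}
print "$_\n"
    for sort { (split /\t/, $a)[0] <=> (split /\t/, $b)[0] } @lines;
```

A fixed pool pulling IDs from a queue also keeps the number of threads at 8 however many users there are, rather than starting one thread per user.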

However, I want to take care not to overwrite data (in C, when we
deal with threads, some unsafe functions can overwrite
values, which is not good)...

This is not an issue in Perl. Threads have completely separate
variables: threads in Perl are more like Unix' fork than like
traditional C threading.

3) Is Perl2exe (http://www.indigostar.com/perl2exe.htm) the best
option for converting Perl code to executables? Does it really work
well, even with complicated and sophisticated code (using threads, raw
sockets, Windows registry access, etc.)?

I've never used perl2exe (I understand it's not free?), but I have had
success with PAR, which you can install from CPAN.

Well, this is my first Perl code, so sorry for the ugly/bad code (and
also I'm not a programmer, just curious :). hehe

That's fine: there's nothing wrong with writing bad code when you are
first learning :). The code you posted isn't half as bad as some we see
in this group, anyway...

Thank you and sorry for the amount of (dumb and off-topic) questions.

Not off-topic at all, and not dumb either.

Ben
