Re: Help to automatically traverse a login session



joe t. wrote:
<snip>
There is a website that requires me to log in using a web-form.
Obviously, POST vars are sent and verified and on success i'm given a
Session and/or Cookie. Within this logged-in area, there are links
leading to data query result pages. "Click here for your recent
transactions" kind of thing.

Those results pages are what i want to get to, but through some kind of
script that parses the results that get served out, not by user
interaction. i want to send a request for a link within that logged in
area and have the results served to my script, then parse out specific
data from those results and in turn serve them to a user in my own
page.
<snip>

Such "web scraping" can be done with cURL <http://in.php.net/curl>
(you need to enable cookie support so the session survives between
requests). Not all sites allow scraping, though, and some will try to
block automation with a CAPTCHA. Other sites render their content with
Ajax, which makes the cURL approach a bit tough, since cURL does not
execute JavaScript itself (though I have heard that cURL can be made to
work with the Mozilla JavaScript engine). In that case it may be better
to go for Delphi or VB 6, where the WebBrowser component lets you
automate clicks and so on through the DOM.
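
To make the cURL route a bit more concrete, here is a rough sketch of the
flow: log in with a POST, keep the cookie in a file, request the results
page with that cookie, then pull values out of the HTML. The function
names, URLs, form field names and the XPath query are all made up for
illustration; you would have to take the real ones from the site's login
form and results markup. It is not tested against any particular site
and will not get past a CAPTCHA or Ajax-rendered pages.

```php
<?php
// Sketch only: every URL, field name and XPath below is a placeholder.

/**
 * Run one request that shares a cookie file with other calls.
 * Pass $postFields for a POST (the login); leave it null for a GET.
 */
function fetch_with_session(string $url, string $cookieFile, ?array $postFields = null): string
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_RETURNTRANSFER => true,        // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,        // follow the redirect many login forms issue
        CURLOPT_COOKIEJAR      => $cookieFile, // cookies are saved here when the handle is freed
        CURLOPT_COOKIEFILE     => $cookieFile, // stored cookies are sent with the request
    ]);
    if ($postFields !== null) {
        curl_setopt($ch, CURLOPT_POST, true);
        curl_setopt($ch, CURLOPT_POSTFIELDS, http_build_query($postFields));
    }
    $body = curl_exec($ch);
    if ($body === false) {
        throw new RuntimeException('cURL request failed: ' . curl_error($ch));
    }
    // $ch goes out of scope on return, which releases the handle
    // and lets libcurl write the cookie jar.
    return $body;
}

/**
 * Return the trimmed text of every node matched by an XPath query.
 */
function extract_cells(string $html, string $xpathQuery): array
{
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $nodes = (new DOMXPath($doc))->query($xpathQuery);
    if ($nodes === false) {
        throw new InvalidArgumentException("Bad XPath query: $xpathQuery");
    }
    $values = [];
    foreach ($nodes as $node) {
        $values[] = trim($node->textContent);
    }
    return $values;
}

// Usage (hypothetical site):
// $jar = tempnam(sys_get_temp_dir(), 'cookies');
// fetch_with_session('https://example.com/login', $jar,
//     ['username' => 'me', 'password' => 'secret']);
// $html = fetch_with_session('https://example.com/transactions', $jar);
// print_r(extract_cells($html, '//table[@id="transactions"]//td'));
```

Using the same cookie file for both CURLOPT_COOKIEJAR and
CURLOPT_COOKIEFILE is what carries the session from the login request to
the later ones. Parsing with DOMDocument and XPath tends to break less
often than regular expressions when the site changes its markup slightly.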

--
<?php echo 'Just another PHP saint'; ?>
Email: rrjanbiah-at-Y!com Blog: http://rajeshanbiah.blogspot.com/
