Logging into and parsing a website using Perl

From: Antwerp (bonjo90_at_yahoo.co.in)
Date: 02/15/05

  • Next message: Jim Gibson: "Re: disconnecting rsh session"
    Date: Tue, 15 Feb 2005 15:50:08 -0500
    
    

    Hi,

        I'm trying to create a Perl script that will log into a website (the login
    form uses POST), navigate to several pages, and append the (HTML) content parsed
    from those pages to a separate log file. I'm not very familiar with this aspect
    of Perl, and have been having some trouble POSTing the form data while using
    cookies to log in.
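
A minimal sketch of the parse-and-append step described above, assuming the page HTML is already available as a string. The helper names `html_to_text` and `append_to_log` and the file name `pages.log` are hypothetical, chosen only for illustration; HTML::TokeParser is the parser the script below already loads. Text tokens are returned as they appear in the source, so entities such as `&amp;` are not decoded here.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TokeParser;

# Hypothetical helper: return the text content of an HTML string,
# with runs of whitespace collapsed to single spaces.
sub html_to_text {
    my ($html) = @_;
    my $parser = HTML::TokeParser->new(\$html)
        or die "Cannot create parser\n";
    my @chunks;
    while (my $token = $parser->get_token) {
        # Text tokens have the form ['T', $text, $is_data].
        next unless $token->[0] eq 'T';
        push @chunks, $token->[1];
    }
    my $text = join ' ', @chunks;
    $text =~ s/\s+/ /g;
    $text =~ s/^\s+|\s+$//g;
    return $text;
}

# Hypothetical helper: append one line of text to a log file.
sub append_to_log {
    my ($file, $text) = @_;
    open my $fh, '>>', $file or die "Cannot open $file: $!\n";
    print {$fh} $text, "\n";
    close $fh or die "Cannot close $file: $!\n";
}

# Usage with a literal page; in a real script the HTML would come from
# the response body of a fetched page.
append_to_log('pages.log',
    html_to_text('<html><body><h1>Index</h1><p>Hello   world</p></body></html>'));
```

Opening the log with `'>>'` appends rather than truncates, so each page adds a line to whatever is already in the file.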

        Visiting the site automatically redirects you to a login page. Once you fill
    out the login form and click the submit button, you are redirected to the main
    site index. The login form uses cookies to establish identity.

        I've searched through several resources found via Google, and have built
    upon what I've read. I *believe* I am now storing the cookies I come across
    when loading the page (using a cookie jar). However, this doesn't seem to be
    allowing me to log on to the secure areas of the site. I suspect this is
    because I am not properly POSTing the login form data, or am otherwise not
    'following' the redirect to the secure area of the site.

        At this point, I am just trying to get my script to log in to the page
    (using my own credentials), and then display the site index.

        Could someone please offer me some insight and direction on using the
    appropriate modules, or else point out any flaws in my code (included below)?

    ------START CODE------

    #!/usr/bin/perl -w

    use LWP::Simple;
    use LWP::UserAgent;

    use HTML::TokeParser;
    use HTML::Parser;

    use HTTP::Request::Common;
    use HTTP::Cookies;

    use POSIX;

    #----Variables----#
    $t_url='http://www.memberplushq.com/pe/register/include/processlogin.jsp';
    # This is where the login form is located - once logged in, you are
    # redirected to the secured content (below).
    $s_url='http://www.memberplushq.com/pe/index.jsp';
    # This is where the secured content is located. If you aren't logged in, you
    # are redirected to the above.

    $login='My_username';
    $password='My Password';
    $submit_value='Login';
    #----/Variables----#

    #----User Agent Config----#
    $ua = LWP::UserAgent->new;
    $ua->cookie_jar(HTTP::Cookies->new(file => "cookies.txt", autosave => 1,
    ignore_discard => 1));
    #----/User Agent Config----#

    #----Really pressing my buttons----#
    # Submit the login form once and keep the HTTP::Response object.
    $response = $ua->request(POST $t_url , [ login_name => $login , password =>
    $password , loginSubmit => $submit_value ] );
    #----/Really pressing my buttons----#

    #----Completing----#
    # Printing the object itself only shows "HTTP::Response=HASH(...)", so print
    # the status line, headers and body instead.
    print $response->as_string;
    #----/Completing----#

    ------END CODE------

    As you can tell, I am *trying* to get through to the secure site. However, this
    is proving to be somewhat interesting.
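
One possible direction, sketched under assumptions rather than tested against the site: by default LWP::UserAgent follows redirects only for GET and HEAD requests, so the redirect issued after a POST login is not followed unless POST is added to `requests_redirectable`. The sketch below does that, reuses one agent (and therefore one cookie jar) for both requests, and then fetches the index page. The helper names `build_agent` and `login_and_fetch` are hypothetical, and the form field names are the ones used in the script above; they have not been checked against the real login form. A successful HTTP status does not prove that the login was accepted, since a site may return a normal page that reports the failure.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Cookies;

# URLs taken from the script above.
my $login_url = 'http://www.memberplushq.com/pe/register/include/processlogin.jsp';
my $index_url = 'http://www.memberplushq.com/pe/index.jsp';

# Hypothetical helper: build a user agent that stores cookies and also
# follows redirects that come back from a POST request.
sub build_agent {
    my ($cookie_file) = @_;
    my $ua = LWP::UserAgent->new;
    $ua->cookie_jar(HTTP::Cookies->new(
        file           => $cookie_file,
        autosave       => 1,
        ignore_discard => 1,
    ));
    # The default list contains only GET and HEAD.
    push @{ $ua->requests_redirectable }, 'POST';
    return $ua;
}

# Hypothetical helper: submit the login form, then fetch the index page
# with the same agent so the session cookies are sent along.
sub login_and_fetch {
    my ($ua, $user, $pass) = @_;
    my $login = $ua->post($login_url, [
        login_name  => $user,      # field names assumed from the original script
        password    => $pass,
        loginSubmit => 'Login',
    ]);
    die 'Login request failed: ', $login->status_line, "\n"
        unless $login->is_success;

    my $index = $ua->get($index_url);
    die 'Index request failed: ', $index->status_line, "\n"
        unless $index->is_success;
    return $index->decoded_content;
}

# Run as: perl login.pl My_username 'My Password'
if (@ARGV == 2) {
    print login_and_fetch(build_agent('cookies.txt'), @ARGV);
}
```

If the index page still shows the login form, printing `$login->as_string` and `$ua->cookie_jar->as_string` inside `login_and_fetch` shows which status, headers, and cookies the server actually sent.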

    I would appreciate any guidance you can offer,

    AntWerp

