Storing $DIGIT variables in arrays

From: Jesse Taylor (jrtaylor_at_tulane.edu)
Date: 01/23/05

  • Next message: Jay: "Re: How to find regex at specific location on line"
    Date: Sun, 23 Jan 2005 22:20:12 +0000
    To: beginners@perl.org
    
    

    Below is the source for a program I am writing that takes a list of
    URLs, fetches each page, searches it for email addresses and IP
    addresses, removes duplicate entries, and stores the results in a text
    file. Everything compiles and runs without any warnings, but the output
    is not what I expected. When I run it with this URL in my list:
    http://gentoo-solid.no-ip.com/testpage.php , the output file contains
    neither of the actual IP addresses. Instead it picks the one string that
    is NOT an IP address and keeps only its first four numbers,
    "45.45.45.45". The email addresses work fine and all show up in the
    output file. Any help on this would be appreciated.

    Thanks,
    Jesse Taylor

    ######--START CODE--######

    #!/usr/bin/perl -w

    #Given a text file containing URLs, this script will extract any IP
    #addresses or email addresses from said URLs

    use LWP::Simple;

    print "Enter location of URL list file: ";
    chomp($infile=<STDIN>);
    open INFILE, $infile
       or die "Could not open file $infile";

    print "Enter location at which to create output file: ";
    chomp($outfile=<STDIN>);
    open OUTFILE, ">$outfile"
       or die "Could not open/create $outfile";

    while($url=<INFILE>)
    {
       chomp($url);
       $html=get("$url")
          or die "Couldn't open page located at $url";

       #find and store IP addresses
       @ips = $html =~
          /(\d{1,3}[0-255]\.\d{1,3}[0-255]\.\d{1,3}[0-255]\.\d{1,3}[0-255])/g;

       #find and store email addresses
       @emails = $html =~ /(\w+\@\w+\.\w+)/g;

       push(@allips, @ips);
       push(@allemails, @emails);

    }
    ####remove duplicate array members####

    for($i=0; $i<(scalar @allips); $i++)
    {
       for ($j=0; $j<(scalar @allips); $j++)
       {
          if ($allips[$i] eq $allips[$j] && $i!=$j)
          {
             splice(@allips, $j, 0);
          }
       }
    }

    ####remove duplicate array members####

    for($i=0; $i<(scalar @allemails); $i++)
    {
       for ($j=0; $j<(scalar @allemails); $j++)
       {
          if ($allemails[$i] eq $allemails[$j] && $i!=$j)
          {
             splice(@allemails, $j, 1);
          }
       }
    }

    ####Store data in output file####

    print OUTFILE "IP Addresses: \n";
    foreach (@allips)
    {
       print OUTFILE "$_\n";
    }

    print OUTFILE "\nEmail Addresses: \n";

    foreach (@allemails)
    {
       print OUTFILE "$_\n";
    }

    close(INFILE);
    close(OUTFILE);
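
Two details in the listing above probably explain the output. First, `[0-255]` is a character class, not a numeric range: it matches a single character that is 0, 1, 2, or 5. Each octet therefore has to be one to three digits followed by one of those four characters, which "45" satisfies and many real octets do not. Second, `splice(@allips, $j, 0)` removes zero elements, so the IP list is never de-duplicated. The sketch below shows one possible way to handle both: an alternation that accepts only 0 through 255, and a hash-based filter for duplicates. The names `$octet`, `$ip_re`, `extract_ips`, and `unique` are illustrative and not part of the original program, and the sample text is made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One octet, 0 through 255, spelled out as an alternation.
my $octet = qr/(?:25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)/;

# Four dotted octets. The lookarounds reject candidates that are part of
# a longer run of digits and dots, such as "45.45.45.45.45".
my $ip_re = qr/(?<![\d.])$octet(?:\.$octet){3}(?![\d.])/;

# Return every IPv4-looking address found in the given text.
sub extract_ips {
    my ($text) = @_;
    return $text =~ /($ip_re)/g;
}

# Return the arguments with duplicates removed, keeping first-seen order.
sub unique {
    my %seen;
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample text, for illustration only.
my $sample = "hosts 192.168.0.1 and 10.0.0.254, not 45.45.45.45.45, again 192.168.0.1";
my @ips = unique(extract_ips($sample));
print "$_\n" for @ips;
```

One trade-off of the trailing lookahead is that an address immediately followed by a period, for example at the end of a sentence, is also rejected. Whether that is acceptable depends on the pages being scanned. The same `unique` filter could replace both nested loops in the original script.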

