Re: Need to pull matched string plus a few additional bytes



On Fri, 2006-10-27 at 09:36 -0400, Phil Miller wrote:
I am working on my very first program and have run into a bit of a
roadblock. I am trying to print a report of users who show up in an IIS
Log file. The good news is that the format of the userid is
WINDOWSDOMAIN\USERID. The bad news is that it is not always at the same
place in the IIS Log file due to some variable length fields that come
before it. Its location can vary left or right by about 10 bytes.



I read the IIS Log file in one line at a time. I have gotten far enough
that I can identify the lines with WINDOWSDOMAIN on them, but am stuck
there. The code $userid = substr($logfile_in, 33, 12); gets me close
but depending on the length of the date, the time or the IP address, it
is usually off by a few bytes. A sample of the input is below to
explain what I am talking about.



2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/main.css

2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/contents.aspx

2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/footer.aspx



Essentially what I need to do is find the WINDOWSDOMAIN on a line, and
write to a file the matched string plus \USERID data (up to the next
space). Does anyone have any suggestions? I'm thinking there must be
some very easy way to do it since Perl is made for this sort of thing.
I remember reading about some Perl built-in capability that would take a
scalar variable and parse it into an array based on a delimiter, but I
can't remember what it is. That would probably do it for me. But if
you know of a better way, I'm all ears.



Below is the code I am using.





# My first PERL program.

open USERIDOUT, ">userid.out.txt";
open IISLOG, "<ex061023.log";

$ctr = 0;
$hit_counter = 0;
$miss_counter = 0;
$logfile_in;
$userid;

while (<IISLOG>)
{
    $logfile_in = $_;
    if ( ($logfile_in =~ m/WINDOWSDOMAIN/i && $logfile_in =~ m/itd/i) )
    {
        print "\n** Found success\n";
        $hit_counter += 1;

        $userid = substr($logfile_in, 33, 12);  # This is not correct but is somewhat close
        print "\n", $userid;
    }
    else
    {
        print "Did not find success\n";
        $miss_counter += 1;
    }
}

print "\n Hit Counter = ", $hit_counter;
print "\n Miss Counter = ", $miss_counter;
print "\n Total Records Counter = ", $hit_counter + $miss_counter;

close USERIDOUT;
close IISLOG;





*****************************************************

Phil

Please be careful with your open(): the two-argument form can cause
problems, so use the three-argument open() and always check for errors.
You also shouldn't use bareword filehandles. I don't know whether you
want the "\" included in the second match, but here you go:

--- Using your sample data ---
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/main.css
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/contents.aspx
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/footer.aspx
--- End of sample data ---

#!/usr/bin/perl -w
use strict;

my $file = 'path/to/file';
my $out_file = 'file/you/want';

open my $iis_log, '<', $file
    or die "Could not open $file: $!\n";
open my $user_id_out, '>>', $out_file
    or die "Could not open $out_file for writing: $!\n";

while (<$iis_log>) {
    # $1 is the domain; $2 is the backslash plus the user ID
    if (/(\w+)(\\\w+)/) {
        print "Match, printing to $out_file...\n";
        # from the sample data above, you will get something
        # like, "WINDOWSDOMAIN \USERID" on each line of the out
        # file:
        print $user_id_out "$1 $2\n";
    }
}

Each line of the output file will contain two "words", "WINDOWSDOMAIN"
and "\USERID". If you'd like to print the entire line that matched, use
$_. Notice the ">>" when opening the output file: it opens the file in
append mode, so each match is added to the end of the file and anything
already there is kept.
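
If the user ID can contain characters that \w does not match (a dot or
a hyphen, say), a variant that anchors on the known domain and takes
everything up to the next whitespace may be closer to what you asked
for. This is only a minimal sketch: the sub name extract_user and the
hard-coded domain are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns "DOMAIN\userid" from one log line,
# or undef if the line does not mention the domain.
sub extract_user {
    my ($line) = @_;
    # \\ matches the literal backslash; \S+ runs up to the next whitespace.
    return $line =~ /(WINDOWSDOMAIN\\\S+)/i ? $1 : undef;
}

my $sample = '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css';
my $user = extract_user($sample);
print defined $user ? "$user\n" : "no match\n";
```

Because the capture is a single group, the domain and user ID come back
as one string, which you can print straight to the output filehandle.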

Run "perldoc perlre" or go to
http://perldoc.perl.org/perlretut.html to learn more about regular
expressions. In particular, read the "Extracting Matches" section.
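
The built-in you half-remember is probably split, which breaks a scalar
into a list on a delimiter. A rough sketch, assuming the fields are
separated by whitespace and the user ID is always the fourth field, as
in your sample data (the sub name user_field is made up for
illustration, and your real log's field order may differ):

```perl
use strict;
use warnings;

# Hypothetical helper: splits one log line on whitespace and returns
# the fourth field, which is DOMAIN\userid in the sample data.
sub user_field {
    my ($line) = @_;
    my @fields = split ' ', $line;   # ' ' splits on runs of whitespace
    return $fields[3];               # fields are numbered from 0
}

my $sample = '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css';
print user_field($sample), "\n";
```

This does not care how wide the date, time or IP address are, which is
the problem you hit with substr(), but it does depend on the user ID
staying in the same field position on every line.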

-- Hope that helps.
