Re: Need to pull matched string plus a few additional bytes
- From: arnaldo@xxxxxxxxxxx (Arnaldo Guzman)
- Date: Fri, 27 Oct 2006 10:44:10 -0400
On Fri, 2006-10-27 at 09:36 -0400, Phil Miller wrote:
I am working on my very first program and have run into a bit of a
roadblock. I am trying to print a report of users who show up in an IIS
Log file. The good news is that the format of the userid is
WINDOWSDOMAIN\USERID. The bad news is that it is not always at the same
place in the IIS Log file due to some variable length fields that come
before it. Its location can vary left or right by about 10 bytes.
I read the IIS Log file in one line at a time. I have gotten far enough
that I can identify the lines with WINDOWSDOMAIN on it, but am stuck
there. The code $userid = substr($logfile_in, 33, 12); gets me close
but depending on the length of the date, the time or the IP address, it
is usually off by a few bytes. A sample of the input is below to
explain what I am talking about.
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/main.css
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/contents.aspx
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/footer.aspx
Essentially what I need to do is find the WINDOWSDOMAIN on a line, and
write to a file the matched string plus \USERID data (up to the next
space). Does anyone have any suggestions? I'm thinking there must be
some very easy way to do it since Perl is made for this sort of thing.
I remember reading about some Perl built-in capability that would take a
scalar variable and parse it into an array based on a delimiter, but I
can't remember what it is. That would probably do it for me. But if
you know of a better way, I'm all ears.
Below is the code I am using.
# My first PERL program.
open USERIDOUT, ">userid.out.txt";
open IISLOG, "<ex061023.log";
$ctr = 0;
$hit_counter = 0;
$miss_counter = 0;
$logfile_in;
$userid;
while (<IISLOG>)
{
$logfile_in = $_;
if ( ($logfile_in =~ m/WINDOWSDOMAIN/i && $logfile_in =~
m/itd/i)
)
{
print "\n** Found success\n";
$hit_counter += 1;
$userid = substr($logfile_in, 33, 12);
# This is not correct but is somewhat close
print "\n", $userid;
}
else
{
print "Did not find success\n";
$miss_counter += 1;
}
}
print "\n Hit Counter = ", $hit_counter;
print "\n Miss Counter = ", $miss_counter;
print "\n Total Records Counter = ", $hit_counter + $miss_counter;
close USERIDOUT;
close IISLOG;
*****************************************************
Phil
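Please be careful with your open(). Use the three-argument form: the
two-argument form can misread a filename that begins or ends with
characters such as ">" or "|". Always check whether open() succeeded.
You also shouldn't use bareword filehandles; a lexical filehandle
(open my $fh, ...) is scoped and is closed when it goes out of scope.
Now, I don't know if you want to include the "\" within the second
match, but here you go: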
--- Using your sample data ---
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/main.css
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/contents.aspx
2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80
GET /itd/styles/footer.aspx
--- End of sample data ---
#!/usr/bin/perl -w
use strict;

my $file = 'path/to/file';
my $out_file = 'file/you/want';

open my $iis_log, '<', $file
    or die "Could not open $file: $!\n";
open my $user_id_out, '>>', $out_file
    or die "Could not open $out_file for writing: $!\n";

while (<$iis_log>) {
    # $1 captures the domain, $2 the backslash plus the user ID
    if (/(\w+)(\\\w+)/) {
        print "Match, printing to $out_file...\n";
        # from the sample data above, you will get something
        # like, "WINDOWSDOMAIN \USERID" on each line of the out
        # file:
        print $user_id_out "$1 $2\n";
    }
}
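Each line will contain two "words", "WINDOWSDOMAIN" and "\USERID". If
you'd like to print the entire input line instead, use $_; if you only
want the text that matched, use $& or put one more pair of parentheses
around the whole pattern. Notice the ">>" when opening the output file:
it appends to whatever the file already holds, so matches from earlier
runs are kept. Use ">" if you want the file overwritten on each run.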
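If you only care about one particular domain and want everything up to the
next space, as you described, a tighter pattern may serve better. The
following is only a sketch: the helper name extract_userid is made up for
illustration, and it assumes the user ID never contains whitespace.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the code above): returns the
# "DOMAIN\USERID" text from one log line, or undef if the domain
# does not appear on the line.
sub extract_userid {
    my ($line, $domain) = @_;
    # \Q...\E quotes any regex metacharacters in the domain name,
    # \\ matches the literal backslash, and \S+ runs up to the
    # next whitespace character.
    return $line =~ /(\Q$domain\E\\\S+)/i ? $1 : undef;
}

# One record from the sample data, joined back onto a single line.
my $sample = '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID '
           . '175.128.127.43 80 GET /itd/styles/main.css';

my $userid = extract_userid($sample, 'WINDOWSDOMAIN');
print defined $userid ? "$userid\n" : "no match\n";
```

Because of the /i flag the domain is matched case-insensitively, but the
value returned keeps whatever case appears in the log.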
You should run "perldoc perlre" or read the tutorial at
http://perldoc.perl.org/perlretut.html to learn more about regular
expressions. In particular, read the "Extracting Matches" section.
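As for the built-in you half-remember, the one that breaks a scalar into an
array on a delimiter: that is split (see "perldoc -f split"). Here is a rough
sketch of that approach. It assumes every record is a single line with the
fields in the same order as your sample, so the user ID is the fourth field;
the helper name userid_field is made up for illustration. If your log has a
"#Fields:" header line, check the real field order there.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the fourth whitespace-separated field
# of a log line, or undef if the line has fewer than four fields.
# ASSUMPTION: field order is date, time, client IP, user ID, ...
sub userid_field {
    my ($line) = @_;
    # split ' ' splits on runs of whitespace and skips leading
    # whitespace, so varying field widths do not matter.
    my @fields = split ' ', $line;
    return $fields[3];
}

# Two records from the sample data, each joined onto a single line.
my @sample = (
    '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/main.css',
    '2006-10-23 12:08:47 24.32.35.123 WINDOWSDOMAIN\USERID 175.128.127.43 80 GET /itd/styles/footer.aspx',
);

# Count hits per user; keep only fields that look like DOMAIN\USER.
my %hits;
for my $line (@sample) {
    my $userid = userid_field($line);
    $hits{$userid}++ if defined $userid && $userid =~ /\\/;
}
print "$_: $hits{$_}\n" for sort keys %hits;
```

Splitting on whitespace sidesteps the substr() offset problem, because the
field position stays the same even when the date, time or IP address changes
length.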
-- Hope that helps.