RE: Simple regex problem has me baffled



Rob,

Thanks for your suggestion. It worked!!

# script.pl
8252c
8252c
8252d
8252d
82534
82535
82535
82534
8253c
8253c
8253f
8253f
82542
82543
<- big long list ->

So this is what did the trick:

while (<DATA>) {
next unless /RequestId \[([[:xdigit:]]+)\]/;
print "$1\n";
}

Can you explain why this works but my original effort did not?

Many thanks,

Bill Harpley



-----Original Message-----
From: Rob Dixon [mailto:rob.dixon@xxxxxxx]
Sent: Monday, January 26, 2009 7:19 PM
To: Perl Beginners
Cc: Bill Harpley
Subject: Re: Simple regex problem has me baffled

Bill Harpley wrote:
Hello,

I have a simple regex problem that is driving me crazy.

I am writing a script to analyse a log file. It contains Java related
information about requests and responses.

Each pair of Request (REQ) and Response (RES) calls has a unique
Request ID. This is a 5 digit hex number contained in square brackets
(e.g. "[81c2d]" ).

Using timestamps in each log entry, I need to calculate the time
difference between the start of the Request and the end of the
Response.

As a first step, I thought I would identify the matching REQ/RES pairs
in the log and then set about extracting the timestamp information and
doing the calculations.

I started with a simple script to extract the Request IDs from each
log entry. Here is what one looks like (names have been changed to
protect the innocent).


[2009-01-23 09:20:48,719]TRACE [server-1] [http-80-5]
anon@xxxxxxxxxxxx
:090123-092048567:f5825 (SetCallForwardStatusImpl.java:call:54) -
RequestId [81e80] SetCallForwardStatus.REQ { accountNumber:=W12345,
phoneNumber:=12121212121, onBusyStatus:=true, busyCurrent:=voicemail,
onNoAnswerStatus:=false, noAnswerCurent:=voicemail,
onUncondStatus:=false, uncondCurrent:=voicemail }

So I need to extract the 5 hex digits in "RequestId [81e80]". Sounds
simple, eh?

Here is a fragment of my initial script:

open ( DATA, "< $INBOX/sample.log") || die "Cannot open source file: $!";
open ( FILE, "> $INBOX/request.dat") || die "Cannot open request file: $!";

chomp(@list=<DATA>);

foreach $entry(@list)
{

$entry =~ /\[([a-z0-9]{5})\]/;

print "$1\n"; # print to screen

# print FILE "$1\n"; # print to file
}

I have spent quite a bit of time refining this expression and it looks
OK to me. I basically just need to extract the 5-digit hex string and
then write it to a file (or to screen).

This is what I get when I run the script:

Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

8252c
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

8252c
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

8252d
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

8252d
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

82534
82534
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

82535
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

82534
82534
82534
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044.

8253c
8253c
8253c
Use of uninitialized value in concatenation (.) or string at
./magic.pl line 16, <DATA> line 1044


< --- Big long list -- note that RequestIDs from REQ/RES pairs need not
be adjacent in the list -- >

The first thing that puzzles me is that although it is obviously
extracting the RequestId substring correctly, it seems to complain about
the "$1\n" expression in line 16.
This looks quite OK to me and I am baffled why I am getting this
message.

The other thing that puzzles me is that there can only be a single
REQ/RES pair in the file with a given ID. So the RequestID should not
appear more than twice in the output list. Yet there are many
instances where the RequestID appears more than twice.

Any help you guys can provide would be much appreciated. The Perl
version is 5.8.4 on Solaris 10.

I think I would write

while (<DATA>) {
next unless /RequestId \[([[:xdigit:]]+)\]/;
print "$1\n";
}

HTH,

Rob
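
A note on why the two versions behave differently, since the thread as archived here does not include an explanation. The likely cause is that the original loop prints $1 whether or not the match succeeded. A failed match does not set $1, so on a non-matching line it is either undefined (hence the warning) or still holds the capture from an earlier successful match (hence IDs appearing more often than expected). The suggested version skips non-matching lines with "next unless", anchors on the literal "RequestId " so other bracketed fields cannot match, and uses [[:xdigit:]], which accepts only hex digits in either case. Below is a minimal sketch of the same guard written as a function. The name request_id and the sample lines are made up for illustration (the lines are pieces of the log entry quoted above).

```perl
use strict;
use warnings;

# Hypothetical helper: return the hex ID from "RequestId [....]", or an
# empty list if the line has no such field. $1 is read only when the
# match has succeeded, so a stale or undefined capture is never used.
sub request_id {
    my ($line) = @_;
    return $line =~ /RequestId \[([[:xdigit:]]+)\]/ ? $1 : ();
}

# Sample input: one line carrying a RequestId, two that do not.
my @lines = (
    'RequestId [81e80] SetCallForwardStatus.REQ { accountNumber:=W12345,',
    'phoneNumber:=12121212121, onBusyStatus:=true, busyCurrent:=voicemail,',
    '[2009-01-23 09:20:48,719]TRACE [server-1] [http-80-5]',
);

# In list context, non-matching lines contribute nothing to @ids.
my @ids = map { request_id($_) } @lines;
print "$_\n" for @ids;
```

With the three sample lines this prints only 81e80, with no warnings, because the two lines without a RequestId field are dropped instead of being printed from a leftover capture.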


