Re: traversing (and accessing values in) a hash of hashes
From: Lawrence Statton (lawrence_at_cluon.com)
Date: 12/12/04
- Next message: John W. Krahn: "Re: traversing (and accessing values in) a hash of hashes"
- Previous message: Jonathan Paton: "Re: traversing (and accessing values in) a hash of hashes"
- Maybe in reply to: James marks: "traversing (and accessing values in) a hash of hashes"
- Next in thread: James marks: "Re: traversing (and accessing values in) a hash of hashes"
- Reply: James marks: "Re: traversing (and accessing values in) a hash of hashes"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
To: James marks <jamarks@jamarks.com> Date: Sat, 11 Dec 2004 17:24:25 -0800
> In the script, there are two variables produced from a regexp match
> against the current line in the access_log:
>
> $new_ip
> $user_agent
>
> and one that is determined by a subroutine that looks for clues that
> the user_agent is a robot:
>
> $user_type
>
> This is the subroutine that builds the hash of hashes. It compares the
> $new_ip in the current line of the access_log with IPs already in the
> hash and updates the user_type/user_agent info if it finds a match and
> creates a new entry if it doesn't.
>
> sub build_user_hash {
> foreach $existing_ip (%user_hash) {
> if ($existing_ip eq $new_ip ) {
> $user_hash {$existing_ip} {user_type} = $user_type;
> $user_hash {$existing_ip} {user_agent} = $user_agent;
> return;
> }
> }
> $user_hash {$new_ip} {user_type} = $user_type;
> $user_hash {$new_ip} {user_agent} = $user_agent;
> $unique_ip_count++;
> }
>
Okay -- now that I have some code to look at, I can give you better advice:
1) use strict;
2) REALLY ... We Mean It. ***USE STRICT***
3) global variables are for 12 year old VB programmers. Avoid them
unless absolutely necessary.
Here's a slightly cleaned up version of your program, with some data
to test with ...
N.B. That you don't want to iterate across the entire %user_hash in
the foreach -- you want to run once for each key, not once for the key
and again for the value ... It turns out, that in a scalar context the
value of each of these keys is a string of the form HASH(0x11fa80).
This string is unlikely to colllide with your actual IP address, and
therefore won't harm your script, but it's Bad Style.
.................................. BEGIN PERL PROGRAM ..................
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
our %user_hash;
our $unique_ip_count;
sub build_user_hash {
my ($new_ip, $user_agent, $user_type) = @_ ;
foreach my $existing_ip ( keys %user_hash) {
if ($existing_ip eq $new_ip ) {
$user_hash {$existing_ip} {user_type} = $user_type;
$user_hash {$existing_ip} {user_agent} = $user_agent;
return;
}
}
$user_hash {$new_ip} {user_type} = $user_type;
$user_hash {$new_ip} {user_agent} = $user_agent;
$unique_ip_count++;
}
# populate the hash with some fake data
while (<DATA>) {
if (my ($ip, $agent, $type ) = m/^(.*) - (.*) - (.*)$/) {
build_user_hash($ip, $agent, $type) ;
}
}
print Dumper \%user_hash;
print "Here come da judge ... \n";
foreach my $key ( keys %user_hash ) {
print "$key \"$user_hash{$key}{user_agent}\"\n"
if $user_hash{$key}{user_type} eq 'robot';
}
__DATA__
192.168.1.1 - ...TOPS-20...EMACS... - human
192.168.1.1 - ...Linux...Opera... - human
192.168.1.3 - ...Yahoo!Slurp... - robot
192.168.1.5 - ...Windows...MSIE... - human
192.168.1.7 - ...Solaris...Mozilla... - human
192.168.1.5 - ...GNU...LWP/0.1... - robot
................................... END PERL PROGRAM ...................
Producing the output:
$VAR1 = {
'192.168.1.1' => {
'user_type' => 'human',
'user_agent' => '...Linux...Opera...'
},
'192.168.1.3' => {
'user_type' => 'robot',
'user_agent' => '...Yahoo!Slurp...'
},
'192.168.1.5' => {
'user_type' => 'robot',
'user_agent' => '...GNU...LWP/0.1...'
},
'192.168.1.7' => {
'user_type' => 'human',
'user_agent' => '...Solaris...Mozilla...'
}
};
Here come da judge ...
192.168.1.3 "...Yahoo!Slurp..."
192.168.1.5 "...GNU...LWP/0.1..."
........................................................................
So -- this program does in fact work, but may not do what you want.
For example: Is using the IP address as the key really a good idea?
Might an IP address be assigned to two different users -- or might a
person use two different user agents on one host? Do you want to deal
with that? Do you want the per-host to be a LIST of browser/flavor
pairs?
Consider the following structure:
our %users = ( '192.168.1.1' => [ { user_type => 'human',
user_agent => '...Browser...' },
{ user_type => 'robot',
user_agent => '...Some Robot...' } ],
'192.168.1.2' => [ { user_type => 'robot',
user_agent => '...Another Robot...' } ],
'192.168.1.3' => [ { user_type => 'human',
user_agent => '...Browser X...' },
{ user_type => 'human',
user_agent => '...Browser Y...' } ] );
How do you think you'd create that and iterate across it...?
Your homework: Starting with that structure above, write a subroutine
that will iterate across it producing the following output:
Hint - it will definitely be easier to explicitly break the structure
into lists and hashes at first. Then squeeze your code to eliminate
the intermediate variables ...
foreach my $key (keys %user_hash ) {
my @list = @{$user_hash{$key}};
.
.
.
}
lawrence /tmp > perl /tmp/ziz.pl
A human at 192.168.1.1 used ...Browser...
A robot at 192.168.1.1 used ...Some Robot...
A human at 192.168.1.3 used ...Browser X...
A human at 192.168.1.3 used ...Browser Y...
A robot at 192.168.1.2 used ...Another Robot...
-- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- --
Lawrence Statton - lawrenabae@abaluon.abaom s/aba/c/g
Computer software consists of only two components: ones and
zeros, in roughly equal proportions. All that is required is to
sort them into the correct order.
- Next message: John W. Krahn: "Re: traversing (and accessing values in) a hash of hashes"
- Previous message: Jonathan Paton: "Re: traversing (and accessing values in) a hash of hashes"
- Maybe in reply to: James marks: "traversing (and accessing values in) a hash of hashes"
- Next in thread: James marks: "Re: traversing (and accessing values in) a hash of hashes"
- Reply: James marks: "Re: traversing (and accessing values in) a hash of hashes"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Relevant Pages
|