Re: array and hash pattern matching



Tim Wolak wrote:
Morning all,

Hello,

I am working on a script that reads in /var/log/auth.log, takes the ip
addresses puts them into a hash keeping track of how many times it finds
that address and compare it to addresses found in /etc/hosts.deny and
only write the addresses that are new in the file. So far I can get the
addresses from the log file no problem and write them to the deny file,
however I am struggling on how to compare the hash with an array for any
duplicate addresses. What is the best approach to take with this?


[ Code reformatted to reflect the actual structure. ]

use warnings;
use strict;

open (LOGFILE, "/var/log/auth.log") or die "Can't open log file : $!\n";
open (DENY, "/etc/hosts.deny") or die "Can't open log file: $!\n";

while (<DENY>) {
if ($_ =~ /Invalid user/ || /Failed password for/) {

Why use "$_ =~" in front of the first match and not in front of the second
match? Either use it for both or use it for neither (be consistent.) The
file /etc/hosts.deny doesn't even contain those strings does it?

man 5 hosts_access
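
For illustration, here is a minimal sketch of the consistent spellings. The two
sample lines are made up, and the counter names are only there to show that all
three forms select the same lines:

```perl
use strict;
use warnings;

# Hypothetical sample lines, not taken from a real auth.log.
my @lines = (
    "Failed password for root from 192.0.2.10 port 22 ssh2\n",
    "Accepted password for tim from 192.0.2.20 port 22 ssh2\n",
);

my ( $explicit, $implicit, $combined ) = ( 0, 0, 0 );
for (@lines) {
    # Explicit binding to $_ on both matches.
    $explicit++ if $_ =~ /Invalid user/ || $_ =~ /Failed password for/;

    # Implicit $_ on both matches.
    $implicit++ if /Invalid user/ || /Failed password for/;

    # One pattern with alternation instead of two matches.
    $combined++ if /Invalid user|Failed password for/;
}

print "$explicit $implicit $combined\n";
```

All three counters end up equal, so the choice is purely one of style.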


push @origDeny, $_;
}
foreach $orig (@origDeny) {

Why are you using this foreach loop inside the while loop? If the file
contains five IP addresses then the first one will be pushed onto @hosts 5
times and the second one 4 times and the third one 3 times, etc.
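
A small sketch makes the duplication visible. The five "ALL:" lines are made-up
stand-ins for the file, and the "Invalid user" filter is left out so that every
line gets pushed; both are assumptions for the demonstration only:

```perl
use strict;
use warnings;

# Hypothetical file contents: five lines, one address each.
my @file = map { "ALL: 192.0.2.$_\n" } 1 .. 5;

my ( @origDeny, @hosts );
for my $line (@file) {
    push @origDeny, $line;

    # Nested loop: re-scans every line read so far on each iteration.
    for my $orig (@origDeny) {
        push @hosts, $1 if $orig =~ /(\d+\.\d+\.\d+\.\d+)/;
    }
}

my %times;
$times{$_}++ for @hosts;
printf "%s pushed %d times\n", $_, $times{$_} for sort keys %times;

# Running the extraction loop once, after reading, avoids the repeats.
my @fixed;
for my $orig (@origDeny) {
    push @fixed, $1 if $orig =~ /(\d+\.\d+\.\d+\.\d+)/;
}
print scalar(@hosts), " entries nested, ", scalar(@fixed), " entries fixed\n";
```

With five input lines the nested version stores 15 entries and the separate
loop stores 5.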

if ($off =~ /((\d+)\.(\d+)\.(\d+)\.(\d+))/) {

Why are you capturing five different strings when you are only using one?
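
As a quick sketch of the difference (the sample line is made up), one pair of
parentheses around the whole address gives you the same $1 without the four
unused captures:

```perl
use strict;
use warnings;

# Hypothetical log line, for illustration only.
my $line = "Failed password for root from 192.0.2.10 port 22 ssh2";

# Original pattern: the whole address plus each octet is captured.
my @five = $line =~ /((\d+)\.(\d+)\.(\d+)\.(\d+))/;

# Only the whole address is captured.
my @one = $line =~ /(\d+\.\d+\.\d+\.\d+)/;

print scalar(@five), " captures versus ", scalar(@one), "\n";
print "address: $one[0]\n";
```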

push @hosts, $1;
}
}
}

The two arrays you just populated are not used again after the while loop ends
so what was the point?


close DENY;
while (<LOGFILE>) {
if ($_ =~ /Invalid user/ || /Failed password for/) {
push @offenders, $_;
}
}
foreach $off (@offenders) {
if ($off =~ /((\d+)\.(\d+)\.(\d+)\.(\d+))/) {
push @list, $1;
}
}
foreach $number (@list) {
if (exists $iplist{$number}) {
$iplist{$number} ++;
} else {
$iplist{$number} = "1";
}
}

Why use three loops to do something that you only need one loop for?

my %iplist;
while ( <LOGFILE> ) {
if ( /Invalid user|Failed password for/ && /(\d+\.\d+\.\d+\.\d+)/ ) {
$iplist{ $1 }++;
}
}
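
If you want to try that loop without touching the real log, here is a sketch
that reads from an in-memory filehandle instead. The sample lines, the host
name and the $log handle are my assumptions, not your data:

```perl
use strict;
use warnings;

# Made-up log excerpt standing in for /var/log/auth.log.
my $sample = <<'LOG';
Jan  1 00:00:01 host sshd[1]: Invalid user bob from 192.0.2.10
Jan  1 00:00:02 host sshd[2]: Failed password for root from 192.0.2.10 port 22 ssh2
Jan  1 00:00:03 host sshd[3]: Accepted password for tim from 192.0.2.20 port 22 ssh2
Jan  1 00:00:04 host sshd[4]: Failed password for root from 192.0.2.30 port 22 ssh2
LOG

# Opening a reference to a scalar gives a read-only in-memory file.
open my $log, '<', \$sample or die "Can't open in-memory log: $!\n";

my %iplist;
while ( <$log> ) {
    # Count the address only on lines that report a failure.
    $iplist{ $1 }++ if /Invalid user|Failed password for/ && /(\d+\.\d+\.\d+\.\d+)/;
}
close $log;

print "$_ => $iplist{$_}\n" for sort keys %iplist;
```

The "Accepted password" line is skipped, so 192.0.2.20 never appears in the hash.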


open (DENY, ">>/etc/hosts.deny") or die "Can't open log file: $!\n";
foreach $key (keys %iplist) {
if ($iplist{$key} > 5) {

Why 5?

foreach $tim (@list) {
if ($tim !~ /$iplist{$key}/) {

Why are you trying to match the number in $iplist{$key} to the IP address in $tim?

print DENY "$key\n";

According to hosts_access(5) the /etc/hosts.deny file needs more on the line
than just the IP address.

man 5 hosts_access

[ snip ]

ACCESS CONTROL RULES
Each access control file consists of zero or more lines of text. These
lines are processed in order of appearance. The search terminates when
a match is found.

· A newline character is ignored when it is preceded by a
backslash character. This permits you to break up long lines so
that they are easier to edit.

· Blank lines or lines that begin with a `#´ character are
ignored. This permits you to insert comments and whitespace so
that the tables are easier to read.

· All other lines should satisfy the following format, things
between [] being optional:

daemon_list : client_list [ : shell_command ]

daemon_list is a list of one or more daemon process names (argv[0]
values) or wildcards (see below).

client_list is a list of one or more host names, host addresses,
patterns or wildcards (see below) that will be matched against the
client host name or address.

The more complex forms daemon@host and user@host are explained in the
sections on server endpoint patterns and on client username lookups,
respectively.

List elements should be separated by blanks and/or commas.

With the exception of NIS (YP) netgroup lookups, all access control
checks are case insensitive.


}
}
}

}
close LOGFILE;
close DENY;
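
As for the comparison you asked about, one common approach is to load the
addresses already in the deny file into a second hash and test with exists
instead of scanning an array. Here is a rough sketch, not a drop-in
replacement: the data is made up, the names %denied, @new and $threshold are
mine, and writing "ALL: address" assumes you want to block every daemon (ALL is
a wildcard in hosts_access(5); use a daemon name such as sshd if not):

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the lines of /etc/hosts.deny and for the
# per-address counts gathered from auth.log.
my @deny_lines = ( "# comment\n", "ALL: 192.0.2.10\n", "sshd: 192.0.2.40\n" );
my %iplist     = ( '192.0.2.10' => 9, '192.0.2.30' => 7, '192.0.2.50' => 2 );
my $threshold  = 5;    # same cutoff as in the original script

# Remember every address that is already denied.
my %denied;
for (@deny_lines) {
    next if /^\s*#/ || /^\s*$/;    # skip comments and blank lines
    $denied{$1} = 1 while /(\d+\.\d+\.\d+\.\d+)/g;
}

# Keep addresses over the threshold that are not in the deny file yet.
my @new = grep { $iplist{$_} > $threshold && !exists $denied{$_} }
          sort keys %iplist;

# In the real script this would print to the DENY filehandle.
print "ALL: $_\n" for @new;
```

The hash lookup is a single test per address, so no inner loop over the
existing entries is needed.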



John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall


