Re: simple wildcard regex problem.
- From: "Paul Lalli" <mritty@xxxxxxxxx>
- Date: 1 May 2006 08:48:37 -0700
Nikolas Britton wrote:
I'm sorry, but there is no excuse for a goto "loop", in any reasonably
high-level language. This is basic programming, and has nothing to do
with Perl.
Yes I know but I don't know what else to use.
In this case, a while loop. Starts where your label is, ends at the
end of the code, runs continuously (ie, the conditional is simply '1'):
while (1) {
#all code here
}
If you're looking for constructive criticisms of your code, by all
means, post away. But don't post again because you think you'll get a
different solution than I already offered - at the very least not
without trying that solution first.
Yes I am and I did try your solution but it didn't work.
"Did not work" is the worst of all possible error descriptions. Please
post your sample input, your expected output, and your actual output.
Here's the code:
Please work on minimizing the code you post to the smallest
short-but-complete script that demonstrates the actual error you're
having.
#!/usr/bin/perl
use strict; use warnings;
my $unsolvedword;
my $numberofletters;
my $wordlistfile1 = "/usr/share/dict/words";
my $wordlistfile2 = "enable2k_wordlist";
my $dict = $wordlistfile1;
my @guesswords;
my ($foo1, $foo2, $foo3);
my $solvedletters1;
my $solvedletters2;
my $regex;
my $regexstring;
Don't do this. Declare your variables in the smallest scope possible.
It increases readability and decreases the amount of time needed to
debug.
print "\n #######################################";
print "\n Before you start solve for the letter e";
I have no idea what that means. I have no idea what the intent of this
program actually is. These are good things to explain.
print "\n #######################################";
print "\n";
print "\n What's the word?, enter dashes ( - ) for unknown letters: ";
$unsolvedword = <STDIN>; chomp($unsolvedword); $numberofletters =
($unsolvedword =~ tr/a-z|-/a-z|-/);
1) Are you assuming the user will never enter capital letters?
2) Is | a legal character? If not, why is it there?
3) Do you only want to get a count of those characters (letters,
vertical bars, and dashes), and ignore everything else? I'm guessing
not.
die "Invalid format" unless $unsolvedword =~ /^[a-z-]$/i;
my $numberofletters = length $unsolvedword;
print " The word is: \"$unsolvedword\", it has $numberofletters
letters.\n Is this correct? hit n to bail out, otherwise just hit
enter: ";
$_ = <STDIN>; chomp($_);
Never explicitly assign to $_. Use a real variable name. $_ can
change or be read without you doing it explicitly, and is therefore
dangerous to use in any situation in which it is not the default.
chomp (my $response = <STDIN>);
if ($_ eq "n"){
print "\n Goodbye.\n";
exit;
};
$solvedletters1 = "e"; #comment this out to run bad code.
WORKLOOP1:
my $continue = 'y';
do {
$foo1 = "0";
Don't set a variable to a string if you're later using it as a number.
Don't use meaningless variable names.
my $letter_pos = 0;
#This block of code sets $regexstring:
if ($solvedletters1 eq "") {
while ($foo1 < $numberofletters) {
#bad code in here
$foo2 = substr($unsolvedword, $foo1, 1);
if ($foo2 eq "-"){
$regexstring .= "."; #<-- this here.
} else {
$regexstring .= $foo2;
}
$foo1++;
}
}
This entire block can (and should) be replaced with:
(my $regexstring = $unsolvedword) =~ tr/-/./;
else {
$solvedletters2 = "[^$solvedletters1]";
while ($foo1 < $numberofletters) {
$foo2 = substr($unsolvedword, $foo1, 1);
if ($foo2 eq "-"){
$regexstring .= $solvedletters2;
} else {
$regexstring .= $foo2;
}
$foo1++;
}
}
This entire block can (and should) be replaced with:
(my $regexstring = $unsolvedword) =~ s/-/[^$solvedletters1]/g;
#For debugging
print "\n<debug>\nRegex String: $regexstring\n</debug>\n";
print "\n Here are the results from your query:\n\n";
#Open wordlist file.
open(DATA, "< $dict")
or die "Couldn't open $dict for reading: $!\n";
DATA is a special file handle. Don't use it as a normal file handle.
Use the three argument form of open unless you have a really good
reason to use the two argument form.
Use lexical filehandles instead of global barewords
open my $dict_fh, '<', $dict or die "Couldn't open '$dict': $!\n";
$regex = qr{^$regexstring$};
my %freq; #keys = characters, values = frequency count.
while (<DATA>) {
chomp;
if (/$regex/) {
#Move matched words into a list for later use.
push @guesswords, "\n$_";
Don't intermingle output formatting with the data. Format your output
when you output.
push @guesswords, $_;
#Delete letters we already solved for.
$_ =~ s/[$solvedletters1]//g;
Now here is somewhere that $_ is used implicitly. So use it
implicitly.
s/[$solvedletters1]//g;
#Move letters into array for sorting.
my @chars = split("", $_);
$_ = ""; @chars = sort (@chars);
#Move (sorted) letters back into a string.
foreach (@chars) {
$foo3 .= $_;
}
$_ = $foo3; $foo3 = "";
ugh ugh ugh.
$_ = join(q{}, sort split(//, $_));
#Remove duplicate letters, so they don't get counted.
$_ =~ tr///cs;
#print "$_\n"; #For debugging.
#Move processed letters into a hash for counting.
$freq{$_}++ for split(//, lc $_);
}
}
#Matched words
$_ = scalar @guesswords;
Again. Stop using $_ for everything. It's a "default" variable. Let
it be the default, and choose real variable names when you have to
create one.
my $num_of_words = @guesswords;
print " Guesswords in list: $_\n";
if (@guesswords <= "20") {
Stop comparing strings-containing-numbers as numbers. Just compare
numbers.
if (@gueswords <= 20){
print "@guesswords\n\n";
Here's where you're outputting, so here's where you format the output:
print join("\n", @guesswords), "\n\n";
}
#Print probabilities list, sort by hash value in ascending order
foreach $_ (sort {$freq{$a} <=> $freq{$b}} (keys(%freq))) {
print " $_\t=>\t$freq{$_}\n";
}
#For debugging
print "\n<debug>\nRegex: $regex\n</debug>\n";
$regex = ""; $regexstring = ""; ($foo1, $foo2, $foo3) = ""; undef
%freq; undef @guesswords;
#Round two
print "\n Continue solving word? hit n to bail out, otherwise just hit
enter: "; $_ = <STDIN>; chomp($_);
chomp ($continue = <STDIN>);
if ($_ eq "n") {
print "\n Goodbye.\n";
exit;
}
print " What letter did you try? "; $solvedletters1 .= <STDIN>;
chomp($solvedletters1);
Gah. If you're going to put two statements on one line (which you
shouldn't), at least put the two that are associated with each other:
print "What letter did you try? ";
chomp ($solvedletters .= <STDIN>);
print " What's the word now?, enter dashes for unknown letters: ";
$unsolvedword = <STDIN>; chomp($unsolvedword);
goto WORKLOOP1 #I'm new at programming so don't kill me!
} until (lc $continue eq 'n');
Make all these corrections, state what your *actual* problem is, give
examples of input and output, and hopefully someone can help you figure
out what you're doing wrong.
.
- Follow-Ups:
- Re: simple wildcard regex problem.
- From: Paul Lalli
- Re: simple wildcard regex problem.
- Prev by Date: Re: [OT] I give up with the reply-to business already
- Next by Date: Re: simple wildcard regex problem.
- Previous by thread: Re: splitting on every other delimiter @ 1146493312
- Next by thread: Re: simple wildcard regex problem.
- Index(es):
Relevant Pages
|