Re: Help with pattern matching

From: R. Joseph Newton (rjnewton_at_efn.org)
Date: 04/04/04


Date: Sun, 04 Apr 2004 13:52:48 -0700
To: A Lukaszewski <al@onetel.net.uk>

A Lukaszewski wrote:

> Greetings all,
>
> I have a comma-delimited lexical file with four fields: line number, the
> first part of a word, the second part of a word, and the word combined.
> The first and fourth fields are only for reference. The program I am
> developing is very simple. If field two and field three both have
> accents in them, then print the line to an output file.

That is nice, but where is the sample data? It is almost impossible to debug
input-processing code if you cannot see the material being processed. Some
errors are glaring syntax errors, but many involve a disconnect between the code
and the material being processed.

>
>
> The heavily-commented program is below. Thus far, all I get is an exact
> replica of the input file. In addition to a plain binding operator of
> '=~ //', I have also tried explicit matching (m//) and regex (qr//).
>
> #!/usr/bin/perl
>
> #############################################################
> #############################################################
> # A PROGRAM TO READ THE SUB-WORD HEADERS OF A #
> # COMMA-DELIMITED FILE #
> # AND DETERMINE WHICH LINES HAVE MULTIPLE ACCENTS #
> #############################################################
> #############################################################
>
> use strict;
>
> ###################################
> # OPEN THE INPUT AND OUTPUT FILES #
> ###################################
>
> my ($file, $outfile);
>
> $file = 'y.csv' ;
> # Name the input file
> $outfile = 'y.res';
> # Name the output file
> open(INFO, "$file" ) or die "Cannot open $file:$!\n";
> # Open the input file or report failure
> open(OUT, ">>$outfile") or die "Cannot open file y.res!\n";
> # Open the output file
>
> ########################################
> # INITIALIZATION OF SCALARS AND ARRAYS #
> ########################################
>
> my $line; # = scalar by which program steps through data
> my $fieldEval1; # = holding scalar for evaluating whether the
> # first half of the word has an accent in it
> my $fieldEval2; # = holding scalar for evaluating whether the
> # second half of the word has an accent in it
> my @field; # = holding array for the split line

Two major differences in convention between Perl and VB:

In Perl, CamelCase is generally reserved for package [aka class] names. Variables
should take $choo_choo_train style. Following this convention will make your code
much more understandable to other Perl programmers.
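As a minimal sketch of that naming convention (the package WordEntry and its fields are hypothetical, invented only for illustration):

```perl
use strict;
use warnings;

# Mixed case marks a package (class) name. WordEntry is a made-up example.
package WordEntry;

sub new {
    my ($class, %args) = @_;
    return bless { %args }, $class;
}

sub first_part { return $_[0]{first_part} }

package main;

# Lowercase with underscores marks an ordinary variable.
my $word_entry = WordEntry->new( first_part => 'abc' );
print $word_entry->first_part, "\n";
```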

In Perl, there is no need to lump all your declarations at the top of a scope,
where their meaning must be expressed in comments. Put the life and meaning in
the code itself, by declaring meaningfully-named variables at the point closest
to their initial use. This makes the code much more understandable.

open WORD_PARTS, '<', 'some_better_name_than_just_y.csv' or
   die "Could not open parsed word file for input: $!";
open DOUBLE_ACCENTED_OUT, '>', 'double_accented_words.res' or
   die "Could not open results file for output: $!";

while (my $parsed_word = <WORD_PARTS>) {
   chomp $parsed_word;
   my ($line_id, $first_part, $second_part, $whole_word) =
      split /,\s*/, $parsed_word;
   if (both_are_accented( $first_part, $second_part )) {
      print DOUBLE_ACCENTED_OUT "$parsed_word\n";
   }
}

You will note that there are no comments in the code above. Do you have any
difficulty understanding what it does?
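The loop calls both_are_accented() without defining it. What follows is only a sketch under assumptions, since I have not seen the data: it assumes the accents are Unicode diacritics and that the input has already been decoded into Perl characters (for example by opening the file with an encoding layer). The helper has_accent() is a hypothetical name. If your file marks accents some other way, the test inside has_accent() has to change to match.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFD);

# Hypothetical helper: true if the text contains at least one combining
# mark after canonical decomposition (an accented letter decomposes into
# a base letter followed by a nonspacing mark).
sub has_accent {
    my ($text) = @_;
    return 0 unless defined $text;
    return NFD($text) =~ /\p{Mn}/ ? 1 : 0;
}

# True only when both word parts carry an accent.
sub both_are_accented {
    my ($first_part, $second_part) = @_;
    return has_accent($first_part) && has_accent($second_part);
}
```

Keeping the per-field test in its own sub means the definition of "accented" lives in one place, which is where the sample data will tell you whether it is right.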

Joseph


