Re: Perl script to clean up file -- Dont know if it can be done

From: Craig Ciquera (cciquera_at_mathworks.com)
Date: 09/22/04


Date: Wed, 22 Sep 2004 10:32:17 -0400

Would this work:

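# Read in the datafile (assume name is datafile.txt)
open(my $datafile, '<', 'datafile.txt') or die "Cannot open datafile.txt: $!";

while (defined (my $line = <$datafile>) )
{
      # Skip everything but the data rows, which are the only lines
      # containing a CUTOEN value such as "00 0 12 02"
      next unless $line =~ /\d{2} \d \d{2} \d{2}/;

      # Remove the newline and any trailing whitespace, so a padded
      # line does not end up with a trailing comma
      $line =~ s/\s+$//;

      # Remove any leading 0's and/or whitespace
      $line =~ s/^0*\s*//;

      # Turn each run of two or more whitespace characters into a comma
      $line =~ s/\s{2,}/,/g;

      print "$line\n";
}

close($datafile);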
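
One difference from the script in your message: replacing the whitespace runs directly skips the splice step that pads a two-column row with an empty field. If that matters, here is a minimal sketch that keeps your split/splice logic and only adds the filter. The helper name clean_line is made up for illustration, the sample lines are copied from your report, and it assumes every data row (and nothing else) contains a value shaped like "00 0 12 02".

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the original thread): returns the
# comma-separated record for a data row, or undef for any other line.
sub clean_line {
    my ($line) = @_;

    # Assumption: only data rows contain a value such as "00 0 12 02"
    return unless $line =~ /\d{2} \d \d{2} \d{2}/;

    $line =~ s/\s+$//;      # drop the newline and trailing whitespace
    $line =~ s/^0*\s*//;    # drop the leading 0 and the indentation

    # Split on runs of two or more whitespace characters
    my @cols = split /\s{2,}/, $line, -1;

    # Same fix-up as the quoted script: pad a two-column row
    @cols == 2 and splice @cols, 1, 0, "";

    return join ',', @cols;
}

# Sample lines copied from the report in the quoted message
my @sample = (
    "0 REPORT TIME: 12:46",
    "0      1555200                         00 0 12 02              CUSTOMER HAS",
    "       4555208                         00 0 03 06              TELN NOT BILL",
    "- REJECTED = 000000145 CUTOVER = 000000213",
);

for my $line (@sample) {
    my $record = clean_line($line);
    print "$record\n" if defined $record;
}
```

With the four sample lines above this prints only the two data rows, "1555200,00 0 12 02,CUSTOMER HAS" and "4555208,00 0 03 06,TELN NOT BILL". To run it on the real report, replace the @sample loop with a while (<>) loop over the file.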

"LHradowy" <laura.hradowy@NOSPAM.mts.caaaaa> wrote in message
news:WWe4d.2141$IO.13667@news1.mts.net...
> I thought I would throw this out there. I think it cannot be done, but I am
> not a guru.
>
> This is the problem: I get a file that I must pull the pertinent data out of.
> It has a header and footer, as well as page breaks, all in ASCII
> format. I need to pull out just the columns.
> I do this all manually (delete the header and footer, as well as all the
> page breaks). There is also at times a 0 at the beginning of a record that I
> do not want there.
>
> This is what the file looks like...
>
>
> REPORT NAME: FACICL
> 0 CLIENT JOB NUMBER: 23405
> 0 CLIENT NAME: LAURA XXXXXXXX
> 0 CLIENT MAILING CODE: D509H
> 0 REPORT DATE: 04/07/08
> 0 REPORT TIME: 12:46
> 1REPORT NO: FACRPT14 SOME INFO HERE
> RUN DATE: 04JUL10
> 0 JOB NAME: FACCTICL CUTOVER ACCARE INTERFACE
> REJECT REPORT PAGE NO: 2
> 0 PROGRAM : FACB5500 CUTOVER: LEAF CUTOVER
> DATE: 04JUL09
> 0 TELN CUTTELN CUTOEN REJECT REASON
> ---- ------- -------------- -------------
> -----------------
> 0      1555200                         00 0 12 02              CUSTOMER HAS
>        2555206                         00 0 05 01              CUSTOMER HAS
>        4555208                         00 0 03 06              TELN NOT BILL
> 1REPORT NO:  FACRPT14                                 SOME DATA HERE
> RUN DATE: 04JUL10
> 0 JOB NAME:  FACCTICL                          CUTOVER ACCARE INTERFACE
> REJECT REPORT                                PAGE NO:       3
> 0 PROGRAM :  FACB5500                        CUTOVER:  LEAF        CUTOVER
> DATE: 04JUL09
> 0      TELN           CUTTELN          CUTOEN                  REJECT REASON
>
>        ----           -------          --------------          -------------
> -----------------
> 0      1555200                         00 0 12 02              CUSTOMER HAS
>        2555206                         00 0 05 01              CUSTOMER HAS
>        4555208                         00 0 03 06              TELN NOT BILL
> - REJECTED = 000000145 CUTOVER = 000000213
> -                                        *** SUCCESSFUL COMPLETION OF
> FACCTICL ***
>
>
> I manually dissect this file to make it look like this...
>        1555002                         00 0 04 27              TELN NOT BILL
>        3555007                         00 0 06 00              CUSTOMER HAS
>        5555410                         00 0 12 10              CUSTOMER HAS
>        6755012                         00 0 12 06              CUSTOMER HAS
>
> I have manually removed the header, footer and page breaks. There also
> always seems to be a 0 at the start of the first record, which I remove as
> well.
> I then run this Perl script:
>
> while (<>) {
> chomp; # Remove the trailing newline
> s,^\s+,,; # Remove leading spaces
> my @cols=split m/\s{2,}/, $_, -1; # Split on two (or more) whitespace characters
> @cols == 2 and splice @cols, 1, 0, ""; # Pad a two-column row with an empty middle field
> print join (',',@cols)."\n";
> }
>
> And I get this: WHAT I NEED!
> 5555002,00 0 04 27,TELN NOT BILL
> 1555007,00 0 06 00,CUSTOMER HAS
> 2555010,00 0 12 10,CUSTOMER HAS
>
> I want to try to eliminate as much manual intervention as I can.
>
>

