Regexp issue . . .
From: MichaelC (mickyc_at_NOshaSPAMw.ca)
Date: 11/25/03
- Previous message: Christian Eriksson: "Problem using LD_LIBRARY_PATH in perl script"
- Next in thread: Eric J. Roode: "Re: Regexp issue . . ."
- Reply: Eric J. Roode: "Re: Regexp issue . . ."
Date: Tue, 25 Nov 2003 07:03:37 GMT
Hi all. I am having a particularly difficult time with a perl script that I
am writing. The problem area is a place where I need to strip some newlines
out of a file.
My source data is text which is in paragraph form, but has line breaks
within the paragraphs. I need to automate as much of the processing as
possible in order to minimise the number of manual changes that I have to make.
Sample text is as follows:
"This document is intended to give you an
overview of DG as well as highlight some of
the features. This is a brought to your handheld using DG."
With DG you can view and edit word processing and spreadsheet files on
your handheld. Simple push-button synchronization of
the handheld with the desktop will maintain the most up-to-date
version of a file on both the desktop and handheld.
I want these to be parsed as follows:
"This document is intended to give you an overview of DG as well as
highlight some of the features. This is a brought to your handheld using
DG." With DG you can view and edit word processing and spreadsheet files on
your handheld. Simple push-button synchronization of the handheld with the
desktop will maintain the most up-to-date version of a file on both the
desktop and handheld.
--
One way that I thought might work is to catch all lines that begin with an
upper-case letter, prepend a line break to them, strip the trailing break, then trap
all lines that start lower case and dump them as-is. Repeat this until no
matches are made on the lower case test, then clean up all those extra line
breaks.
I came up with this . . . but all it seems to do is strip all newlines out.
while( <infl> ) {
    my $x = $_;
    if ( $x =~ ?^[^a-z]? ) { $x =~ s!(.*)\n!\n\1 ! }
    else { $x =~ s!(.*)\n!\1 ! }
    print outfl $x;
}
Any help would be greatly appreciated.
Michael
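For reference, here is a minimal Perl sketch of the heuristic described above: a line that starts with a lower-case letter is treated as a continuation of the previous paragraph, and any other line starts a new one. The sub name join_wrapped_lines and the in-line sample are assumptions for illustration, not part of the original script. Two details in the posted loop may be worth checking: ?PATTERN? is Perl's match-once operator, so that test can succeed only once until reset() is called, and the replacement side of s/// is normally written with $1 rather than \1.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): rebuild paragraphs from
# hard-wrapped lines. Assumption: a line whose first character is a
# lower-case letter continues the previous paragraph; anything else
# (upper case, quote mark, digit) starts a new paragraph.
sub join_wrapped_lines {
    my @lines = @_;
    my @paragraphs;
    for my $line (@lines) {
        $line =~ s/^\s+|\s+$//g;    # trim, which also drops the newline
        next if $line eq '';        # skip blank lines
        if ( @paragraphs && $line =~ /^[a-z]/ ) {
            $paragraphs[-1] .= " $line";    # continuation line
        }
        else {
            push @paragraphs, $line;        # start of a new paragraph
        }
    }
    return @paragraphs;
}

# Sample text taken from the post above.
my $sample = <<'END';
"This document is intended to give you an
overview of DG as well as highlight some of
the features. This is a brought to your handheld using DG."
With DG you can view and edit word processing and spreadsheet files on
your handheld. Simple push-button synchronization of
the handheld with the desktop will maintain the most up-to-date
version of a file on both the desktop and handheld.
END

# Print one rebuilt paragraph per line.
print "$_\n" for join_wrapped_lines( split /\n/, $sample );
```

The heuristic is only as good as the data: a wrapped line that happens to begin with a capitalised word would be split off as a new paragraph, so some manual clean-up may still be needed.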