Re: Need regexp to rejoin URL links broken by \n



On 22 Jun 2005 02:15:15 -0700,
Tony <hawkmoon1972@xxxxxxxxxxx> wrote:
>
> Can someone help me with a regular expression that removes \n's from
> the middle of URL's?

[snip]

> I'd like to pass this through a regular expression that removes all the
> \n's between http:\\ and the next dot followed by a space (that is:
> '. ')

While Tad's solution gives you that, it won't solve your problem. In
text like the example you showed, a broken URL can end with a full stop
that is followed by a newline rather than a space:

Here is another
URL: http://test.
com/hello?test=op
tion1&test2=optio
n2&test3=option3.
Thanks for reading.

Looking for ( |\n) after a full stop won't work either: the first full
stop in that URL (the one after "test") is followed by a newline, so it
would be taken as the end of the URL. I can't think of a regex that
would work in the general case. To get closer, you'd probably have to
build something that also checks that the rejoined URL is valid.
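
To make the failure concrete, here is a rough sketch in Python (not the
language of this thread, and not Tad's code). NAIVE is my own reading of
the "full stop followed by space or newline" rule. rejoin_heuristic is a
hypothetical alternative that treats any following line without
whitespace as a continuation of the URL. Both names and both rules are
assumptions for illustration only, and the heuristic is easy to fool,
for example when a one-word line of prose follows the URL.

```python
import re

TEXT = (
    "Here is another\n"
    "URL: http://test.\n"
    "com/hello?test=op\n"
    "tion1&test2=optio\n"
    "n2&test3=option3.\n"
    "Thanks for reading.\n"
)

# Naive rule (assumed): a URL runs from "http://" to the first full stop
# that is followed by a space or a newline.
NAIVE = re.compile(r"http://.*?\.(?=[ \n])", re.DOTALL)


def rejoin_naive(text):
    """Remove newlines inside each span matched by the naive rule."""
    return NAIVE.sub(lambda m: m.group(0).replace("\n", ""), text)


def rejoin_heuristic(text):
    """Hypothetical rule: after a line ending in a URL fragment, treat
    every following non-empty line without whitespace as a continuation."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if re.search(r"http://\S*$", line):
            while (i + 1 < len(lines)
                   and lines[i + 1]
                   and not re.search(r"\s", lines[i + 1])):
                i += 1
                line += lines[i]
        out.append(line)
        i += 1
    return "\n".join(out)


if __name__ == "__main__":
    print(NAIVE.findall(TEXT))      # the naive rule stops at "http://test."
    print(rejoin_heuristic(TEXT))
```

The naive pattern matches only "http://test." and so leaves the text
unchanged. The heuristic rejoins this particular sample, but it still
keeps the sentence-ending full stop on the URL and does nothing to check
that the result is a valid URL.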

Martien
--
                    |
Martien Verbruggen  | Computers in the future may weigh no more
                    | than 1.5 tons. -- Popular Mechanics, 1949
                    |