Re: best method to perform operations on word lists
- From: "Bart Van der Donck" <bart@xxxxxxxxxx>
- Date: 12 Jun 2006 01:27:14 -0700
Francois Massion wrote:
[...]
#!perl
use strict; use warnings;
my $list =
"überzeugt
überzeugt,
überzogen
überzogen,
überzogen.
üblich
übliche
üblichen
üblicherweise";
my @terms = split /\n/, $list;
my $prev = 'nonesuch584685542256RANOM58544';
This didn't modify the list.
I didn't mean to modify $list; the new content is in @terms. If you
want $list to contain the new words, you can use something like this at
the end of the program.
$list = join "\n", @terms;
Maybe the reason is the $prev definition.
$prev has no special significance here. The only requirement is that
its initial value does not occur in @terms, because it is used to
delete duplicate entries from @terms.
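The de-duplication step itself is not shown in the code quoted above. A minimal sketch of one way a sentinel like $prev can be used for it, assuming the endings have already been stripped and that sorting the list is acceptable (the sample words and the grep construct are illustrative, not necessarily the original code):

```perl
#!perl
use strict; use warnings;

# Illustrative sample data: terms after the endings were stripped.
my @terms = qw(überzeugt überzeugt überzogen überzogen überzogen üblich üblich);

# Sentinel: any value that cannot occur in @terms.
my $prev = 'nonesuch584685542256RANOM58544';

# Sort so that duplicates become adjacent, then keep an entry only
# if it differs from the entry seen just before it.
@terms = grep { my $keep = $_ ne $prev; $prev = $_; $keep } sort @terms;

print "$_\n" for @terms;
```

The sentinel only matters for the very first comparison; after that $prev always holds a real entry from the list.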
s/(\.|,|e|en|e,|en,|e\.|en\.)$// for @terms;
I also tried Dr. Ruud's regex but it would have to be rewritten for
each language.
That is correct, hence my thoughts about language files. My code is a
very crude algorithm: it only strips the following endings from each
line:
. , e en e, en, e. en.
If you plan to use this for other languages, you would need to adapt
those patterns for each one.
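One way the language-file idea could look in practice is to keep the endings in a table per language and build the regex from it. This is only a sketch: the %suffixes hash, the suffix_regex helper and the 'en' entry are hypothetical names and data invented for illustration, and a real setup might read the lists from separate files instead.

```perl
#!perl
use strict; use warnings;

# Hypothetical per-language suffix tables. The 'de' entry mirrors the
# endings listed above; the 'en' entry is made up for illustration.
my %suffixes = (
    de => [ '.', ',', 'e', 'en', 'e,', 'en,', 'e.', 'en.' ],
    en => [ '.', ',', 's', 's,', 's.' ],
);

# Build one end-anchored regex from a language's suffix list.
sub suffix_regex {
    my ($lang) = @_;
    my $list = $suffixes{$lang} or die "no suffix table for '$lang'";
    # quotemeta makes '.' match a literal dot instead of any character.
    my $alt = join '|', map { quotemeta($_) } @$list;
    return qr/(?:$alt)$/;
}

my @terms = ('üblichen,', 'übliche', 'üblich.');
my $re = suffix_regex('de');
s/$re// for @terms;

print "$_\n" for @terms;
```

With this layout, supporting another language means adding one more entry to the table rather than rewriting the substitution.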
--
Bart