Re: transforming german characters
From: steve_f (me_at_example.com)
Date: 08/07/04
- Previous message: Charles DeRykus: "Re: partially matching a regexp"
- In reply to: Gunnar Hjalmarsson: "Re: transforming german characters"
- Next in thread: Joe Smith: "Re: transforming german characters"
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Date: Fri, 06 Aug 2004 20:13:29 -0400
Thanks Gunnar, some great stuff here. I can use simple
statements to just brute-force things, but I know there is
a more elegant way.
On Fri, 06 Aug 2004 21:35:18 +0200, Gunnar Hjalmarsson <noreply@gunnar.cc> wrote:
>steve_f wrote:
>> I want to transform special German characters to obtain the
>> following variations:
>>
>> groß bräu
>> gross bräu
>> gross braeu
>>
>> there are two sets -
>>
>> set one:
>> ß = ss = \xDF
>>
>> set two:
>> Ä = Ae = \xC4
>> Ö = Oe = \xD6
>> Ü = Ue = \xDC
>> ä = ae = \xE4
>> ö = oe = \xF6
>> ü = ue = \xFC
>>
>> basically, the rules are transform ß independently
>> and with set two, they are either all on or off together.
>
>As John said, there is no reason to look for the characters with
>separate regexes, and accordingly there is no reason to distinguish
>between two sets.
The ß can either be on or off independent of the others so
you can get:
groß bräu
gross bräu
gross braeu
I should have stated the problem more directly ("on" means the
special character is kept, "off" means it is replaced):
if set one is present - set one on & set two on
                      - set one off & set two on
                      - set one off & set two off
if only set two       - set two all on
                      - set two all off
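A minimal sketch of those on/off rules, assuming the strings hold Latin-1 code points (so \xDF is ß and \xE4 is ä). The name spelling_variants is made up for illustration and does not come from the thread:

```perl
use strict;
use warnings;

# Set two: every umlaut and its two-letter replacement.
my %umlaut = (
    "\xC4" => 'Ae', "\xD6" => 'Oe', "\xDC" => 'Ue',
    "\xE4" => 'ae', "\xF6" => 'oe', "\xFC" => 'ue',
);

# Hypothetical helper: returns the distinct spellings of one string.
sub spelling_variants {
    my ($string) = @_;
    my @variants = ($string);    # set one on, set two on

    # Set one off: replace the eszett, leave the umlauts alone.
    (my $no_eszett = $string) =~ s/\xDF/ss/g;
    push @variants, $no_eszett if $no_eszett ne $string;

    # Set two off: replace all umlauts together, on top of the step above.
    (my $plain = $no_eszett) =~ s/([\xC4\xD6\xDC\xE4\xF6\xFC])/$umlaut{$1}/g;
    push @variants, $plain if $plain ne $no_eszett;

    return @variants;
}

my @output = map { spelling_variants($_) } "gro\xDF br\xE4u";
# @output holds the original, "gross br\xE4u" and "gross braeu"
```

A string with umlauts but no ß yields two spellings, and a string with neither yields only itself, so no duplicates are pushed.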
>> for my $string (@input) {
>>     push @output, $string;
>
>Here you copy the whole original text to @output ...
>
>>     if ($string =~ /\xDF/) {
>>         $string =~ s/\xDF/ss/g;
>>         push @output, $string;
>
>... and here you *add* the converted string. In the suggestion below,
>I'm assuming that was a mistake.
>
>   sub transform_characters {
>       my @text = @_;
>
>       my %replace = (
>           "\xDF" => 'ss',
>           "\xC4" => 'Ae',
>           "\xD6" => 'Oe',
>           "\xDC" => 'Ue',
>           "\xE4" => 'ae',
>           "\xF6" => 'oe',
>           "\xFC" => 'ue',
>       );
>
I really like the idea of the hash. Yes, I have heard you are not
thinking in Perl if you are not using hashes.
>       for (@text) {
>           s/(\xDF|\xC4|\xD6|\xDC|\xE4|\xF6|\xFC)/$replace{$1}/g;
>       }
This is super! Thanks.
>
>       @text
>   }
>
>   my @output = transform_characters(@input);
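For reference, a self-contained variation on the sub quoted above. This is only a sketch: it assumes Latin-1 input, and building the character class from the hash keys is a change of mine rather than Gunnar's code, made so the characters are listed in one place only. The sample @input is invented.

```perl
use strict;
use warnings;

my %replace = (
    "\xDF" => 'ss',
    "\xC4" => 'Ae', "\xD6" => 'Oe', "\xDC" => 'Ue',
    "\xE4" => 'ae', "\xF6" => 'oe', "\xFC" => 'ue',
);

# Build one character class from the hash keys and compile it once.
my $class = join '', map { quotemeta($_) } keys %replace;
my $match = qr/([$class])/;

sub transform_characters {
    my @text = @_;    # work on a copy so the caller's data is untouched
    s/$match/$replace{$1}/g for @text;
    return @text;
}

my @input  = ("gro\xDF br\xE4u", "\xDCbung");    # made-up sample data
my @output = transform_characters(@input);
# @output is ("gross braeu", "Uebung"); @input is unchanged
```

Adding another character then only needs a new entry in %replace.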