Re: transforming german characters
From: John W. Krahn (someone_at_example.com)
Date: 08/06/04
Date: Fri, 06 Aug 2004 19:20:13 GMT
steve_f wrote:
> I want to transform special German characters to obtain the following
> variations:
>
> groß bräu
> gross bräu
> gross braeu
>
> there are two sets -
>
> set one:
> ß = ss = \xDF
>
> set two:
> Ä = Ae = \xC4
> Ö = Oe = \xD6
> Ü = Ue = \xDC
> ä = ae = \xE4
> ö = oe = \xF6
> ü = ue = \xFC
>
> basically, the rules are: transform ß independently; the characters
> in set two are either all transformed or all left alone together.
>
> I wrote the following, which works well, but looks
> pretty bad I think.
It doesn't look too bad, I've seen worse. :-)
> so again this is a style question...
> can anyone suggest a cleaner approach? TIA
The usual idiom is to use a hash for the search and replace tables.
> sub transform_characters {
>     my @input = @_;
>     my @output;
>     for my $string (@input) {
>         push @output, $string;
>         if ($string =~ /\xDF/) {
>             $string =~ s/\xDF/ss/g;
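Using a match followed by the same substitution is a common beginner
mistake. You only need the substitution: s///g returns the number of
replacements it made (false if there were none), so it can serve as the
test itself.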
if ( $string =~ s/\xDF/ss/g ) {
>             push @output, $string;
>             if (test_for_character($string)) {
>                 $string = swap_all($string);
>                 push @output, $string;
>             }
>             next;
>         }
>         if (test_for_character($string)) {
>             $string = swap_all($string);
>             push @output, $string;
>         }
>     }
>     return @output;
> }
>
> [snip code]
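To illustrate the point about dropping the separate match, here is a minimal sketch of what s///g hands back. The sample string and the variable names are made up for the demonstration.

```perl
use strict;
use warnings;

# Sample string containing two sharp-s characters (assumed Latin-1 code points)
my $string = "gro\xDF und Stra\xDFe";

# s///g returns the number of replacements it performed
my $count = ( $string =~ s/\xDF/ss/g );

# Nothing is left to replace, so a second pass returns a false value
my $again = ( $string =~ s/\xDF/ss/g );

print "replaced $count, string is now '$string'\n";
print "second pass changed nothing\n" unless $again;
```

Because the return value is true only when something changed, the substitution can go straight into the if condition.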
Using a hash you could write that as:
my %set1 = (
    "\xDF" => 'ss',
    );
# Use a character class because all keys are single characters
# If keys are multiple characters use alternation instead
my $key1 = '[' . join( '', keys %set1 ) . ']';

my %set2 = (
    "\xC4" => 'Ae',
    "\xD6" => 'Oe',
    "\xDC" => 'Ue',
    "\xE4" => 'ae',
    "\xF6" => 'oe',
    "\xFC" => 'ue',
    );
my $key2 = '[' . join( '', keys %set2 ) . ']';

sub transform_characters {
    my @input = @_;
    my @output;
    for my $string ( @input ) {
        push @output, $string;
        if ( $string =~ s/($key1)/$set1{$1}/og ) {
            push @output, $string;
            if ( $string =~ s/($key2)/$set2{$1}/og ) {
                push @output, $string;
                }
            next;
            }
        if ( $string =~ s/($key2)/$set2{$1}/og ) {
            push @output, $string;
            }
        }
    return @output;
    }
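
For anyone who wants to run this, here is a self-contained sketch of the same hash-based approach with a few assumed changes that are not in the code above: the patterns are precompiled with qr// rather than relying on /o, they are built by alternation with quotemeta (so the tables could also hold multi-character keys), and the two branches are collapsed, since the set two step runs whether or not the ß step matched. The helper name build_pattern is hypothetical.

```perl
use strict;
use warnings;

my %set1 = ( "\xDF" => 'ss' );
my %set2 = (
    "\xC4" => 'Ae', "\xD6" => 'Oe', "\xDC" => 'Ue',
    "\xE4" => 'ae', "\xF6" => 'oe', "\xFC" => 'ue',
);

# build_pattern is a hypothetical helper, not part of the original post.
# Alternation plus quotemeta also copes with keys longer than one
# character; longest keys go first so a short key cannot shadow a long one.
sub build_pattern {
    my %table = @_;
    my $alt = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } keys %table;
    return qr/($alt)/;    # compiled once, so /o is not needed
}

my $re1 = build_pattern(%set1);
my $re2 = build_pattern(%set2);

sub transform_characters {
    my @input = @_;       # copy, so the caller's arguments are not modified
    my @output;
    for my $string (@input) {
        push @output, $string;                                   # original
        push @output, $string if $string =~ s/$re1/$set1{$1}/g;  # ß replaced
        push @output, $string if $string =~ s/$re2/$set2{$1}/g;  # umlauts replaced
    }
    return @output;
}

# The example from the question, written with Latin-1 escapes
my @variants = transform_characters("gro\xDF br\xE4u");
```

With that input, @variants should hold the three forms from the question in order: the original, the form with ß replaced, and the form with both ß and the umlaut replaced. A string with none of the special characters comes back once, unchanged.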
John
-- use Perl; program fulfillment