Re: transforming german characters

From: John W. Krahn (someone_at_example.com)
Date: 08/06/04


Date: Fri, 06 Aug 2004 19:20:13 GMT

steve_f wrote:
> I want to transform special German characters to obtain the following
> variations:
>
> groß bräu
> gross bräu
> gross braeu
>
> there are two sets -
>
> set one:
> ß = ss = \xDF
>
> set two:
> Ä = Ae = \xC4
> Ö = Oe = \xD6
> Ü = Ue = \xDC
> ä = ae = \xE4
> ö = oe = \xF6
> ü = ue = \xFC
>
> basically, the rules are: transform ß independently, and treat set two
> as a unit, so its characters are either all transformed or all left alone.
>
> I wrote the following, which works well, but I think it looks
> pretty bad.

It doesn't look too bad; I've seen worse. :-)

> so again this is a style question...
> can anyone suggest a cleaner approach? TIA

The usual idiom is to use a hash for the search and replace tables.

> sub transform_characters {
> my @input = @_;
> my @output;
> for my $string (@input) {
> push @output, $string;
> if ($string =~ /\xDF/) {
> $string =~ s/\xDF/ss/g;

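Using a match followed by a substitution is a common beginner mistake.
You only need the substitution: s///g returns the number of replacements
it made, which is false when nothing matched, so it can serve as the test.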

          if ( $string =~ s/\xDF/ss/g ) {
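
To illustrate the point about the return value, here is a minimal standalone sketch. The variable names ($count, $plain, $changed) are made up for the example and are not from the original code.

```perl
use strict;
use warnings;

my $string = "gro\xDF br\xE4u";

# s///g returns the number of substitutions it performed, or a false
# value when the pattern did not match, so no separate m// test is needed.
my $count = ( $string =~ s/\xDF/ss/g );
print "replaced $count occurrence(s)\n" if $count;

# When nothing matches, the string is left untouched and the result is false.
my $plain   = 'gross';
my $changed = ( $plain =~ s/\xDF/ss/g );
print "nothing to replace\n" unless $changed;
```
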

> push @output, $string;
> if (test_for_character($string)) {
> $string = swap_all($string);
> push @output, $string;
> }
> next;
> }
> if (test_for_character($string)) {
> $string = swap_all($string);
> push @output, $string;
> }
> }
> return @output;
> }
>
> [snip code]

Using a hash you could write that as:

my %set1 = (
     "\xDF" => 'ss',
     );
# Use a character class because all keys are single characters.
# If any key is longer than one character, use alternation instead.
my $key1 = '[' . join( '', keys %set1 ) . ']';

my %set2 = (
     "\xC4" => 'Ae',
     "\xD6" => 'Oe',
     "\xDC" => 'Ue',
     "\xE4" => 'ae',
     "\xF6" => 'oe',
     "\xFC" => 'ue',
     );
my $key2 = '[' . join( '', keys %set2 ) . ']';

sub transform_characters {
     my @input = @_;
     my @output;
     for my $string ( @input ) {
         push @output, $string;
         if ( $string =~ s/($key1)/$set1{$1}/og ) {
             push @output, $string;
             if ( $string =~ s/($key2)/$set2{$1}/og ) {
                 push @output, $string;
                 }
             next;
             }
         if ( $string =~ s/($key2)/$set2{$1}/og ) {
             push @output, $string;
             }
         }
     return @output;
     }
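
As a possible variation on the code above (a sketch, not a drop-in replacement), the two branches can be collapsed, since the set-two substitution runs the same way whether or not set one matched. This version also builds the patterns with alternation and quotemeta, as the comment about multi-character keys suggests, and precompiles them with qr// rather than relying on /o. The helper name build_pattern and the variables $re1 and $re2 are hypothetical.

```perl
use strict;
use warnings;

my %set1 = ( "\xDF" => 'ss' );
my %set2 = (
    "\xC4" => 'Ae', "\xD6" => 'Oe', "\xDC" => 'Ue',
    "\xE4" => 'ae', "\xF6" => 'oe', "\xFC" => 'ue',
);

# Hypothetical helper: turn the keys of a lookup table into one
# precompiled, capturing alternation. quotemeta protects any regex
# metacharacters, and longer keys are tried first so that a short key
# cannot shadow a longer one that starts with the same characters.
sub build_pattern {
    my ($table) = @_;
    my $alternation = join '|',
        map  { quotemeta $_ }
        sort { length($b) <=> length($a) or $a cmp $b }
        keys %$table;
    return qr/($alternation)/;
}

my $re1 = build_pattern( \%set1 );
my $re2 = build_pattern( \%set2 );

sub transform_characters {
    my @output;
    for my $original (@_) {
        my $string = $original;    # work on a copy, leave the caller's data alone
        push @output, $string;
        # Each step adds a variant only if it actually changed something.
        push @output, $string if $string =~ s/$re1/$set1{$1}/g;
        push @output, $string if $string =~ s/$re2/$set2{$1}/g;
    }
    return @output;
}
```

Called as transform_characters("gro\xDF br\xE4u"), this should return the original string, the variant with ß replaced, and the variant with both sets replaced, in that order. A string with no special characters comes back once, unchanged.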

John

-- 
use Perl;
program
fulfillment

