Re: transforming german characters

From: steve_f (me_at_example.com)
Date: 08/07/04

  • Next message: steve_f: "Re: transforming german characters"
    Date: Fri, 06 Aug 2004 20:13:29 -0400
    
    

    Thanks Gunnar, some great stuff here. I can use simple
    statements to just brute-force things, but I know there is
    a more elegant way.

    On Fri, 06 Aug 2004 21:35:18 +0200, Gunnar Hjalmarsson <noreply@gunnar.cc> wrote:

    >steve_f wrote:
    >> I want to transform special German characters to obtain the
    >> following variations:
    >>
    >> groß bräu
    >> gross bräu
    >> gross braeu
    >>
    >> there are two sets -
    >>
    >> set one:
    >> ß = ss = \xDF
    >>
    >> set two:
    >> Ä = Ae = \xC4
    >> Ö = Oe = \xD6
    >> Ü = Ue = \xDC
    >> ä = ae = \xE4
    >> ö = oe = \xF6
    >> ü = ue = \xFC
    >>
    >> basically, the rules are transform ß independently
    >> and with set two, they are either all on or off together.
    >
    >As John said, there is no reason to look for the characters with
    >separate regexes, and accordingly there is no reason to distinguish
    >between two sets.

    The ß can be on or off independently of the others, so
    you can get:

    groß bräu
    gross bräu
    gross braeu

    I should have stated the problem more directly:

    if set one - set one on  & set two on
               - set one off & set two on
               - set one off & set two off

    if only set two - set two all on
                    - set two all off
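
    The cases above can be sketched roughly like this. It is only a
    minimal sketch: the sub name variants() and the %umlaut hash are
    made up for illustration, and it assumes the strings are Latin-1
    bytes as in the \xDF-style codes earlier in the thread. It keeps
    the original, then expands ß on its own, then expands all the
    umlauts together, and skips any step that changes nothing.

```perl
use strict;
use warnings;

# Set two: the umlauts, which are converted all together or not at all.
# (%umlaut and variants() are hypothetical names, not from the thread.)
my %umlaut = (
    "\xC4" => 'Ae', "\xD6" => 'Oe', "\xDC" => 'Ue',
    "\xE4" => 'ae', "\xF6" => 'oe', "\xFC" => 'ue',
);

# Returns the distinct spellings of one string, in the order:
# original, ß expanded, ß and umlauts expanded.
sub variants {
    my ($string) = @_;
    my @out = ($string);

    # Set one: ß is handled on its own.
    (my $no_eszett = $string) =~ s/\xDF/ss/g;
    push @out, $no_eszett if $no_eszett ne $out[-1];

    # Set two: every umlaut, applied on top of the ß step.
    (my $plain = $no_eszett) =~ s/([\xC4\xD6\xDC\xE4\xF6\xFC])/$umlaut{$1}/g;
    push @out, $plain if $plain ne $out[-1];

    return @out;
}

# "groß bräu" gives three forms: groß bräu, gross bräu, gross braeu.
# A string with only umlauts gives two; a plain ASCII string gives one.
my @forms = variants("gro\xDF br\xE4u");
```

    Comparing against the last entry pushed is what avoids duplicates
    when a string has no ß or no umlauts.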

    >> for my $string (@input) {
    >>     push @output, $string;
    >
    >Here you copy the whole original text to @output ...
    >
    >>     if ($string =~ /\xDF/) {
    >>         $string =~ s/\xDF/ss/g;
    >>         push @output, $string;
    >
    >... and here you *add* the converted string. In the suggestion below,
    >I'm assuming that was a mistake.
    >
    >     sub transform_characters {
    >         my @text = @_;
    >
    >         my %replace = (
    >             "\xDF" => 'ss',
    >             "\xC4" => 'Ae',
    >             "\xD6" => 'Oe',
    >             "\xDC" => 'Ue',
    >             "\xE4" => 'ae',
    >             "\xF6" => 'oe',
    >             "\xFC" => 'ue',
    >         );
    >
    I really like the idea of the hash. Yes, I have heard you are not
    thinking in Perl if you are not using hashes.

    >         for (@text) {
    >             s/(\xDF|\xC4|\xD6|\xDC|\xE4|\xF6|\xFC)/$replace{$1}/g;
    >         }
    this is super! thanks
    >
    >         @text
    >     }
    >
    >     my @output = transform_characters(@input);
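
    One possible variation on Gunnar's sub, as a sketch only: build
    the alternation from the hash keys, so the list of characters is
    written once instead of in both the hash and the regex. The names
    $pattern and $re are my own, and the sample strings are made up;
    the input is again assumed to be Latin-1 bytes.

```perl
use strict;
use warnings;

my %replace = (
    "\xDF" => 'ss',
    "\xC4" => 'Ae', "\xD6" => 'Oe', "\xDC" => 'Ue',
    "\xE4" => 'ae', "\xF6" => 'oe', "\xFC" => 'ue',
);

# Build the alternation from the keys. quotemeta() guards against a
# key that happens to be a regex metacharacter.
my $pattern = join '|', map { quotemeta($_) } sort keys %replace;
my $re      = qr/($pattern)/;

sub transform_characters {
    my @text = @_;    # copy, so the caller's strings are left untouched
    s/$re/$replace{$1}/g for @text;
    return @text;
}

# "Größe" and "Übung" written with the byte escapes used in the thread.
my @output = transform_characters("Gr\xF6\xDFe", "\xDCbung");
# @output is now ("Groesse", "Uebung")
```

    Adding another character then only means adding one line to
    %replace.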

