Re: CGI.pm: encoding problems
- From: "Mumia W." <mumia.w.18.spam+nospam.usenet@xxxxxxxxxxxxx>
- Date: Fri, 09 Jun 2006 22:00:31 GMT
Ben Bullock wrote:
I have a problem with inputting UTF-8 via a text window using CGI.pm. This problem concerns UTF-8, so apologies for posting something with Chinese characters in it.
The following code is a minimal working example of the problem, with a lot of extraneous material removed. It needs to be run under a web server to see the problem. When the text is submitted using the form, the default text of Chinese characters (the numbers from one to four) is munged into gibberish, and the test of the input, which checks whether the input consists of valid Chinese numerals, fails:
Input text:
一二三四
Output of program:
Input ä¸äºä¸å was not a valid number
Thank you very much for any assistance, suggestions or advice about this problem.
Begin script (to end of message)
#!/usr/bin/perl
use warnings;
use strict;
use CGI;
use utf8;
binmode (STDOUT, ":utf8");
my $query = CGI->new();
$query->charset('UTF-8');
print $query->header();
my $kanji;
if ($query->param('kanji')) {
    my $inputnumber = $query->param('kanji');
    if ($inputnumber =~ /^([一二三四五六七八九十]+)$/) {
        $kanji = $1;
    } else {
        print "<p>Input $inputnumber was not a valid number</p>";
        $kanji = "";
    }
} else {
    $kanji = "一二三四";
}
print $query->start_form(-method => 'POST', -action => $query->url());
print $query->textarea(-name => 'kanji',
                       -default => $kanji);
print $query->submit();
print $query->endform();
print "<table><tr>\n<th>Value</th><td>",
    $kanji, "</td></tr>\n", "</table>\n</form>\n<p>\n";
print $query->end_html();
I made a few changes to your program. I don't know exactly what the problem is, but I hope that this sheds some light on it:
#!/usr/bin/perl
use warnings;
use strict;
use CGI;
use utf8;
use Encode (); # changed
binmode (STDOUT, ":utf8");
my $query = CGI->new();
$query->charset('UTF-8');
print $query->header('-cache-control' => 'no-cache'); # changed
my $kanji;
if ($query->param('kanji')) {
    my $inputnumber = $query->param('kanji');
    print <<EOF;
<p> Interesting decodings of
"$inputnumber" <br>
UTF-8: @{[ Encode::decode('utf8', $inputnumber) ]} <br>
</p>
<hr>
EOF
    # Add this to decode the number:
    $inputnumber = Encode::decode('utf8', $inputnumber);
    if ($inputnumber =~ /^([一二三四五六七八九十]+)$/) {
        $kanji = $1;
    } else {
        print "<p>Input $inputnumber was not a valid number</p>";
        $kanji = "";
    }
} else {
    $kanji = "一二三四";
}
print <<EOF;
<p> The value of \$kanji is: $kanji
</p>
EOF
print $query->start_form(
    -method => 'POST',
    -action => $query->url()
);
print $query->textarea(-name => 'kanji',
                       -default => $kanji);
print <<EOF;
<textarea name=alternate>
DATA = $kanji
</textarea>
EOF
print $query->submit();
print $query->endform();
print "<table><tr>\n<th>Value</th><td>",
$kanji, "</td></tr>\n", "</table>\n</form>\n<p>\n";
print $query->end_html();
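For what it's worth, the effect of the added decode step can be seen without a web server. The sketch below is only an illustration under an assumption: it uses a hand-made octet string as a stand-in for what $query->param('kanji') appears to return (raw UTF-8 octets rather than decoded characters), and the helper name is_kanji_number is made up for this example. Under "use utf8" the character class in the regex holds real characters, so it cannot match undecoded octets:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use utf8;        # the literals below are character strings
use Encode ();

# Assumption: a stand-in for the value returned by $query->param('kanji'),
# i.e. the submitted text as raw UTF-8 octets (12 octets for 4 characters).
my $text = '一二三四';
my $raw  = Encode::encode('UTF-8', $text);

# Same test as in the script, wrapped in a helper (illustrative name only).
sub is_kanji_number {
    my ($input) = @_;
    return $input =~ /^[一二三四五六七八九十]+$/ ? 1 : 0;
}

# Decoding turns the octets back into the 4 characters the regex expects.
my $decoded = Encode::decode('UTF-8', $raw);

binmode(STDOUT, ':utf8');
printf "raw:     length %d, matches: %s\n",
    length($raw), is_kanji_number($raw) ? 'yes' : 'no';
printf "decoded: length %d, matches: %s\n",
    length($decoded), is_kanji_number($decoded) ? 'yes' : 'no';
```

The raw string fails the test and the decoded one passes. It would also explain the gibberish in the error message: printing undecoded octets through the ":utf8" layer on STDOUT encodes each octet a second time.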