Encoding
- From: zeman@xxxxxxxxxxxxxxxx (Daniel Zeman)
- Date: Thu, 31 Aug 2006 12:08:21 -0400
Hi,
I wonder if anyone can help me.
I am using
- Debian Linux
- Perl 5.8.8
- DBI (I do not know how to figure out its version)
- MySQL 5.0.22-Debian_3-log
I want to store and handle data in UTF-8 but so far I have not been able to force Perl/DBI to do so.
I have created a table using
my $sql = "CREATE TABLE $tbl (".join(", ", @columns).") CHARACTER SET utf8 COLLATE utf8_czech_ci;";
$dbh->do($sql);
I populated the table with data using
my $list_of_columns = join(", ", @names);
my $list_of_values = join(", ", map{"_utf8'$record->{$_}'"}(@names));
my $sql = "INSERT INTO $tbl ($list_of_columns) VALUES ($list_of_values);";
$dbh->do($sql);
I have looked into the database using phpMyAdmin 2.8.2-Debian-0.1 and it really looked like the data were stored in correct UTF-8.
However, when I retrieve the data from Perl/DBI, something in the chain (MySQL? the driver? DBI?) decides that another encoding (probably Latin1) would be better for me. It "converts" the strings from UTF-8 to that encoding, which means that by the time the data arrive in my Perl code, all the non-Latin1 characters have already been irrecoverably converted to question marks. I would be happy to decode the data myself, but there is nothing I can do with the question marks.
I am using the following code to retrieve the data:
my $sql = "SELECT kod, hry.nazev FROM hry INNER JOIN prodej ON hry.kod = prodej.kod_hry GROUP BY kod, hry.nazev";
my $sqlobj = $dbh->prepare($sql);
$sqlobj->execute();
while(my ($kod, $nazev) = $sqlobj->fetchrow_array())
{
...
}
So far, the only workaround I have is not to tell the DBI that the data is UTF-8 when I insert it (i.e., drop the "_utf8" part before the single quote), and to use Encode; decode("utf8", ...) on anything I fetch from the database. This way, the database never knows the data was UTF-8 text, treats the bytes as Latin1 characters and returns them undisturbed. However, I cannot read the data using phpMyAdmin (unless I en/decode UTF-8 in my head), the string lengths do not reflect reality, etc.
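A minimal sketch of the decoding half of that workaround, with no database involved. The byte string is a hypothetical stand-in for a value returned by fetchrow_array(), and the variable names are made up for illustration. It also shows why the lengths are off as long as the text is handled as raw bytes:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical stand-in for a fetched column value: the Czech word
# "Žluťoučký" as raw UTF-8 bytes, which is how the workaround stores it.
my $bytes = "\xC5\xBDlu\xC5\xA5ou\xC4\x8Dk\xC3\xBD";

# Turn the raw bytes into a Perl character string.
my $text = decode("utf8", $bytes);

# The byte count and the character count differ for non-ASCII text.
printf "bytes: %d, characters: %d\n", length($bytes), length($text);
# prints "bytes: 13, characters: 9"
```

The database sees the 13-byte form, which is presumably why its own string functions report the wrong lengths for this data.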
Is there a better way to do it? I think there must be some small stupid locale-like setting telling the machine that I am a UTF guy. But the settings I was able to come up with did not help and I actually have no idea which part of the MySQL-driver-DBI-Perl chain is responsible.
Any hints are welcome.
Thanks
Dan