Re: Oracle ncahr versus nvarchar2 and unicode data



Please ignore the bind_param/execute part of my email - I was being an
idiot. Would still like opinions on the rest of it though.

Martin

Quoting "Martin J. Evans" <martin.evans@xxxxxxxxxxxx>:

> Hi,
>
> Wondered if anyone can tell me if I've misunderstood something here. I am
> able
> to insert unicode data into nvarchar2 and get back what was inserted but
> same
> code with nchar column does not work.
>
> OCI version is 10.1.0.2
> Database is Oracle 10.2 on Linux
> Database CHAR set is WE8ISO8859P1 (Non-Unicode), NCHAR set is AL16UTF16
> (Unicode)
> This is perl, v5.8.7 built for i686-linux
> DBD::Oracle 1.16.
>
> #!/disk2/martin/perl/bin/perl
> use charnames ':full';
> use DBI;
> use Encode;
> use DBD::Oracle qw( :ora_types ORA_OCI SQLCS_NCHAR );
> use Devel::Peek;
>
> if ( $] < 5.008) {
> print "Need at least Perl 5.8\n";
> exit 1;
> }
> binmode(STDOUT, ":utf8");
>
> print "<html><head>";
> print q|<meta http-equiv="Content-Type" content="text/html;charset=utf-8">|;
> print "<title>test</title></head><body>\n";
>
> $ENV{'ORACLE_HOME'} = '/home/oracle/product/10.1.0/db_2';
> $ENV{'NLS_NCHAR'} = 'AL32UTF8';
> my $dbh = DBI->connect('dbi:Oracle:tentwo.testing', 'xxx', 'yyy');
> $dbh->{ChopBlanks} = 1;
> print "OCI version is ", ORA_OCI, "\n";
> print "ora_can_unicode = ", $dbh->ora_can_unicode, "\n";
> if (!($dbh->ora_can_unicode & 1)) {
> print "Database does not support UNICODE\n";
> exit 1;
> }
> my $paramsH = $dbh->ora_nls_parameters();
> printf "Database $ora_server_version CHAR set is %s (%s), NCHAR set is %s
> (%s)\n",
> $paramsH->{NLS_CHARACTERSET},
> $dbh->ora_can_unicode & 2 ? "Unicode" : "Non-Unicode",
> $paramsH->{NLS_NCHAR_CHARACTERSET},
> $dbh->ora_can_unicode & 1 ? "Unicode" : "Non-Unicode";
> eval {
> $dbh->do('drop table martin');
> };
> # NOTE nchar does not work!!!! (why?)
> $dbh->do('create table martin (a nvarchar2(256))');
> my $sql = "insert into martin values (?)";
> my $x = "\N{EURO SIGN}\x{20ac}\N{LATIN CAPITAL LETTER A WITH ACUTE}";
> $x .= chr(0x20ac);
> print "is_utf8 returns ", utf8::is_utf8($x), "\n";
> print "valid returns ", utf8::valid($x), "\n";
> my $sth = $dbh->prepare($sql);
> $sth->bind_param(1, $x, {ora_csform => SQLCS_NCHAR});
> $sth->execute();
> $sql = "select a from martin";
> $sth = $dbh->prepare($sql);
> $sth->execute;
> while(my @row = $sth->fetchrow_array) {
> print "valid on returned string returns ", utf8::valid($row[0]), "\n";
> #$_ = decode("utf8", $_) foreach @row;
> print $row[0], "\n";
> print unpack("H*", $row[0]), "\n";
> print Dump(@row[0]);
> }
> $sth->finish;
> $dbh->disconnect();
>
> Output is:
>
> <html><head><meta http-equiv="Content-Type"
> content="text/html;charset=utf-8"><title>test</title></head><body>
> OCI version is 10.1.0.2
> ora_can_unicode = 1
> Database CHAR set is WE8ISO8859P1 (Non-Unicode), NCHAR set is AL16UTF16
> (Unicode)
> is_utf8 returns 1
> valid returns 1
> valid on returned string returns 1
> <utf8 data here - snipped from email>
> e282ace282acc381e282ac
> SV = PV(0x82a3efc) at 0x82e5b48
> REFCNT = 1
> FLAGS = (POK,pPOK,UTF8)
> PV = 0x8361bb0 "\342\202\254\342\202\254\303\201\342\202\254"\0 [UTF8
> "\x{20ac}\x{20ac}\x{c1}\x{20ac}"]
> CUR = 11
> LEN = 12
> </body></html>
>
> UTF-8 data output with Devel::Peek Dump looks fine.
>
> A simple change to create the table with nchar instead of nvarchar2
> outputs:
>
> <html><head><meta http-equiv="Content-Type"
> content="text/html;charset=utf-8"><title>test</title></head><body>
> OCI version is 10.1.0.2
> ora_can_unicode = 1
> Database CHAR set is WE8ISO8859P1 (Non-Unicode), NCHAR set is AL16UTF16
> (Unicode)
> is_utf8 returns 1
> valid returns 1
> valid on returned string returns 1
> <utf8 data here snipped from email>
> 3f3f413f
> SV = PV(0x82a3efc) at 0x82e5c3c
> REFCNT = 1
> FLAGS = (POK,pPOK,UTF8)
> PV = 0x8361cc0 "??A?"\0 [UTF8 "??A?"]
> CUR = 4
> LEN = 5
> </body></html>
>
> Devel::Peek Dump output looks nothing like what I expected.
>
> Also, do I have to use {ora_csform => SQLCS_NCHAR}? Does not seem to work in
> either case if I don't.
>
> And one final thing. If I change $x containing the UTF-8 data to insert and
> run
> $sth->execute again it inserts the original data again and not the changed
> $x.
> I've not confirmed this yet with other drivers, but I am /fairly/ sure with
> DBD::ODBC you don't have to call bind_param again if the bound data is
> changed.
>
> Thanks.
>
> Martin
> --
> Martin J. Evans
> Easysoft Ltd, UK
> Development
>
>
>



.



Relevant Pages

  • RE: Oracle ncahr versus nvarchar2 and unicode data
    ... nvarchar columns containing unicode data: ... 8bit Nchar test was broken ... So, my question now becomes, should I be able to store unicode data ... Martin J. Evans ...
    (perl.dbi.users)
  • RE: Oracle ncahr versus nvarchar2 and unicode data
    ... The insertion of UTF-8 data into ... from nchar fields that is not returning the data inserted. ... Martin J. Evans ... > to insert unicode data into nvarchar2 and get back what was inserted but same ...
    (perl.dbi.users)