Re: :Oracle unicode problem

From: Christian Merz (christian.merz_at_muenchen.de)
Date: 11/04/04


To: "prashant shelar" <prashantshelar@hotmail.com>, <cassidy@systransoft.com>, <dbi-users@perl.org>
Date: Thu, 4 Nov 2004 07:23:29 +0100

Hi,
a few days ago I received a note about character set problems (with German
umlauts). I have not cross-checked the information.

In a Windows environment you have to cope with at least 4 character sets.
1. internally, WinNT uses UCS-2 (2 bytes); Win2000 uses UTF-16 (2 or 4 bytes)
2. graphical apps may use something like ISO Latin 1
3. a German DOS prompt uses WE8PC850 (code page 850)
4. your HTML browser may use UTF8
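
To make the list above concrete, here is a minimal sketch showing how one umlaut ends up as different bytes in each of these character sets. It uses Perl's core Encode module; the encoding names are Encode's names, not Oracle's (iso-8859-1 corresponds to WE8ISO8859P1 and cp850 to WE8PC850), and UTF-16LE is an assumption standing in for the Windows-internal form.

```perl
use strict;
use warnings;
use Encode qw(encode);

# One character, U+00E4 (a with diaeresis), as a Perl character string.
my $char = "\x{e4}";

# Encode the same character in each character set and record the bytes as hex.
my %bytes;
for my $enc (qw(iso-8859-1 cp850 UTF-8 UTF-16LE)) {
    $bytes{$enc} = unpack("H*", encode($enc, $char));
    printf "%-10s %s\n", $enc, $bytes{$enc};
}
```

The four results are e4, 84, c3a4 and e400. A byte written under one assumption and read under another is how an umlaut turns into a different character.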

This also affects database accesses.
You have to consider your NLS_LANG registry entry, for example
AMERICAN_AMERICA.WE8ISO8859P1 or
GERMAN_GERMANY.WE8ISO8859P1
(see: v$nls_valid_values, v$nls_parameters, nls_XXX_parameters, where XXX is one
of database, instance, session).
If you use graphical apps (SQLPLUSW.exe, notepad, ...) you may get the correct
output, according to your NLS_LANG
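
An NLS_LANG value has the form LANGUAGE_TERRITORY.CHARSET, and each part may be left out (as in ".UTF8"). The following is a minimal sketch for checking which character set a script will actually run with. parse_nls_lang is a hypothetical helper, not part of DBI or DBD::Oracle, and it assumes that language and territory names contain no underscore or dot.

```perl
use strict;
use warnings;

# Split an NLS_LANG value into language, territory and charset.
# Missing or empty parts are returned as undef.
sub parse_nls_lang {
    my ($value) = @_;
    my ($locale, $charset) = split /\./, $value, 2;
    $locale = '' unless defined $locale;
    my ($language, $territory) = split /_/, $locale, 2;
    my %parts = (
        language  => $language,
        territory => $territory,
        charset   => $charset,
    );
    for my $key (keys %parts) {
        $parts{$key} = undef if defined $parts{$key} && $parts{$key} eq '';
    }
    return \%parts;
}

# Fall back to one of the example values from this mail if NLS_LANG is unset.
my $nls = parse_nls_lang($ENV{NLS_LANG} || 'AMERICAN_AMERICA.WE8ISO8859P1');
printf "charset: %s\n", defined $nls->{charset} ? $nls->{charset} : '(not set)';
```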

On your DOS prompt you may have to override the registry default:
C:\> set NLS_LANG=american_america.we8pc850
to process the German umlauts in an SQL script correctly.

On Unix you may also have to set your NLS_LANG environment variable (and maybe
ORA_NLS...)
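
The variable can also be set from inside a Perl script instead of the shell. This is only a sketch: it assumes that the Oracle client reads NLS_LANG when the first connection is made, so the assignment has to happen before DBI->connect. The value shown is one of the examples above, not a recommendation.

```perl
use strict;
use warnings;

# Set NLS_LANG early, before any Oracle connection is opened.
# A value already exported by the calling shell is left untouched.
BEGIN {
    $ENV{NLS_LANG} = 'AMERICAN_AMERICA.WE8ISO8859P1'
        unless defined $ENV{NLS_LANG} && length $ENV{NLS_LANG};
}

print "NLS_LANG in effect: $ENV{NLS_LANG}\n";

# The connection itself would follow here, for example:
# my $dbh = DBI->connect("dbi:Oracle:$sid", $uid, $pwd, { RaiseError => 1 });
```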

cu,
---------------------------------------------------------
Landeshauptstadt München
Direktorium - AFID 3.3 - Oracle DBA
C.A. Merz

----- Original Message -----
From: "prashant shelar" <prashantshelar@hotmail.com>
To: <cassidy@systransoft.com>; <dbi-users@perl.org>
Sent: Wednesday, November 03, 2004 5:33 AM
Subject: RE: :Oracle unicode problem

> Hi,
>
> I already set NLS_LANG=.UTF8 in .bash_profile.
>
> The data I am reading comes from $file, which is in UTF-8 format.
> If I do not use the encode() function, the data inserted into the database is
> an "inverted question mark" for any Japanese character.
>
>
> If I run the code without the encode() function on Windows 2000 with ActivePerl
> 5.8.4.810 installed, I do not have any problem. But the same code does
> not work on the Sun Solaris box: it inserts an "inverted question mark" into
> the database.
>
> Does DBD::Oracle use any OS-level functions? My Solaris version is 2.8.
>
>
> Thanks,
> Prashant Shelar
>
>
>
> From: "Susan Cassidy" <cassidy@systransoft.com>
> To: "'prashant shelar'" <prashantshelar@hotmail.com>
> Subject: RE: :Oracle unicode problem
> Date: Tue, 2 Nov 2004 08:59:58 -0800
>
> Are you setting the environment variable NLS_LANG to something with UTF8 in
> it? E.g.: .UTF8
>
> Also, if your data is already UTF8, you may be encoding it again and causing
> a problem.
>
> Susan Cassidy
>
> > -----Original Message-----
> > From: prashant shelar [mailto:prashantshelar@hotmail.com]
> > Sent: Tuesday, November 02, 2004 7:38 AM
> > To: dbi-users@perl.org
> > Subject: DBD::Oracle unicode problem
> >
> >
> > Hi ,
> >
> > I have an Oracle 9i Release 2 database with the following details. All
> > database string columns are VARCHAR2.
> >
> > database_charset =AL32UTF8
> > national_charset =AL16UTF16
> > The database server and perl scripts are running on Sun Solaris 2.8
> > box.
> > Perl Version = 5.8.5
> > DBI version = 1.45
> > DBD::Oracle version = 1.16 (I downloaded from
> >
> > http://homepage.eircom.net/~timbunce/DBD-Oracle-1.16-rc7-20040826.tar.gz)
> >
> > I am inserting Japanese characters using a Perl script running on the same
> > DB box. When I retrieve the same data using the SQLPLUS spool command, many
> > characters do not match. Only some characters match.
> >
> > For example, if I insert the character U+3044 into the database, then when I
> > spool it I get the character U+307F.
> >
> > Any help is greatly appreciated.
> >
> > The code snippet is as below.
> >
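> > use DBI;
> > use Encode qw(encode);
> >
> > my $dbh = DBI->connect("dbi:Oracle:$sid", $uid, $pwd,
> >     { RaiseError => 1, AutoCommit => 1 });
> >
> > # Bind the values with placeholders instead of building them into the SQL text.
> > my $sth = $dbh->prepare("INSERT INTO table (col1, col2) VALUES (?, ?)");
> >
> > open (INFILE, $file) or die "Cannot open $file: $!";
> > while (my $line = <INFILE>) {
> >     chomp $line;
> >     my @data = split(/\|/, $line);
> >     # encode() is applied to each field, as in the original script.
> >     $sth->execute(encode("utf8", $data[1]), encode("utf8", $data[2]));
> > }
> > close (INFILE);
> >
> > $dbh->disconnect;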
> >
>
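
Susan Cassidy's point about encoding data that is already UTF-8 can be reproduced without a database. The sketch below is an illustration under assumptions, not a diagnosis of the original script: it assumes the file was read without an encoding layer, so that each field holds raw UTF-8 octets, and it uses U+3044 from the original report as sample data.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The UTF-8 octets of U+3044, as read from a file without an encoding layer.
my $octets = "\xE3\x81\x84";

# Encoding octets that are already UTF-8 treats every byte as a Latin-1
# character and encodes it a second time.
my $double = encode("UTF-8", $octets);

# Decoding first gives a one-character string; encoding that string
# returns the original three octets.
my $chars  = decode("UTF-8", $octets);
my $single = encode("UTF-8", $chars);

printf "original: %s\n", unpack("H*", $octets);
printf "double:   %s\n", unpack("H*", $double);
printf "single:   %s\n", unpack("H*", $single);
```

The double-encoded value is six bytes long (c3a3c281c284) instead of three (e38184), which is the kind of damage a second encode() call can cause.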


