Re: Encoding conversion problem



Hi everyone,
sorry for my previous double-post (a mistake).

Is it possible to ask the database driver to do the conversions for
you? Perhaps internally it uses Unicode or some other encoding that can
deal with Euros.
I've checked the properties of the JDBC driver I use
(http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/ad/rjvdsprp.htm)
but there's nothing concerning encoding conversions.

We have the clue that C++ programs seem to store euros and get them back out.
Yes, we also have C and COBOL programs that can write and read back
non-IBM850 chars without problems.
As Sabine pointed out in her post, the reason may be that C programs
work with the raw sequences of bytes, without performing any encoding
conversion.

I do not really understand why a Euro sign would work with 8859-1 since
that does not contain that character as far as I am aware.

SORRY SORRY SORRY SORRY SORRY
I tried to insert (through JDBC) the EURO character into a DB2
database configured with
....
Database territory = C
Database code page = 819
Database code set = ISO8859-1
....
and I can neither write nor read the EURO character correctly in
Java :-(
A COBOL program, on the other hand, works correctly.

Then I tried the same thing on a SQL Server 2000 instance with
collation compatibility_51_409_30003 (corresponding to code page 1252,
i.e. Windows Latin 1) and there I can store and read the EURO character
via Java and JDBC.
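
The difference between the two results can be checked without any database: the sketch below asks a few JVM charsets whether they can encode U+20AC at all. The class and method names are made up for illustration, and the charset names are only the JVM's counterparts of the database code pages mentioned here (an assumption, since the driver may map them differently).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any driver: reports which charsets contain the euro sign.
class EuroSupport {
    static final char EURO = '\u20AC';

    // True if the charset has a byte representation for the euro sign.
    static boolean canEncodeEuro(Charset cs) {
        return cs.newEncoder().canEncode(EURO);
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1: " + canEncodeEuro(StandardCharsets.ISO_8859_1));
        // These charsets are optional in a JVM, so check availability first.
        for (String name : new String[] {"windows-1252", "IBM850", "ISO-8859-15"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + canEncodeEuro(Charset.forName(name)));
            } else {
                System.out.println(name + ": not available in this JVM");
            }
        }
    }
}
```

ISO-8859-1 (code page 819) has no euro sign, while code page 1252 assigns it to byte 0x80, which would be consistent with the DB2 and SQL Server results above.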

That doesn't work in Java with Oracle 10g configured with
....
NLS_LANGUAGE = AMERICAN
NLS_TERRITORY = AMERICA
NLS_CHARACTERSET = US7ASCII
NLS_LENGTH_SEMANTICS = BYTE
....
Storing and reading through COBOL is OK, and in Java I can even write
and read accented vowels, even though those characters are outside
US7ASCII...

You could do an experiment. Try feeding your database all possible
Unicode chars in a set of 1-char records, and see which ones come back
unmangled. This is a kludge, but you could preconvert your Euro to
one of those invariant unused chars.
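
A rough approximation of that experiment can be run inside the JVM alone, with no database involved. The sketch below encodes every BMP character into a charset and decodes it again, counting the ones that come back unchanged. It only simulates the conversion a driver might perform; the class name is hypothetical, and a real test would still have to go through JDBC and the actual database.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical simulation of the "feed every char and see what survives" experiment.
class RoundTripSurvey {
    // True if the character is unchanged after encoding to cs and decoding back.
    static boolean survives(char c, Charset cs) {
        String original = String.valueOf(c);
        String back = new String(original.getBytes(cs), cs);
        return original.equals(back);
    }

    // Counts BMP characters (lone surrogates skipped) that survive the round trip.
    static int countSurvivors(Charset cs) {
        int survivors = 0;
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            char c = (char) cp;
            if (Character.isSurrogate(c)) {
                continue;
            }
            if (survives(c, cs)) {
                survivors++;
            }
        }
        return survivors;
    }

    public static void main(String[] args) {
        System.out.println("ISO-8859-1 survivors: " + countSurvivors(StandardCharsets.ISO_8859_1));
        System.out.println("US-ASCII survivors:   " + countSurvivors(StandardCharsets.US_ASCII));
    }
}
```
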
The EURO character is just an example and only part of the problem; I
can't use this kind of kludge.
The specific problem is much more complex: a password is encrypted and
stored in the DB by a C program, but the encrypted chars fall outside
the IBM850 range and in Java I'm unable to read the string back and
decrypt it... this works if the database is ISO-8859-1 (that's why I
thought I was able to write another 'weird' char, the euro char, to an
ISO-8859-1 DB, sorry...). I also have the more general problem of data
entry: I don't know which characters users will insert, so I can't
substitute chars.
I've found a workaround for my encryption problem, but I'm still trying
to understand the cause of the problem.
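
The workaround actually used is not described here. One common approach, shown only as a sketch, is to store the encrypted bytes as Base64 text so that every stored character is plain ASCII and survives any code page conversion. This assumes both the writer and the reader can be changed to use it; the class and method names are hypothetical. The same sketch also shows why raw bytes survive a pass through ISO-8859-1 (every byte value maps to a character) but not through a 7-bit charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper: moves ciphertext through a text column without relying on the code page.
class CipherTextTransport {
    // Encodes arbitrary bytes as ASCII-only Base64 text.
    static String toStorable(byte[] cipherBytes) {
        return Base64.getEncoder().encodeToString(cipherBytes);
    }

    // Restores the original bytes from the stored Base64 text.
    static byte[] fromStorable(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    // Simulates treating raw bytes as text in charset cs and converting back to bytes.
    static byte[] viaCharset(byte[] raw, Charset cs) {
        return new String(raw, cs).getBytes(cs);
    }

    public static void main(String[] args) {
        byte[] raw = {(byte) 0x80, (byte) 0xFF, 0x00, (byte) 0x9C};
        System.out.println("Base64 round trip intact:     "
                + Arrays.equals(raw, fromStorable(toStorable(raw))));
        System.out.println("ISO-8859-1 round trip intact: "
                + Arrays.equals(raw, viaCharset(raw, StandardCharsets.ISO_8859_1)));
        System.out.println("US-ASCII round trip intact:   "
                + Arrays.equals(raw, viaCharset(raw, StandardCharsets.US_ASCII)));
    }
}
```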

Now it's clear to me that with a CHAR field Java performs an encoding
conversion using the encodings of the JVM and of the DBMS: if some
characters fall outside the destination encoding then they are lost
(i.e. converted into something completely different).
The only 'mysterious' thing for me now is the behavior on Oracle (JDBC
can read and write accented vowels even though they are outside
US7ASCII)... any ideas? Maybe the Oracle driver is smarter than the DB2
Universal Driver...
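
The lossy conversion described above can be reproduced with the JVM's own charset API. This is a minimal sketch with hypothetical names; it does not show what either driver does internally, only how Java itself replaces an unmappable character with '?' by default and how a strict encoder can report the loss instead.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of silent substitution versus strict encoding.
class LossyConversion {
    // Default behavior: unmappable characters are silently replaced (here with '?').
    static String roundTrip(String text, Charset cs) {
        return new String(text.getBytes(cs), cs);
    }

    // Strict behavior: returns false instead of silently substituting characters.
    static boolean isLossless(String text, Charset cs) {
        try {
            cs.newEncoder()
              .onMalformedInput(CodingErrorAction.REPORT)
              .onUnmappableCharacter(CodingErrorAction.REPORT)
              .encode(CharBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String price = "price: \u20AC";
        System.out.println(roundTrip(price, StandardCharsets.ISO_8859_1)); // prints "price: ?"
        System.out.println(isLossless(price, StandardCharsets.ISO_8859_1)); // prints "false"
        System.out.println(isLossless("\u00E8", StandardCharsets.ISO_8859_1)); // prints "true"
    }
}
```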

Thanks everyone,
Andrea