Re: Encoding conversion problem
- From: Sabine Dinis Blochberger <no.spam@xxxxxxxxxxxx>
- Date: Tue, 12 Feb 2008 13:02:17 GMT
Andrea wrote:
>>> If I save characters outside the range supported by IBM-850 (e.g. the
>>> euro currency character) then I read garbage...
>> Yes, the Euro symbol is not part of that encoding, so your database
>> can't contain it.
> I've found a strange thing: C and COBOL applications can write and read
> (using embedded SQL) characters outside the accepted range without
> problems... So the database can contain those characters without
> losing any information, but I can't understand how...

Yes, in theory you can store any value (0 - 255 in the case of one-byte
strings) in a string, but how that is interpreted (i.e. the encoding) is
where it gets hairy. Also, multibyte characters would break the
interpretation.

>> If you need it, you would have to change the database's
>> encoding (ISO-8859-15 includes the Euro symbol).
>> Otherwise, you have to take care not to write unsupported
>> characters into string/character fields.
>> One solution could be to parse all strings and replace the symbol with
>> the shorthand "EUR", but it might not be acceptable to your client.
> Actually the EURO character is just an example, I have more complex
> strings to handle (and I can't change the encoding of the database).
> If my problem has no solution at all then I'd like to understand why
> other languages don't have this problem...

Ah, there are always hacks around limitations. But they aren't usually
pretty. The problem is to funnel a string with these "unsupported"
characters through the JDBC driver (both ways).
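
To make the "funnel" problem concrete, here is a minimal sketch using Python's built-in codecs as a stand-in for a layer that converts Unicode text to the database encoding. It is only an illustration: the actual conversion rules of the JDBC driver are not shown in this thread, so the "replace with ?" behaviour is an assumption.

```python
# Assumption: the database encoding is IBM-850 (Python codec name "cp850").
text = "Price: 10 €"

# Strict conversion: the euro sign has no code point in cp850, so this fails.
try:
    text.encode("cp850")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# Lenient conversion: the unsupported character is replaced with "?",
# so the original character cannot be recovered when reading back.
stored = text.encode("cp850", errors="replace")
read_back = stored.decode("cp850")

# ISO-8859-15 does contain the euro sign (byte 0xA4), so the round trip is lossless.
stored_15 = text.encode("iso8859-15")
read_back_15 = stored_15.decode("iso8859-15")
```

Once a converting layer sits between the program and the database, anything outside the target encoding is either rejected or silently degraded.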
You might get around it by using typeless fields (you can put any byte
sequence there), like BLOBs maybe...
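
A rough sketch of the typeless-field idea, using Python's `sqlite3` module with an in-memory database purely as a stand-in (the database in this thread is a different product, and the table name `notes` is hypothetical). The application does the encoding itself and the database only ever sees bytes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (id INTEGER PRIMARY KEY, body BLOB)")

# The application picks the encoding (UTF-8 here, as an assumption)
# and hands the database an opaque byte sequence.
raw = "Price: 10 €".encode("utf-8")
conn.execute("INSERT INTO notes (body) VALUES (?)", (raw,))

# Reading back returns the same bytes; decoding is again the application's job.
(stored,) = conn.execute("SELECT body FROM notes WHERE id = 1").fetchone()
text = stored.decode("utf-8")
conn.close()
```

The price is that the database can no longer sort, compare or search that column as text.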
Or you write a parser that substitutes the impossible characters with
acceptable replacements. Of course, this is most likely not feasible.
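
For what it is worth, such a substitution pass could look roughly like the sketch below. The function name `to_db_safe`, the replacement table and the `?` fallback are all made up for illustration, and the target encoding is assumed to be IBM-850 (`cp850`).

```python
# Hypothetical replacement table; extend it for the characters you actually meet.
REPLACEMENTS = {"€": "EUR", "œ": "oe", "Œ": "OE"}


def to_db_safe(text, encoding="cp850", fallback="?"):
    """Substitute every character that `encoding` cannot represent."""
    out = []
    for ch in text:
        try:
            ch.encode(encoding)  # raises if the character is unsupported
            out.append(ch)
        except UnicodeEncodeError:
            out.append(REPLACEMENTS.get(ch, fallback))
    return "".join(out)
```

For example, `to_db_safe("10 €")` gives `"10 EUR"`, which then encodes to IBM-850 without errors. The substitution is one-way, so the original text cannot be reconstructed afterwards.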
But the customer has to be aware that a database with encoding X can
only hold strings encoded in X. If they now need UTF-8, for example, they
will eventually have to change their database. And it would be better to
migrate to a suitable encoding than to hack around it and, in a few
years, have to do it all over again (and then some) when they finally do
want to change the database encoding.
As for other languages not having the problem: in C, you can treat a string
just like an array of bytes and use those for whatever you like; the
compiler won't complain. Even interpreting them as memory addresses is
possible, adding and subtracting etc...
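
The same effect can be sketched with Python's `bytes` type, which behaves like a C byte array in this respect. The assumption here is that the C or COBOL client wrote text encoded as ISO-8859-15, where the euro sign is the single byte 0xA4.

```python
# Bytes as written by a client that never converts anything.
raw = b"10 \xa4"

# Stored and fetched as plain bytes, the value is unchanged.
stored = bytes(raw)

# The meaning only appears once someone picks an encoding to interpret them.
as_iso = stored.decode("iso8859-15")  # "10 €"
as_850 = stored.decode("cp850")       # "10 ñ"
```

Nothing is lost as long as every reader and writer agrees on the interpretation. The trouble starts when one party, such as a converting driver, assumes the declared database encoding instead.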
> Thanks,
> Andrea
--
Sabine Dinis Blochberger
Op3racional
www.op3racional.eu