Re: fconfigure -translation binary conversion



On Feb 28, 9:31 pm, Andreas Leitgeb <a...@xxxxxxxxxxxxxxxxxxxxxxxx>
wrote:
yahalom <yahal...@xxxxxxxxx> wrote:
I understand (or think I do :-)) but I am looking for a way out of
this mess. Again, my problem is that the tclhttpd code contains this
fconfigure, and I do not want to do any utf-8 conversions in my code
(unless that is the only way to work with utf-8).

Now, *I* do not understand:
  does it work *with* encoding convertto utf-8?
  does it work *without* encoding convertto utf-8?


If it were working I would not be asking :-). The xml is proper for explorer
if I do the "encoding convertto". But if the xml I return contains
results from the database, they are not shown correctly (because I
did the conversion). Also, what I see from "puts stderr" (which helps
debugging) is not proper if I do the encoding. So in general, either
way some things work and some fail. I want it all to work smoothly: xml
in a proper utf-8 format, results shown, and debugging info shown...


I'd assume that one of them would work, so where is the
problem with taking whatever works?

PS: for further diagnosis, do the following:
  puts [string length $utf8Str]
   (right after you got the value from DB)
If the length corresponds to the number of (hebrew)
characters, then you actually have a tcl-internal formatted
string, and you need one step of conversion: either through
encoding, or by *not* setting the channel to binary.

If the reported length is somewhat larger, then you probably
already have a utf-8 encoded string, and you can throw it
directly to a binary channel.
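
To make the two cases above concrete, here is a minimal sketch. The
four-letter Hebrew sample and the variable names chars and bytes are
made up for illustration; they stand in for whatever actually comes
back from the database.

```tcl
# Hypothetical sample: a four-letter Hebrew word, written with \u escapes
# so the script itself stays plain ASCII.
set chars "\u05e9\u05dc\u05d5\u05dd"

# Case 1: a Tcl-internal string. string length counts characters.
puts [string length $chars]    ;# prints 4

# Case 2: the same text already encoded as utf-8. Each Hebrew letter
# takes two bytes in utf-8, so the reported length doubles.
set bytes [encoding convertto utf-8 $chars]
puts [string length $bytes]    ;# prints 8

# Decoding the bytes gives back the original characters.
set roundtrip [encoding convertfrom utf-8 $bytes]
puts [string equal $roundtrip $chars]    ;# prints 1
```

If the value from the DB behaves like chars, it needs exactly one
conversion on its way to a binary channel; if it behaves like bytes, it
can be written to the binary channel as it is.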

Thanks for this tip. I will try it. It sounds strange that string
length would report a bigger number than the actual number of letters,
but if this is the way it works....
Sorry for pressing so hard on this stuff, but all this encoding really
drives me crazy. It looks like nothing sticks, as though you have pieces
that need different treatment, and tcl with all its simplicity makes it
complex. In the beginning I thought utf-8 should be natural to tcl, as
utf-8 is how tcl represents strings internally (at least that is what I
was told). But now I feel very shaky with all these conversions, as
though you never know what your strings are. It is like "what you see
is not what you get". Maybe I am wrong in thinking that this should be
simple: tell tcl that the os works in utf-8, write your code in utf-8,
keep the data in the database in utf-8, and all will be utf-8.