Re: Is there a dedicated unicode separator character?




"Karla" <LKarla@xxxxxxx> wrote in message news:Xns97D46A8FB1BFCa@xxxxxxxxxxxxxxxx
U+2028 is a line separator
U+2029 is a paragraph separator

Is there a unicode character that means nothing but "field separator"?

I'd like to create a ? separated file that will have the least possible
chance of containing data that accidentally has the field separator in it.

Depends on your data. For any character you select, I can give you infinitely many strings that contain it. For example, if you select U+whatever, there's the string of length 1 that contains only "\uwhatever", the string of length 2 that contains "\uwhatever\uwhatever", the string of length 3, and so on.

If it were up to me, I'd probably use a character from one of the Unicode "private use areas" (for example, U+E000 through U+F8FF in the Basic Multilingual Plane). But again, you might eventually receive data containing that character, so you'll need some sort of escaping mechanism anyway.
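To make the escaping point concrete, here is a minimal sketch in Python. It is only one possible approach, not a standard: the choice of U+E000 as the separator, the backslash as the escape character, and the function names escape_field, join_fields and split_fields are all my own assumptions for illustration.

```python
# Assumed for illustration: U+E000 (first code point of the BMP Private Use
# Area) as the field separator and a backslash as the escape character.
SEP = "\uE000"
ESC = "\\"


def escape_field(field: str) -> str:
    """Escape the escape character first, then the separator."""
    return field.replace(ESC, ESC + ESC).replace(SEP, ESC + SEP)


def join_fields(fields):
    """Build one record from a list of field strings."""
    return SEP.join(escape_field(f) for f in fields)


def split_fields(record: str):
    """Split a record back into fields, undoing the escaping."""
    fields, current = [], []
    chars = iter(record)
    for ch in chars:
        if ch == ESC:
            # The character after an escape is taken literally.
            # A lone trailing escape is kept as a literal backslash.
            current.append(next(chars, ESC))
        elif ch == SEP:
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


if __name__ == "__main__":
    data = ["plain", "has\uE000sep", "back\\slash", ""]
    assert split_fields(join_fields(data)) == data
```

With this in place, a field that happens to contain the separator survives a round trip, so the separator only has to be rare (to keep the file readable and small), not impossible. Note that an empty list of fields and a list holding one empty string both produce an empty record, so a real format would need to decide which of the two an empty record means.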

- Oliver
