Re: Is there a dedicated unicode separator character?



In article <Xns97D46A8FB1BFCa@xxxxxxxxxxxxx>, Karla <LKarla@xxxxxxx>
wrote:

U+2028 is a line separator
U+2029 is a paragraph separator

Is there a unicode character that means nothing but "field separator"?

I'd like to create a ? separated file that will have the least possible
chance of containing data that accidentally has the field separator in it.

-Karla

I'm no Unicode expert, so I can't say how those characters are meant to
be used. But a "field separator" only makes sense relative to what
counts as a "field" in a given context.

When the context is a command line, the separator between each "field"
or argument is traditionally white space -- one or more characters. In
the Unix awk environment, where I once did tons of scripting, the same
default applied, but it could be changed readily, which made it easy to
handle delimited data -- such as CSV (comma-separated values).
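
As a rough illustration of the awk behaviour described above, here is a
minimal Python sketch. The function name split_fields and the sample
strings are made up for this example. With no separator given it splits
on runs of white space, much like awk's default; with one given it
splits on every occurrence of that string, roughly like setting awk's FS
to a single character.

```python
def split_fields(line, sep=None):
    """Split one line of text into a list of fields.

    sep=None  -> split on runs of white space and ignore leading and
                 trailing white space (similar to awk's default).
    sep="..." -> split on every occurrence of that exact string, so
                 two adjacent separators produce an empty field.
    """
    return line.split(sep)


# Default: any amount of white space separates fields.
print(split_fields("  alpha   beta\tgamma "))

# Changed separator: comma-delimited data, empty field preserved.
print(split_fields("alpha,beta,,gamma", ","))
```

Note that this naive split does not understand quoting, so a comma
inside a quoted CSV value would still be treated as a separator.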

So...your context determines what a "field separator" will be.
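
One candidate worth noting for the original question: Unicode inherits
the ASCII "information separator" control characters, including U+001F
(unit separator) and U+001E (record separator). They are not dedicated
to any one file format, but they rarely occur in ordinary text. The
sketch below assumes U+001F between fields and U+001E between records;
that pairing, and the names encode and decode, are choices made for this
example rather than a standard file format. Since no character is
guaranteed to be absent from the data, the encoder rejects fields that
already contain a separator.

```python
US = "\x1f"  # U+001F UNIT SEPARATOR, used here between fields
RS = "\x1e"  # U+001E RECORD SEPARATOR, used here between records


def encode(records):
    """Join a list of records (each a list of strings) into one string."""
    for record in records:
        for field in record:
            # Refuse data that would be mistaken for a separator.
            if US in field or RS in field:
                raise ValueError("field contains a separator character")
    return RS.join(US.join(record) for record in records)


def decode(text):
    """Split a string produced by encode() back into records."""
    # Pass the separators explicitly: Python's no-argument split()
    # counts these control characters as white space.
    return [record.split(US) for record in text.split(RS)]


# Commas, spaces and newlines in the data need no escaping.
records = [["Smith, John", "line1\nline2"], ["a b", ""]]
assert decode(encode(records)) == records
```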

= Steve =
--
Steve W. Jackson
Montgomery, Alabama