Re: Does strtok require a non-null token?



William Hughes wrote:
Default User wrote:
William Hughes wrote:

ryampolsky@xxxxxxxxx wrote:
I'm using strtok to break apart a colon-delimited string. It
basically works, but it looks like strtok skips over empty
sections. In other words, if the string has 2 colons in a row, it
doesn't treat that as a null token, it just treats the 2 colons as
a single delimiter.

Is that the intended behavior?
Yes. Just one more reason to avoid strtok().
Unless that's the behavior you want. Example, breaking lines into words
with white space. You don't want a bunch of "null" words.

The point is not that the function's behaviour is not sometimes
what you want. The point is

-the default behaviour is surprising
Perhaps. About the only thing surprising to me was that the argument you pass it is affected.

-the default behaviour is not even
usually what you want
So far I've only ever needed the default behaviour with respect to collapsing adjacent delimiters. In fact, I *expected* this! That is, for the majority of the reasons I need to tokenize a string, this default behaviour is exactly what I want.

-the default behaviour throws information away

Not sure what you mean here, but I assume you are referring to how it munges its argument. I guess I just never care about this because we always store strings in a struct that is passed around, or make copies of things we tokenized and care about.
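
To make both points concrete (adjacent delimiters collapsing, and the argument being modified), here is a minimal sketch. The function name count_tokens is made up for illustration; only strtok() itself is standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() finds in buf when
 * splitting on ':'. buf must be writable, because strtok() overwrites
 * the delimiter that ends each token with '\0'. */
size_t count_tokens(char *buf)
{
    size_t n = 0;

    assert(buf != NULL);
    for (char *tok = strtok(buf, ":"); tok != NULL; tok = strtok(NULL, ":"))
        n++;   /* a run of ':' is skipped as one separator, so no empty token */
    return n;
}
```

Called on a writable array holding "a::b:c", this finds three tokens rather than four, and afterwards the array reads as just "a" because the first colon has been replaced by a terminator. Passing a string literal instead of a writable array is undefined behaviour.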

-if you don't like the default behaviour, see
figure 1.

I assume figure 1 is a picture of your own implementation that has non-default requirements :)
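
For what it's worth, a splitter that keeps empty fields does not have to be long. The following is only a sketch under my own assumptions: next_field is a hypothetical name, it behaves roughly like the BSD strsep() function, and like strtok() it writes into the buffer it is given.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the next field starting at *cursor and
 * advances *cursor past the delimiter that ended it. An empty field is
 * returned as "". Returns NULL once the input is exhausted.
 * The buffer is modified: each delimiter found is overwritten with '\0'. */
char *next_field(char **cursor, const char *delims)
{
    assert(cursor != NULL && delims != NULL);

    char *start = *cursor;
    if (start == NULL)
        return NULL;                       /* nothing left to scan */

    char *end = start + strcspn(start, delims);
    if (*end != '\0') {
        *end = '\0';                       /* terminate this field */
        *cursor = end + 1;                 /* resume after the delimiter */
    } else {
        *cursor = NULL;                    /* this was the last field */
    }
    return start;
}
```

Starting with a cursor pointing at a writable "a::b", successive calls with ":" as the delimiter set yield "a", then an empty string, then "b", then NULL. The state lives in the caller's cursor rather than in a hidden static, so two strings can be split at the same time.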

Personally I'm with the Linux man pages on this one. Under Bugs
is the advice "Never use this function".

Well, I'll ignore this advice. For the trivial case of needing to tokenize a string to store in my own array of buffers, it works just fine.

For those requirements that strtok() does not fit we have our own internal tokenizing routines. If all I need is to parse out (say) a bunch of email addresses passed as a list and store them in a char** [which was the last time I used strtok()] then it fits perfectly. In this case I don't even care if the calling code screwed up the list. I either get one or more valid strings or I don't. I return success or failure and let them howl!
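
The address-list case looks roughly like the sketch below. This is not the actual code I used; split_list and its signature are invented here, and it assumes the caller does not mind the input being modified and will free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits list on any character in delims and returns
 * a NULL-terminated array of malloc'd copies, storing the token count in
 * *count. Returns NULL on allocation failure or if no tokens were found.
 * list is modified by strtok(). */
char **split_list(char *list, const char *delims, size_t *count)
{
    assert(list != NULL && delims != NULL && count != NULL);

    char **out = NULL;
    size_t n = 0;

    for (char *tok = strtok(list, delims); tok != NULL; tok = strtok(NULL, delims)) {
        /* Room for the new entry plus the terminating NULL pointer. */
        char **grown = realloc(out, (n + 2) * sizeof *out);
        char *copy = grown ? malloc(strlen(tok) + 1) : NULL;

        if (copy == NULL) {
            /* Out of memory: release everything gathered so far. */
            if (grown != NULL)
                out = grown;
            while (n > 0)
                free(out[--n]);
            free(out);
            *count = 0;
            return NULL;
        }
        strcpy(copy, tok);
        out = grown;
        out[n++] = copy;
        out[n] = NULL;
    }
    *count = n;
    return out;
}
```

Because strtok() collapses runs of delimiters, a list with a doubled comma such as "a@x.org,,b@y.org" simply produces two entries, which is the "I don't care if the caller screwed up the list" behaviour described above.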

Of course, if I'd been bitten by the function in the past, I'd be arguing differently.

Many of the str_ routines in the Standard have some legacy use that explains design decisions [e.g., strncpy() and database column width]. I wonder if strtok() also has history that explains why the defaults cause so much consternation?


