Re: Boost process and C



On 2006-05-03, websnarf@xxxxxxxxx <websnarf@xxxxxxxxx> wrote:
CBFalconer wrote:
websnarf@xxxxxxxxx wrote:
CBFalconer wrote:
... snip ...
The last time I took an (admittedly cursory) look at Bstrlib, I
found it cursed with non-portabilities

You perhaps would like to name one?

I took another 2 minute look, and was immediately struck by the use
of int for sizes, rather than size_t. This limits reliably
available string length to 32767.
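
As a rough illustration of the restriction described above: the C standard only guarantees that INT_MAX is at least 32767, so a library that stores lengths in an int has to reject anything longer than INT_MAX. The helper name `length_as_int` below is hypothetical and is not taken from Bstrlib.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of Bstrlib): convert a size_t length
   to int, returning -1 when the value does not fit in an int. */
int length_as_int(size_t n)
{
    /* Compare in uintmax_t so the test is correct whatever the
       relative widths of int and size_t are. */
    if ((uintmax_t)n > (uintmax_t)INT_MAX)
        return -1;
    return (int)n;
}
```

On an implementation with 16-bit int, any length above 32767 would take the -1 path; with 32-bit int the cut-off moves to 2147483647.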

[snip]

[...] I did find an explanation and
justification for this. Conceded, such a size is probably adequate
for most usage, but the restriction is not present in standard C
strings.

You're going to need to concede on more grounds than that. There is a
reason many UNIX systems tried to add a ssize_t type, and why TR 24731
has added rsize_t to their extension. (As a side note, I strongly
suspect that Microsoft, in fact, added this whole rsize_t thing to TR
24731 when they realized that Bstrlib, or things like it, actually has
far better real world safety because of its use of ints for string
lengths.) Using a long would be incorrect since there are some systems
where a long value can exceed a size_t value (and thus lead to falsely
sized mallocs.) There is also the matter of trying to codify
read-only and constant strings and detecting errors efficiently
(negative lengths fit the bill.) Using ints is the best choice
because at worst it's giving up things (super-long strings) that nobody
cares about,

I think it's fair to expect the possibility of super-long strings in a
general-purpose string library.

it efficiently allows for all desirable encoding scenarios,
and it avoids any wrap-around anomalies causing under-allocations.

What anomalies? Are these a consequence of using signed long, or
size_t?
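
For reference, one kind of wrap-around that could be meant here (an assumption on my part, since the quoted text does not spell it out) is unsigned arithmetic on size_t: computing `len + 1` when len is already SIZE_MAX wraps to 0, so the following malloc request is far too small. A minimal sketch of a guard against that, using a hypothetical helper name `checked_add_size`:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: store len + extra in *out and return true,
   or return false (leaving *out untouched) if the sum would wrap
   around in size_t. */
bool checked_add_size(size_t len, size_t extra, size_t *out)
{
    if (len > SIZE_MAX - extra)
        return false;           /* len + extra would wrap around */
    *out = len + extra;
    return true;
}
```

A caller would check the return value before passing `*out` to malloc, so the anomaly can be avoided with size_t as well, at the cost of one comparison per size computation.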

If I tried to use size_t I would give up a significant amount of
safety and design features (or else I would have to put more entries
into the header, making it less efficient).

If you only need a single "special" marker value (for which you were
perhaps using -1), you could consider using ~(size_t) 0.

Things will go wrong for at most one possible string length, but that's
more than can be said for using int.
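
A minimal sketch of that suggestion, assuming a made-up header layout (`struct xstr`, `LEN_INVALID` and `xstr_is_valid` are all hypothetical names, not Bstrlib's):

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical marker: the all-ones size_t value, reserved to mean
   "not a valid length". The outer cast keeps the result a size_t
   even if size_t were narrower than int. */
#define LEN_INVALID ((size_t)~(size_t)0)

/* Hypothetical string header, for illustration only. */
struct xstr {
    size_t len;         /* LEN_INVALID marks an error/special string */
    const char *data;
};

/* True when s points to a header whose length is not the marker. */
bool xstr_is_valid(const struct xstr *s)
{
    return s != NULL && s->len != LEN_INVALID;
}
```

The only length that can no longer be represented is SIZE_MAX itself, which no real object can reach once a terminating byte is included.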

But whatever the difference in efficiency, surely "correctness and
safety first, efficiency second" has to be the rule for a
general-purpose library?