Re: A C showstopper



On 2009-08-23, spinoza1111 <spinoza1111@xxxxxxxxx> wrote:
To develop the "unlimited string" processor which I discuss in the
thread A C Adventure, I have to recreate my utilities library of 1991,
which I did because then (and now) one needs to use the printf
approach to format data: but if one needs, as one often does need, to
format to storage, sprintf has a built-in danger that as far as I know
(and correct me if I'm wrong) nobody has or will fix inside of C.

The problem is that any sprintf whatsoever, insofar as it formats
strings, has no control over string length. Most code I've seen
decides to create some silly "buffer" of some silly size.

But in

sprintf(buf, "%s\n", ptr)

characters past the allocated end of "buf" may be overwritten.
The C99 solution (limit the number of characters) is extraordinarily

The alternative to limiting the number of characters is not to do the print
at all and report that it could not be done.

C99 provides both alternatives in one. The print is truncated, and the program
can tell that this happened.
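
As a minimal sketch of that, assuming a hypothetical helper named format_line (not from the original post): snprintf returns the length the complete output would have had, so a return value greater than or equal to the buffer size means the result was cut short.

```c
#include <assert.h>
#include <stdio.h>

/*
 * format_line is a hypothetical helper, not from the post.
 * Writes text plus a newline into buf (capacity size) and returns
 * 1 if everything fit, 0 if the output was truncated or an error occurred.
 */
static int format_line(char *buf, size_t size, const char *text)
{
    int needed;

    assert(buf != NULL && size > 0 && text != NULL);

    /* snprintf stores at most size - 1 characters plus the terminator,
       and returns the length the full output would have had. */
    needed = snprintf(buf, size, "%s\n", text);

    return needed >= 0 && (size_t)needed < size;
}
```

The buffer is never overrun in either case; the caller decides whether a truncated result is acceptable or should be treated as a failure.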

poor and reinforces my Dim View of the ethics of people in that
effort: some of them participated in the "get Schildt" campaign, and
none of them seems to have been competent working programmers.

In 1991, I implemented the GNU solution, but I don't have the code
anymore. This is asprintf; its contract is to always allocate enough
memory to hold the final string.

Always?

What if the format string dynamically specifies a huge field width?

asprintf("%*s...", INT_MAX, ...)

What if there is no memory?

What if there is overcommitting memory allocation going on?
The space appears to be allocated, but blows up when you try to use it?

``Always'' is a big word.

There are two ways of implementing asprintf. Either iterate formatting
until everything's formatted, reallocating larger and larger blocks of
memory (and copying previously formatted output characters).

If the reallocation is exponential, then each previously formatted
character is copied a constant number of times (on average over all
of the characters ever formatted). So the copying overhead is fixed,
and you do avoid performing conversions twice.
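
To illustrate the exponential growth policy, here is a minimal sketch of a doubling output buffer that a formatter could append converted characters to. The strbuf type and strbuf_append are hypothetical names, not from any library or from the 1991 code, and the starting capacity of 16 is an arbitrary assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* strbuf and strbuf_append are hypothetical names, not from the post. */
struct strbuf {
    char *data;   /* NUL-terminated contents, or NULL before the first append */
    size_t len;   /* characters stored, excluding the terminator */
    size_t cap;   /* bytes allocated */
};

/* Append n bytes from src, doubling the capacity as needed.
   Returns 0 on success, -1 on failure (the buffer is left unchanged). */
static int strbuf_append(struct strbuf *sb, const char *src, size_t n)
{
    assert(sb != NULL && (src != NULL || n == 0));

    if (n >= (size_t)-1 - sb->len)        /* len + n + 1 would overflow */
        return -1;

    if (sb->len + n + 1 > sb->cap) {
        size_t new_cap = sb->cap ? sb->cap : 16;  /* arbitrary starting size */
        char *p;

        while (new_cap < sb->len + n + 1) {
            if (new_cap > (size_t)-1 / 2)
                return -1;
            new_cap *= 2;
        }
        p = realloc(sb->data, new_cap);   /* may copy the existing output */
        if (p == NULL)
            return -1;
        sb->data = p;
        sb->cap = new_cap;
    }

    if (n > 0)
        memcpy(sb->data + sb->len, src, n);
    sb->len += n;
    sb->data[sb->len] = '\0';
    return 0;
}
```

Because the capacity doubles, the total number of bytes realloc can ever copy stays within a constant multiple of the final length. The caller starts from a zeroed struct and frees data when done.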

Or make a
pass through the data without doing output to find the size needed. My
1991 solution was the former. Today, when I write the code
I shall use the latter solution.

The latter solution can be written as a trivial wrapper around the C99 snprintf
function. Start with a zero-sized buffer and call snprintf (or rather
vsnprintf). The function won't try to store any characters, but it will still
return the number of characters that would have been written. Then, allocate a
buffer which can hold that many characters and call the function again.

Here it is:

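#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>

/* Note: glibc declares an asprintf with a different signature
   (int asprintf(char **, const char *, ...)) when _GNU_SOURCE is
   defined; rename this function if the two clash. */
char *asprintf(const char *fmt, ...)
{
    int chars_written;

    /* compute size without printing */
    {
        va_list vl;
        char dummy_buf[1];
        va_start(vl, fmt);
        chars_written = vsnprintf(dummy_buf, 0, fmt, vl);
        va_end(vl);
    }

    /* a negative return value means the formatting failed */
    if (chars_written < 0)
        return 0;

    /* allocate and print */
    {
        va_list vl;
        /* convert to size_t before adding 1 so INT_MAX cannot overflow */
        char *buf = malloc((size_t) chars_written + 1);
        va_start(vl, fmt);

        if (buf != 0)
            vsnprintf(buf, (size_t) chars_written + 1, fmt, vl);
        va_end(vl);

        return buf;   /* null if malloc failed; the caller frees it */
    }
}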

The GNU solution, not the C99 solution, is the one that occurs to the
competent programmer: the C99 solution cuts off the user at the knees
blindly and is the sort of solution that occurs to managers...not
competent programmers.

As you can see, the C99 solution can be used to implement the GNU solution,
using exactly the approach that you advocate as the one you would use today:
measure the buffer first without performing any output, allocate, then
do the output.