Re: to free() or not to free() in lex/yacc



Berk Birand wrote:
Hi,

I am working on a school project where we use lex/yacc to write a compiler
for a fictional (Java-like) language. I have handled all the details about
the yacc and lex files, but I still have a question regarding the dynamic
memory allocation for strings. When the lex file encounters a variable
name, I want it to pass this to yacc through yylval, and then
retrieve it to add to a syntax tree. For this purpose, I wrote the
following code in the lex file (third section), that will save the
variable in yylval:

void copy_name(char** dst, char* yy) {
    int len;

    //free(*dst);

    // allocate memory for the new string
    len = strlen(yy);
    *dst = (char*) malloc(len * sizeof(char) + 1);
    // copy the string
    strcpy(*dst, yy);
}

It's basically a wrapper around strcpy(), which also allocates some memory
through malloc().

In the lex file, I have something like this:

{LETTER}({LETTER}|{DIGIT}|"_")*    { copy_name(&(yylval.Name), yytext);
                                     return(NAME); }

Now my question is whether I should keep the call to free in copy_name or
not. I would think that I need to do that, otherwise new memory would be
allocated each time a new variable is found, and I'd be facing memory
leaks. Yet when I uncomment that line, I start to get segfaults, for
reasons that I can't understand.

Can anybody with more expertise with lex/yacc help me out with this
problem?

Thank you,
Berk Birand


There are several solutions to this. In order from the easiest to the
most complex:

1) Use a garbage collector for C. Some compilers ship one as standard in
   their distributions (lcc-win32, for example). If yours does not,
   google for "Boehm's garbage collector". The advantage is that you only
   allocate memory and never free it yourself. Problem gone.
2) If that doesn't help, look again at your problem. Why do you want to
   free the space allocated for the names? It may be a much better
   strategy to just keep allocating memory and let it all be released
   automatically when your compiler exits. I have used this strategy in
   many parsers and compilers that I use. The total amount of memory is
   small, and the names are used everywhere later in the program, so it
   is just a waste of time to micro-manage each small piece of storage.
3) If that doesn't help, allocate one big buffer for names at the start
   of your program, say 256K, and place each name in that buffer. When
   you are done with all the names, release them with a single call to
   free() on the buffer, without caring about the individual names.
   If you run out of space, allocate further buffers and keep them in a
   linked list.
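
A rough sketch of option 3, assuming C99 and nothing beyond the standard
library. The names name_chunk, name_pool_copy, name_pool_free_all and
NAME_CHUNK_SIZE are invented for this illustration; only the 256K figure
and the linked list of buffers come from the description above. It
allocates the first buffer lazily on the first name rather than at program
start, which is a simplification on my part.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define NAME_CHUNK_SIZE (256 * 1024)   /* "say, 256K" per buffer */

/* One buffer in the linked list of name buffers (hypothetical type). */
struct name_chunk {
    struct name_chunk *next;   /* previously filled buffer, or NULL */
    size_t used;               /* bytes handed out so far */
    size_t size;               /* capacity of data[] */
    char data[];               /* C99 flexible array member */
};

static struct name_chunk *name_chunks = NULL;   /* newest buffer first */

/* Copy a NUL-terminated name into the pool.
   Returns the pooled copy, or NULL if malloc() fails. */
char *name_pool_copy(const char *name)
{
    assert(name != NULL);
    size_t need = strlen(name) + 1;             /* include the '\0' */

    /* Start a new buffer if there is none yet or the current one is full.
       Any space left in the old buffer is simply not used again. */
    if (name_chunks == NULL || name_chunks->size - name_chunks->used < need) {
        size_t size = need > NAME_CHUNK_SIZE ? need : NAME_CHUNK_SIZE;
        struct name_chunk *c = malloc(sizeof *c + size);
        if (c == NULL)
            return NULL;
        c->next = name_chunks;
        c->used = 0;
        c->size = size;
        name_chunks = c;
    }

    char *dst = name_chunks->data + name_chunks->used;
    memcpy(dst, name, need);
    name_chunks->used += need;
    return dst;
}

/* Release every name at once. All pointers previously returned by
   name_pool_copy() are invalid after this call. */
void name_pool_free_all(void)
{
    while (name_chunks != NULL) {
        struct name_chunk *next = name_chunks->next;
        free(name_chunks);
        name_chunks = next;
    }
}
```

With something like this, the action in the lex rule would set
yylval.Name = name_pool_copy(yytext) and the program would call
name_pool_free_all() once, after the syntax tree is no longer needed.
Individual names are never passed to free().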

have fun

jacob