Re: header files

From: Karl Heinz Buchegger (kbuchegg_at_gascad.at)
Date: 04/05/04


Date: Mon, 05 Apr 2004 10:01:04 +0200

mike ttoouli wrote:
>
> Hi
>
> I'm a little confused about the reason for using both header and cpp
> files in a project. As all code can be written in the header file
> itself, why use cpp files at all?
>
> I can see there being a redeclaration issue if a header file is
> included multiple times in a large project, but can't this be
> by-passed with #ifndef /#define /#endif ?
>
> I know this is probably very bad form - can anyone tell me the reasons
> for the .h / .cpp system?
>

A few weeks ago I wrote a reply for somebody else concerning
header files, libraries, compilers and linkers and the way they work
together. Maybe it is of some use to you:

*******************************************************************************************

First of all let me introduce a few terms and clarify
their meaning:

source code file           The files which contain C or C++
                           code in the form of functions and/or
                           class definitions.

header file                Another form of source file. Header files
                           are usually used to separate the 'interface'
                           description from the actual implementation,
                           which resides in the source code files.

object code file           The result of feeding a source code file through
                           the compiler. Object code files already contain
                           machine code, the one and only language your computer
                           understands. Nevertheless object code at this stage
                           is not executable. One object code file is the direct
                           translation of one source code file and thus usually
                           contains unresolved external references, e.g. calls to
                           functions which are defined in other source code files.

library file               A collection of object code files. It happens frequently that
                           a set of object code files is always used together. Instead
                           of always listing all those object code files during the
                           link process it is often possible to build a library from
                           them and use the library instead. But there is no magic
                           in a library. A library can be seen as a repository
                           where one can deposit object code files such that the library
                           forms a collection of them.

compiling                  The process of transforming the source code files into
                           object code files. C and C++ define the concept of a 'translation
                           unit'. Each translation unit (normally: one single source code
                           file) is translated independently of all other translation units.

linking                    The process of combining multiple object code files and libraries
                           into an executable. During the linking process all external references
                           of one object code file are examined and the linker tries to find
                           modules which satisfy those external references.

In practice the whole process works as follows:
Say you have 2 source files (with errors, we will return to them later)

main.c
******

int main()
{
  foo();
}

test.c
******

void foo()
{
  printf( "test\n" );
}

and you want to create an executable. The steps are
shown in the following diagram:

       main.c                      test.c
       +----------------+          +-----------------------+
       |                |          |                       |
       | int main()     |          | void foo()            |
       | {              |          | {                     |
       |   foo();       |          |   printf( "test\n" ); |
       | }              |          | }                     |
       +----------------+          +-----------------------+
               |                               |
               |                               |
               v                               v
           **********                      **********
          * Compiler *                    * Compiler *
           **********                      **********
               |                               |
               |                               |
               |                               |
      main.obj v                      test.obj v
      +--------------+                +--------------+
      | machine code |                | machine code |
      +--------------+                +--------------+
               |                               |
               |                               |
               +------------------+ +----------+
                                  | |
                                  v v
                             *************            Standard Library
                            *   Linker    *<----------+--------------------+
                             *************            | eg. implementation |
                                   |                  | of printf or the   |
                                   |                  | math functions     |
                                   |                  |                    |
                                   |                  +--------------------+
                         main.exe  v
                         +-------------------------+
                         | Executable which can    |
                         | be run on a particular  |
                         | operating system        |
                         +-------------------------+

So the steps are: compile each translation unit (each source file) independently
and then link the resulting object code files to form the executable. To do that,
missing functions (like printf or sqrt) are added by linking in a prebuilt library
which contains the object modules for them.

The important part is:
Each translation unit is compiled independently! So when the compiler compiles
test.c it has no knowledge about what happened in main.c and vice versa. When the
compiler compiles main.c it eventually reaches the line
       foo();
where main.c tries to call function foo(). But the compiler has never heard of
a function foo! Even if you have compiled test.c prior to it, when main.c is
compiled this knowledge is already lost. Thus you have to inform the compiler
that foo() is not a typing error and that there is indeed a function called foo
somewhere. You do this with a function prototype:

       main.c
       +----------------+
       | void foo();    |
       |                |
       | int main()     |
       | {              |
       |   foo();       |
       | }              |
       +----------------+
               |
               |
               v
           **********
          * Compiler *
           **********
               |
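
As a rough single-file sketch of the same idea: the prototype comes first, the
call follows, and the definition appears only afterwards (in the two-file
layout it would sit in test.c). The names run_main and foo_calls are not part
of the example above; they are made up here so that the effect can be checked.

```c
#include <stdio.h>

/* Counter added for this sketch only, so the call can be observed. */
static int foo_calls = 0;

/* Prototype: announces name, return type and parameters of foo()
   before the compiler sees the first call. */
void foo(void);

/* Hypothetical stand-in for main() from main.c. */
void run_main(void)
{
    foo();              /* accepted, because the prototype is visible */
}

/* Definition: in the two-file layout this part lives in test.c. */
void foo(void)
{
    ++foo_calls;
    printf("test\n");
}
```

If the prototype line is removed, a current C compiler should reject or at
least warn about the call inside run_main(), because at that point it has not
yet seen anything named foo.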

Now the compiler knows about this function and can do its job. In very much the same way
the compiler has never heard of a function called printf(). printf is not part of
the 'core' language. In a conforming C implementation it has to exist somewhere, but
printf() is not on the same level as 'int' is. The compiler knows about 'int' and
what it means, but printf is just a function call and the compiler has to know its
parameters and return type in order to compile a call to it. Thus you have to inform
the compiler of its existence. You could do this in very much the same way as you
did it in main.c, by writing a prototype. But since this is needed so often and
there are so many other functions available, this quickly becomes tedious and error prone.
Thus somebody else has already provided all those prototypes in a separate file, called
a header file, and instead of writing the prototypes yourself, you simply 'pull in'
this header file and have them all available:

                    test.c
                    +-----------------------+
                    | #include <stdio.h>    |<-+
                    |                       |  |
                    | void foo()            |  |
                    | {                     |  |
                    |   printf( "test\n" ); |  |
                    | }                     |  |
                    +-----------------------+  |
                           |                   |
                           |                   |
                           v                   |
                      **********      stdio.h  v
                     * Compiler *     +-------------------------------------+
                      **********      | ...                                 |
                           |          | int printf( const char* fmt, ... ); |
                                       | ...                                 |
                                       +-------------------------------------+
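
For comparison, the hand-written alternative might look roughly like the
following sketch, which declares printf itself instead of including stdio.h.
The int return type on foo() is an addition made here only so that the result
can be checked; it is not part of the original test.c.

```c
/* Hand-written prototype, used here instead of #include <stdio.h>.
   It has to agree with the declaration the library itself uses. */
int printf(const char *fmt, ...);

/* foo() as in test.c, except that it passes on printf's return value
   (the number of characters written) for checking. */
int foo(void)
{
    return printf("test\n");
}
```

This compiles, but every such declaration has to be typed and kept correct by
hand, which is exactly the burden the header file takes away.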

And now the compiler has everything it needs to know to compile test.c.
Since main.c and test.c now both compile successfully, they can be linked
into the final executable, which can be run. During the process of linking the linker
figures out that there is a call to foo() in main.obj. Thus the linker tries to find
a function called foo. It finds this function by searching through the object
module test.obj. The linker thus inserts the correct memory address for foo
into main.obj and also includes foo from test.obj into the final executable. But
in doing so, the linker also figures out that in function foo() there is a call
to printf. The linker thus searches for a function printf. It finds it in the
standard library, which is always searched when linking a C program. There the
linker finds a function printf and this function is thus included into the
final executable too. printf() by itself may use other functions to do its
work but the linker will find all of them in the standard library and include
them into the final executable.

There is one thing left to talk about. While main.c is correct from a technical
point of view it is still unsatisfying. Imagine that our function foo() has
a much more complicated argument list. Also imagine that your program does not
consist of just those 2 translation units but instead has hundreds of them and
that foo() needs to be called in 87 of them. Thus you would have to write a prototype
in every single one of them. I think I don't have to tell you what that means: All those
prototypes need to be correct and just in case function foo() changes (things like
that happen), all those 87 prototypes need to be updated. So how can you do that?
You already know the solution, you have used it already. You do pretty much
the same as you did in the case of stdio.h. You write a header file and
include this instead of the prototype:

       main.c
       +-------------------+          test.h
       | #include "test.h" |<---------+-------------+
       |                   |          | void foo(); |
       | int main()        |          |             |
       | {                 |          +-------------+
       |   foo();          |
       | }                 |
       +-------------------+
               |
               |
               v
           **********
          * Compiler *
           **********
               |

Now you can include that header file in all the 87 translation units which
need to know about foo(). And if the prototype for foo() needs some update
you do it in one central place: by editing file test.h. All 87 translation
units will pull in this updated prototype when they are recompiled.
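
To tie this back to the #ifndef / #define / #endif question, here is a rough
single-file sketch of what the compiler sees once the preprocessor has pasted
such a header into a translation unit twice. Everything in it is assumed for
illustration: the guard name TEST_H, the struct foo_args and the changed
signature of foo() stand in for the "much more complicated argument list".

```c
/* ---- text of a hypothetical test.h, pasted by the first #include ---- */
#ifndef TEST_H
#define TEST_H

struct foo_args {            /* made-up, richer argument list */
    int count;               /* how many times to print */
    const char *label;       /* what to print */
};

int foo(const struct foo_args *args);

#endif /* TEST_H */

/* ---- same text again, as a second #include "test.h" would paste it ---- */
#ifndef TEST_H               /* TEST_H is already defined: block is skipped */
#define TEST_H

struct foo_args {
    int count;
    const char *label;
};

int foo(const struct foo_args *args);

#endif /* TEST_H */

/* ---- text of a hypothetical test.c: the one and only definition ---- */
#include <stdio.h>

int foo(const struct foo_args *args)
{
    int i;
    for (i = 0; i < args->count; ++i)
        printf("%s\n", args->label);
    return args->count;      /* number of lines printed */
}
```

Without the guard, the second copy of struct foo_args would be a redefinition
and the compiler would refuse it. Note that the guard only protects against
repeated inclusion within one translation unit; it does nothing across
translation units, since each one is compiled independently.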

HTH

-- 
Karl Heinz Buchegger
kbuchegg@gascad.at

