Re: run-time vs compile-time
From: newbiecpp (newbiecpp_at_yahoo.com)
Date: 09/09/04
- Previous message: Robert W Hand: "Re: where is the template instance?"
- In reply to: Jonathan Mcdougall: "Re: run-time vs compile-time"
- Next in thread: Niklas Borson: "Re: run-time vs compile-time"
- Reply: Niklas Borson: "Re: run-time vs compile-time"
- Reply: John Harrison: "Re: run-time vs compile-time"
- Reply: Jonathan Mcdougall: "Re: run-time vs compile-time"
Date: Thu, 09 Sep 2004 13:27:53 GMT
"Jonathan Mcdougall" <jonathanmcdougall@DELyahoo.ca> wrote in message
news:URK%c.77536$fU6.1015196@wagner.videotron.net...
> > I have hard time to understand run-time environment.
>
> It is possible I messed up some things in my explanations; I am neither
> a compiler implementor nor an operating system developer. Wait for some
> corrections to appear before believing all that.
>
>
> Let's review the common steps :
>
> 1) Writing source code
> You specify, in a given language, the commands to be sent to the
> computer. In C++, that means creating classes and functions and
> using them. An object O, for example, is only a name, something
> for the programmer to make his life easier.
>
> Always think of a programming language in terms of assembly
> language : the machine does not understand classes, objects,
> inheritance or whatever. All this high-level code is translated
> into machine code, so all your variable names, functions, cool
> class hierarchies will be replaced by memory locations, additions
> and subtractions. In fact, high level languages like C++ could
> be viewed only as "syntactic sugar", since everything you do
> could be done in assembly language (that's what your code will
> become ultimately anyway).
>
> 2) Compiling
> Nothing really interesting here, syntax checking and translation
> into intermediate code. The only thing here is that all names
> are checked, so for example
>
> // main.cpp
> int main()
> {
> a = 2;
> }
>
> will fail to compile since 'a' does not exist anywhere. The
> compiler makes sure every name is _potentially_ defined, so the
> linker will only have to try to find them. So when you write
>
> // main.cpp
> extern int a;
>
> int main()
> {
> a = 2;
> }
>
> the compiler tells the linker : "'a' refers to some int defined
> somewhere else, you'll have to find it. good luck".
>
> 3) Linking
> That's getting interesting. The linker makes sure everything is
> in place : each use of a name is resolved to its definition. If
> that definition does not exist or if it exists more than once,
> an error is generated (in most cases). The linker uses the
> information generated by the compiler to associate names. By
> using the example above, it searches other files (actually
> "translation units") for a definition of 'a'. If it finds it,
> for example with
>
> // another file.cpp
> int a=0;
>
> , it associates 'a' in main.cpp with the 'a' in file.cpp. If 'a'
> is not found anywhere, it stops. If two or more 'a's are found,
> it stops since there is ambiguity. That is the basis of the ODR
> (One Definition Rule).
>
> The linker then replaces every data access with an address, but
> don't get it wrong : that address is not the real address in
> memory. Actually, you could see that address as an offset in the
> real memory. When the operating system runs the program, it
> loads it somewhere and reserves some space in memory for that
> program. Each variable/address/offset is resolved by the OS to
> that memory space.
>
> How? An easy answer could be "automagically". The real long
> answer perhaps could be provided by someone in a newsgroup
> supporting your operating system. The thing is, that kind of
> detail is left to the implementation : the C++ standard only
> mandates _behavior_, not implementation. So as long as it
> behaves as specified, an implementation can do anything it wants.
>
> You must understand that the only time memory is allocated, and
> therefore real addresses are defined, is when the program is
> executed, not during compilation or linking.
>
> Please note that this is a bit of an oversimplification.
>
>
> > Let's assume that I have
> > a program that has a simple variable alpha. When this variable is
> > statically allocated, the compiler can use the absolute address of alpha
> > to access it.
>
> What's important to understand is that nothing is allocated when you
> compile/link a program. The program merely becomes some machine code.
> The operating system is then in charge of running it. Yes, the linker
> assigns some addresses to your variables, but these do not refer to a
> specific place in memory. As I said, you could consider these
> "addresses" to be offsets in memory, the starting address being defined
> by the operating system.
>
> For what you said specifically, know that static data is handled
> differently by most implementations. Actually, three types of data are
> typically recognized : the stack, the heap and the static data.
>
> After re-reading, what do you mean by "statically allocated"? On the
> stack such as
>
> int main()
> {
> int i = 0;
> }
>
> or really statically allocated such as
>
> int main()
> {
> static int i = 0;
> }
>
> Be sure to distinguish between the two.
>
> > What confuses me is that when the variable is dynamically
> > allocated, how does the compiler implement it?
>
> That's something else, though not entirely different. What I described
> about the linker only applies to the stack. For the heap, it works a
> bit differently.
>
> The operating system manages a pool of memory typically called "heap" or
> "free store". That memory is different from the stack because, first of
> all, of its longevity. For example, by using the stack :
>
> void f()
> {
> int i=0; // i is on stack
> }
>
> you have no way of extending the life of 'i'. Once the function
> terminates, 'i' is destroyed. This is mandated by the C++ standard and
> it is the usual practice in all languages, including assembly, so it's
> no big deal.
>
> The heap however is a pool of memory. You can reserve and release
> memory when you want. In practice, the operating system typically has
> some functions for allocating and releasing memory at low-level,
> usually in big chunks (several kilobytes). These functions are then
> used by the malloc() system, which usually maintains another pool of
> memory itself. Finally, operator new is usually implemented in terms of
> malloc().
>
> The low level OS functions return the address of the allocated memory on
> the heap. That address can be absolute (which is pretty rare today) or
> relative, allowing in particular some sort of protection. Relative
> addressing (also called virtual addressing) behaves like the relative
> stack address I described earlier.
>
> So actually, what the OS returns is only an address, so neither the
> compiler nor the linker has anything to do with that. That address is
> determined by the OS at run-time, depending on the loaded programs and
> the content of the heap. These functions can fail if the heap is full.
>
> > We know the address of the
> > variable until run-time.
>
> That should read : "We don't know the real address of the variable until
> run-time", since the program is not running. For memory to be
> allocated, the program must run! Compiling it only translates it into
> machine code and assigns "dummy" addresses which have to be resolved by
> the operating system.
>
> > During the compilation, how can we access to the
> > alpha variable since we don't know its address yet?
>
> As I said earlier, the linker uses some kind of relative address which
> is resolved by the operating system. If you are asking what the
> compiler/linker does with
>
> void f(int &i)
> {
> i = 2;
> }
>
> to know what 'i' refers to, well it is only a matter of keeping a list
> of the variables in a given scope with their names. When you do
>
> int main()
> {
> int a = 10;
> f(a);
> }
>
> The compiler enters 'a' in its list for main() and associates 'i' in f()
> with it. That's a simple association map. Once every name has been
> resolved, the linker only has to take that list of variables and give
> them addresses. The operating system is then in charge, when running
> the program, of allocating memory and resolving the addresses made by
> the linker to the real addresses in memory.
>
> It is important for you to understand that all the things I just
> explained are not specified by the C++ standard, and therefore you
> cannot rely on them, although it is commonly implemented that way.
> Remember : C++ describes behavior, not implementation.
>
>
> Jonathan
Thank you very much. I really appreciate your time and help.
My confusion comes from this: C++ says that static allocation, such as
static int i;
is bound at compile-time, while dynamic allocation, such as
int* pi = new int;
is bound at run-time. I understand now that for i, the compiler can record an
offset relative to some location (like the stack base). But I am still
confused about dynamic binding. To me, the compiler could just as well
record an offset from the heap for pi. So why do we call it run-time binding?
To me, both are decided at compile-time, that is, the compiler records an
offset from the stack or the heap. At run-time, the OS decides the addresses
of the stack and the heap so that we have the real addresses of i and pi. So
why do we call one compile-time binding and the other run-time binding? Or is
my understanding wrong?
I appreciate your insight and your time.