Re: embedded questions!!!



On 13 Jan 2006 21:38:47 GMT, "John B"
<spamj_baraclough@xxxxxxxxxxxxxxxxxxx> wrote:

>On 13/01/2006 the venerable Jonathan Kirwan etched in runes:
>
>> On Fri, 13 Jan 2006 20:23:02 GMT, I wrote:
>>
>> > <snip>
>> > Would a c compiler be allowed to "fold" these two constant arrays so
>> > that they occupy the exact same memory?
>> > <snip>
>>
>> Actually, I think folding two strings together (which is an option on
>> some compilers) may be against the standard. So that part of the
>> question may not be relevant. But the subtler question remains as to
>> whether or not the string initializers themselves should be considered
>> unmodifiable by a programmer. I'd argue that they must be considered
>> unmodifiable, as some operating systems/compilers may place these
>> constants in "program text areas" or otherwise in read-only protected
>> memory. So that when saying:
>>
>> char *s1= "hello";
>>
>> you are doing something like,
>>
>> char *s1= (char *) ((const char []) { "hello" });
>>
>> Jon
>
>We must also remember that C was devised for use on a machine with
>Von Neumann architecture.

The PDP-11, for one.

>Many modern microcontrollers use a Harvard architecture

I'm intimately aware of that.

>and this presents the compiler writers with the
>problem of where to put unmodifiable data.

Some thoughts on that:

The problem presented with any compiler used for dedicated embedded
situations is all the work it takes to meet the spec before arriving
at main(). At the time c was being developed, the current thinking
about running program environments included the following functional
classifications:

Segment Name    Segment Description
-------------------------------------------------
CODE            Code section
CONST           Constant data section
INIT            Initialized data section
BSS             Uninitialized data section
HEAP            Heap section
STACK           Stack section

[Actually, the very concept of 'stack' as a general purpose workhorse
was still in its infancy -- many of the existing and commercially
successful machines did NOT support, via hardware, the idea of a stack
and a great deal of code had been written completely without them
except as a specialized concept for certain problems. (I worked on
such operating systems and languages.) Heap was kind of new, too. The
PDP-11 was only just out around 1970, or so, to light the way out of
the darkness. :) ]

In Von Neumann, all of these are in the same memory addressing system.
Modern concepts weren't completely worked out and the PDP-11 included
support for several equally good conventions. But the gist of the
above is that stack would grow down, heap grow up, and that the other
four sections were each of fixed size at start. Only the CODE, CONST
and INIT sections needed to be kept "on disk" or in some form of
non-volatile storage (which could, of course, include cards, tape, or
whatever.) Neatly, the non-volatile portions are all of fixed size.

In other words, like this:

Section Description    Access       NV?   Size
=================================================================
Code                   Execute      Yes   Fixed/static
Constants              Read         Yes   Fixed/static
Initialized Data       Read/Write   Yes   Fixed/static
Uninitialized Data     Read/Write   No    Fixed/static
Heap                   Read/Write   No    Variable, up
Stack                  Read/Write   No    Variable, down
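
In c terms, the rows of the table correspond roughly to the declarations
in the sketch below. The section named in each comment is what typical
toolchains do, not something the standard promises, and every
identifier here is made up for the illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Constants: read-only, fixed size, must survive power-off. */
static const char banner[] = "hello";

/* Initialized data: read/write, but its start value has to be kept
   in non-volatile storage. */
static int retry_limit = 3;

/* Uninitialized data: read/write, no explicit initializer. The
   standard requires it to start as zero, so nothing is stored. */
static int error_count;

/* Code: the function body itself. */
int sections_demo(void)
{
    int local = retry_limit;             /* Stack: automatic storage */
    char *copy = malloc(sizeof banner);  /* Heap: allocated at run time */

    assert(sizeof banner == 6);          /* five letters plus the NUL */
    if (copy == NULL)
        return -1;
    memcpy(copy, banner, sizeof banner);
    error_count += local;                /* writable, unlike banner[] */
    local = (int)strlen(copy) + error_count;
    free(copy);
    return local;
}
```

Calling sections_demo() twice gives different results because
error_count persists between calls, while banner[] and the initial
value of retry_limit never change.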

If you look at the above list and think about Harvard architectures
and c programming generally, you find that the code must be placed in
code memory while the others must all be placed in data memory. On
Von Neumann, this is the same memory system. On Harvard, two
different ones, at least.

But this is NOT a problem, really. Even in the Harvard case. In
fact, it's not too far from how an operating system would do it under
the Intel 80386 and above, if it wanted to implement an execute-only
region for the code (you can't read it as data.) And the 80386 is NOT
Harvard.

Also, keep in mind that it is still the case that the first three must
be stored somewhere in non-volatile memory. In the case of Von
Neumann systems with flash on-chip, this is fairly easy -- just place
it there. Both code and data can be accessed without having to move
any of it around.

A question for c in embedded use for Harvard comes in the use of
pointers. A pointer to code memory may NOT occupy the same memory
footprint as a pointer to data memory (in other words, sizeof() for
the two pointer types may be different) and the actual instructions
used to access these different types of memory may be different. The
different size can be fixed by requiring the larger of the two sizes
for all (in other words, making a union.) And code generation can
simply depend on the declaration of the pointer. I believe casting
can also be handled. So, frankly, neither of these is insurmountable
and it is quite possible for a c compiler to accept straight c code
and generate functioning programs on Harvard machines without special
decorations/declarations.
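
A minimal sketch of the "larger of the two sizes" idea, assuming a
function pointer stands in for a pointer to code memory and void * for
a pointer to data memory. The type names are hypothetical. On a desktop
host the two sizes are usually equal; on a Harvard part they may not be.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*code_ptr)(void);   /* pointer into code memory */
typedef void *data_ptr;           /* pointer into data memory */

/* A union is at least as large as its largest member, so one object
   of this type can hold either kind of pointer. */
union any_ptr {
    code_ptr code;
    data_ptr data;
};

/* Size a compiler would need if it used one footprint for both. */
size_t larger_ptr_size(void)
{
    size_t c = sizeof(code_ptr);
    size_t d = sizeof(data_ptr);

    assert(sizeof(union any_ptr) >= c && sizeof(union any_ptr) >= d);
    return c > d ? c : d;
}
```

Which member of the union is live still has to be known from the
declaration, which is the "code generation can simply depend on the
declaration" part.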

For Harvard, a re-definition of the Von Neumann layout is in order, if
you want to be able to port code as easily as to another Von Neumann
system. Something like these functional areas:

Segment Name    Segment Description
-------------------------------------------------
CODE            Code section
CONST_copy      Data for constant section
INIT_copy       Data for initialized data section
CONST           Constant data section
INIT            Initialized data section
BSS             Uninitialized data section
HEAP            Heap section
STACK           Stack section

In this case, the first three must be placed in non-volatile memory --
flash, for example. And the remaining five can be placed in volatile
memory. At start, pre-main() code copies CONST_copy into CONST and
INIT_copy into
INIT before starting main(). If this is done, then once again all
data memory is accessible as data. And Harvard works consistently
with c's model, I think, in this case.
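
The pre-main() copy step can be sketched roughly as follows. This is a
host-side simulation, not real startup code: the arrays are
hypothetical stand-ins for the sections, and on an actual Harvard part
the reads from the _copy images would need the target's code-space
read instruction rather than plain memcpy(). Zeroing BSS is added here
because the standard requires it; it is not part of the copy scheme
described above.

```c
#include <assert.h>
#include <string.h>

/* Stand-ins for the non-volatile images (would live in flash). */
static const unsigned char const_copy[] = { 'h', 'i', 0 };
static const unsigned char init_copy[]  = { 1, 2, 3, 4 };

/* Stand-ins for the RAM sections the c program actually sees. */
static unsigned char const_ram[sizeof const_copy];
static unsigned char init_ram[sizeof init_copy];
static unsigned char bss_ram[8];

/* What pre-main() startup code would do before calling main(). */
void startup_copy(void)
{
    assert(sizeof const_ram == sizeof const_copy);
    memcpy(const_ram, const_copy, sizeof const_copy); /* CONST_copy -> CONST */
    memcpy(init_ram,  init_copy,  sizeof init_copy);  /* INIT_copy  -> INIT  */
    memset(bss_ram, 0, sizeof bss_ram);               /* BSS starts as zero  */
}
```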

For embedded Harvard processors -- the only difficult problem in the
above is if there are _no_ instructions which can read from code space
and if the code space is the only non-volatile memory present. In
such cases, I believe, space will have to be reserved in data memory
and code instructions must be able to use immediate-mode constants
they can load into registers and then place into data memory to
initialize them to specific values. That would be painful, but
doable, if instructions support some form of immediate mode and there
is enough code space, of course.
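
In c-like terms, that fallback amounts to something like the sketch
below, where every start value is carried inside an instruction rather
than read from a table. The names are hypothetical, and a host compiler
is free to turn this back into a block copy; the point is only the
shape of the code a Harvard compiler would have to emit.

```c
#include <assert.h>

/* Space reserved in data memory for one initialized object. */
static unsigned char init_area[4];

/* One immediate load plus one store per byte: painful, but it needs
   no instruction that reads data out of code space. */
void startup_immediate(void)
{
    assert(sizeof init_area == 4);
    init_area[0] = 1;
    init_area[1] = 2;
    init_area[2] = 3;
    init_area[3] = 4;
}
```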

So the bottom line, I think, is that Harvard really isn't exactly an
insurmountable problem for c compilers accepting unvarnished c code,
granting an instruction type or two on the target.

That is, until you worry about practical things like scarce resources
-- such as RAM. It is one thing that INIT_copy needs to be copied
into INIT. There is no avoiding the need to use RAM for initialized
data that can be later modified. It has to be in RAM. But CONST_copy
must also then be copied into RAM where it can be accessed as data and
it would be nice if, instead, those constants could just remain in the
code space and not take up RAM resources at run-time.

So that can be a bad thing. There may not be very much RAM to go
around. So suddenly you have a strong desire to access data that sits
in code space (if data space doesn't include non-volatile memory.) But
if you cave in to that desire then you may have another problem,
passing around pointers to data which may be either in code space or
data space. In that case, you either need to fashion data pointers
which support both (and that will likely balloon the code as well as
slow execution time) or else have the compiler emit code for one kind
of access where that routine then cannot accept pointers to the other
space. I was faced with this problem, for example, using the PIC
chips -- in a routine that was basically a "printf()" accepting
strings which could _either_ be in a RAM buffer _or_ constant literals
located in code space.
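
A minimal sketch of the first option, a data pointer that supports
both spaces. Everything here is hypothetical: the struct, the enum and
the two read helpers are invented for illustration, and on a host both
helpers are ordinary loads, whereas on a real part read_code_byte()
would wrap whatever code-space read the target offers.

```c
#include <assert.h>
#include <stddef.h>

enum mem_space { SPACE_DATA, SPACE_CODE };

/* A "fat" pointer: which space, plus an address within it. */
struct any_cptr {
    enum mem_space space;
    const char *addr;
};

/* Stand-ins for the two different access instructions. */
static char read_code_byte(const char *addr) { return *addr; }
static char read_data_byte(const char *addr) { return *addr; }

/* Every access pays for a run-time test of the space tag; this is
   where the extra code size and execution time come from. */
static char read_any(struct any_cptr p, size_t i)
{
    return p.space == SPACE_CODE ? read_code_byte(p.addr + i)
                                 : read_data_byte(p.addr + i);
}

/* A routine that, like the printf() case, accepts a string in
   either space. */
size_t any_strlen(struct any_cptr s)
{
    size_t n = 0;

    assert(s.addr != NULL);
    while (read_any(s, n) != '\0')
        n++;
    return n;
}
```

The second option drops the tag and the test, at the cost of needing
one copy of any_strlen() per memory space.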

So vendors do expand things by adding type qualifiers or #pragma
statements. But mostly to be competitive and sell their product --
not so much because they absolutely have to. Harvard can be painful
for vanilla c, and it can be uncompetitive with decorations added, but
I don't think it is impossible in principle.

Jon