Re: Cpp Considered Harmful

From: Steven T. Hatton (susudata_at_setidava.kushan.aa)
Date: 09/02/04


Date: Thu, 02 Sep 2004 12:38:39 -0400

Kai-Uwe Bux wrote:

> Steven T. Hatton wrote:
>

>> It's not contradictory, it is simply recursive.
>
> It is neither, the technical term for the way you used the word
> "information" in your argument is, I think, "equivocation".

What I mean by recursive is that I'm talking about applying the tools
created by information technology to the tools that create these tools.

>
>> I am applying the same
>> argument a mechanical engineer would use to explain his use of computers
>> to
>> do his job. The programs and tools a software engineer uses are the
>> programming language, the compiler, the IDE, the libraries, etc.
>
> I do not understand this analogy at all.

A mechanical engineer uses computers to explore his designs, to access
information, to organize his resources, etc. That's non-recursive in the
sense that a mechanical engineer is not creating IT tools. When I, as a
software engineer, apply information technology to my work it is recursive.
Self-referential, if you will.
 
> There are mitigating strategies that I incorporate in my coding styles. A
> problem is, of course, that I have to rely on libraries that might not
> follow my coding style. Nonetheless, I feel that a lot can be done without
> changing the language.

I agree. But I still find #inclusion redundant, primitive, inelegant,
potentially confusing, and lots of other bad things.
 
>> But what power do you lose by being able to import a resource with a
>> using declaration, or to bring in -perhaps implicitly- an entire
>> namespace with a
>> using declaration? That really is what I am suggesting, more than
>> anything.
>
> I started to comment on this, and I realized that I was about to write
> nonsense. So I realized that I do not understand your proposal. Could you
> point me to a post where you have given more details about what these
> using directives should do from a users point of view; obviously I missed
> some postings of yours. Before I have a firm grasp of the proposed
> addition, I cannot estimate what would have to be sacrificed (if anything)
> to make it work -- after all this mechanism has to be integrated with the
> rest of C++.

This is from a previous post. It isn't as concise as I would like, but I
need to work on clarifying in my own mind exactly what I'm suggesting
before I try to formalize it for others. All of what follows might be
replaced by the requirement that 'Given a fully qualified identifier in a
source file, the implementation shall locate the declaration (and
definition if needed), and make it available so that the source file can be
successfully compiled.' How does it do this? I don't care!

//-----------------------------------------------------------------------
All of this is correct.  But I'm not sure that's the most problematic aspect
of the CPP.  Though the CPP and its associated preprocessor directives do
constitute a very simple language (nowhere near the power of sed or awk),
they obscure the concept of translation unit by supporting nested #includes.
When a person is trying to learn C++, the additional complexity can
obfuscate C++ mistakes.  It's hard to determine if certain errors are CPP
or C++ related.

IMO, the CPP (#include) is a workaround to compensate for C++'s failure to
specify a mapping between identifiers used within a translation unit and
the declarations and definitions they refer to.

As an example let's consider the source in the examples from
_Accelerated_C++:_Practical_Programming_by_Example_ by Andrew Koenig and
Barbara E. Moo:
 
http://acceleratedcpp.com/

// unix-source/chapter03/avg.cc
#include <iomanip>
#ifndef __GNUC__
#include <ios>
#endif
#include <iostream>
#include <string>

using std::cin;                  using std::setprecision;
using std::cout;                 using std::string;
using std::endl;                 using std::streamsize;

I chose to use this as an example because it's done right (with the
exception that the code should have been in a namespace).  All identifiers
from the Standard Library are introduced into the translation unit through
using declarations.  Logically, the using declaration provides enough
information to deterministically map between an identifier and the
declaration it represents in the Standard Library.  The #include CPP
directives are necessary because ISO/IEC 14882 doesn't require the
implementation to resolve these mappings.  I believe - and have suggested
on comp.std.c++ - that it should be the job of the implementation to
resolve these mappings.
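
To illustrate what "resolving the mapping" could mean for the example
above, here is a minimal sketch of a lookup an implementation might do on
the programmer's behalf.  The function name header_for and the table are
my own invention for illustration; nothing like this is specified by the
Standard.  The entries cover only the six names used in avg.cc.

```cpp
#include <map>
#include <string>

// Hypothetical helper: given a fully qualified Standard Library name,
// return the header that declares it, or an empty string if unknown.
// Only the identifiers from the avg.cc example are listed.
std::string header_for(const std::string& qualified_name) {
    static const std::map<std::string, std::string> table = {
        {"std::cin",          "<iostream>"},
        {"std::cout",         "<iostream>"},
        {"std::endl",         "<ostream>"},
        {"std::setprecision", "<iomanip>"},
        {"std::streamsize",   "<ios>"},
        {"std::string",       "<string>"},
    };
    std::map<std::string, std::string>::const_iterator it =
        table.find(qualified_name);
    return it == table.end() ? std::string() : it->second;
}
```

With such a table available to the implementation, a using declaration
like "using std::setprecision;" would carry everything needed to find
the declaration without a separate #include line.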

Now a tricky thing that comes into play is the relationship between
declaration and definition.  I have to admit that falls into the category
of religious faith for me.  Under most circumstances, it simply works.  When
it doesn't, I play with the environment variables and linker options until
something good happens.

I believe what is happening is this: When I compile a program with
declarations in the header files I've #included somewhere in the whole
mess, the compiler can do everything that doesn't require allocating memory
without knowing the definitions associated with the declarations.  (By
compiler I mean the entire CPP, lexer, parser, compiler and linker
system.)  When it comes time to use the definition which is contained in a
source file, the source file has to be available to the compiler either
directly, or through access to an object file produced by compiling the
defining source file.

For example, if I try to compile a program with all declarations in header
files which are #included in appropriate places in the source, but neglect
to name one of the defining source files on the command line that initiates
the compilation, the program will "compile" but fail to link.  This results
in a somewhat obscure message about an undefined reference to something
named in the source.  I believe that providing the object file resulting
from compiling the defining source, rather than that defining source
itself, will solve this problem.
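
A minimal sketch of the declaration/definition split described above,
squeezed into one listing so it can be read in one piece.  The names
average and average_of_pair, and the file names mentioned in the
comments, are hypothetical.  In a real project the three parts would
live in a header, a calling source file and a defining source file.

```cpp
#include <cstddef>
#include <vector>

// Part 1: the declaration, as it would appear in a hypothetical
// header "average.hh".  This is all a caller needs in order to compile.
double average(const std::vector<double>& values);

// Part 2: a caller, as it would appear in some other source file.
// It compiles against the declaration alone.
double average_of_pair(double a, double b) {
    std::vector<double> values;
    values.push_back(a);
    values.push_back(b);
    return average(values);
}

// Part 3: the definition, as it would appear in a hypothetical
// "average.cc".  If neither this source file nor the object file built
// from it is handed to the linker, the caller still compiles, but the
// link step fails with an undefined reference to average.
double average(const std::vector<double>& values) {
    if (values.empty()) return 0.0;
    double sum = 0.0;
    for (std::size_t i = 0; i < values.size(); ++i) sum += values[i];
    return sum / values.size();
}
```

Deleting Part 3 from the listing reproduces the situation: everything
up to the link step succeeds, and only then does the missing definition
surface.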

The counterpart to this in Java is accomplished using the following:

* import statement

* package name

* directory structure in identifier semantics

* classpath

* javap

* commandline switches to specify source locations

Mapping this to C++ seems to go as follows:

* import statement

This is pretty much the same as a combination of a using declaration and
a #include.  A Java import statement looks like this:

import org.w3c.dom.Document;

In C++ that translates into something like:

#include <org/w3c/dom/Document.hh>
using org::w3c::dom::Document;
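
For that translation to work, the header would have to declare Document
inside namespaces that mirror the directory path.  A minimal sketch,
assuming a made-up header layout: the Document class here and its
root_name member are invented for illustration and are not the real DOM
interface.  The namespace block stands in for the contents of
org/w3c/dom/Document.hh.

```cpp
#include <string>

// Stand-in for a hypothetical org/w3c/dom/Document.hh: the nested
// namespaces mirror the directory components of the header's path.
namespace org { namespace w3c { namespace dom {

class Document {
public:
    explicit Document(const std::string& root_name)
        : root_name_(root_name) {}
    const std::string& root_name() const { return root_name_; }
private:
    std::string root_name_;
};

}}}  // namespace org::w3c::dom

// The using declaration plays the role of the Java import statement:
// from here on the unqualified name Document is usable.
using org::w3c::dom::Document;

// Illustrative client code that relies on the unqualified name.
std::string describe(const Document& doc) {
    return "document rooted at <" + doc.root_name() + ">";
}
```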

* package name

This is roughly analogous to the C++ namespace, and is intended to support
the same concept of component that C++ namespaces are intended to support.
In Java there is a direct mapping between file names and package names.
For example if your source files are rooted at /java/source/
(c:\java\source) and you have a package named org.w3c.dom, the name of the
file containing the source for org.w3c.dom.Document will
be /java/source/org/w3c/dom/Document.java. Using good organizational
practices, a programmer will have his compiled files placed in another,
congruent, directory structure, e.g., /java/classes/ is the root of the
class file hierarchy, and the class file produced by
compiling /java/source/org/w3c/dom/Document.java will reside
in /java/classes/org/w3c/dom/Document.class.  This is analogous to placing
C++ library files in /usr/local/lib/org/w3c/dom
and /usr/local/include/org/w3c/dom.

* directory structure in identifier semantics

In Java the location of the root of the class file hierarchy is communicated
to the Java compiler and JVM using the $CLASSPATH variable.  In C++ (g++)
the same is accomplished using various variables such as $CPLUS_INCLUDE_PATH,
$LIBRARY_PATH and $LD_LIBRARY_PATH, and the -L, -I and -l switches on the
compiler.

Once Java knows where the root of the class file hierarchy is, it can find
individual class files based on the fully qualified identifier name.  For
example:

import org.w3c.dom.Document;

means go find $CLASSPATH/org/w3c/dom/Document.class

The C++ Standard does not specify any mapping between file names and
identifiers.  In particular, it does not specify a mapping between
namespaces and directories.  Nor does it specify a mapping between class
names and file names.
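
If C++ did adopt a Java-style convention, the mapping itself would be
trivial.  Here is a minimal sketch of one possible rule, where each
namespace becomes a directory and the final name becomes the file.  The
function header_path_for and the ".hh" default are assumptions of mine,
not anything the Standard or any compiler provides.

```cpp
#include <string>

// Hypothetical convention: turn a fully qualified C++ name into a
// relative header path by replacing each "::" with '/' and appending
// a file extension.
std::string header_path_for(const std::string& qualified_name,
                            const std::string& extension = ".hh") {
    std::string path;
    std::string::size_type start = 0;
    for (;;) {
        std::string::size_type sep = qualified_name.find("::", start);
        if (sep == std::string::npos) {
            // Last component: the class (or other entity) name itself.
            path += qualified_name.substr(start);
            break;
        }
        // Namespace component: becomes a directory.
        path += qualified_name.substr(start, sep - start);
        path += '/';
        start = sep + 2;  // skip past the "::"
    }
    return path + extension;
}
```

Under this rule org::w3c::dom::Document maps to
org/w3c/dom/Document.hh, which the implementation could then look for
under each include root, much as Java looks under each $CLASSPATH entry.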

* classpath

As discussed above the $CLASSPATH is used to locate the roots of directory
hierarchies containing the compiled Java 'object' files.  To the compiler,
this functions similarly to the use of $LIBRARY_PATH for g++.  It also
provides the service that -I <path/to/include> provides in g++.

* javap

The way the include path functionality of C++ is supported in Java is
through the use of the same mechanism that enables javap to provide the
interface for a given Java class.

For example:

Thu Aug 19 09:40:27:> javap org.w3c.dom.Document
Compiled from "Document.java"
interface org.w3c.dom.Document extends org.w3c.dom.Node{
    public abstract org.w3c.dom.DOMImplementation getImplementation();
   ...  
    public abstract org.w3c.dom.Attr createAttribute(java.lang.String);
       throws org/w3c/dom/DOMException
...
}

What javap tells me about a Java class is very similar to what I would want
a header file to tell me about a C++ class.

* commandline switches to specify source locations

This was tacked on for completeness.  Basically, it means I can tell javac
what classpath and source path to use when compiling.  If a class isn't
defined in the source files provided, then it must be available in compiled
form in the class path.

One final feature of Java which makes life much easier is the use of .jar
files.  A C++ analog would be to create a tar file containing object files
and associated header files that compilers and linkers could use by
having them specified on the commandline or in an environment variable.

I know there are C++ programmers reading this and thinking it is blasphemous
to even compare Java to C++.  My response is that Java was built using C++
as a model.  The mechanisms described above are, for the most part, simply
a means of accomplishing the same thing that the developers of Java had
been doing by hand with C and C++ for years.  There is nothing internal to
the Java syntax other than the mapping between identifier names and file
names that this mechanism relies on.  This system works well. The world will
be a better place when there is such a thing as a C++ .car file analogous
to a Java .jar file.  Granted, these will not be binary compatible from
platform to platform, but in many ways that doesn't matter.
//-----------------------------------------------------------------------
>
>> Its your issue, not mine. The only reason I mentioned exceptions is that
>> it is the only place where I am advocating placing restrictions on what
>> you
>> can do by default in C++. There are many restrictions placed on your use
>> of code. That's what type checking is all about. If you are suggesting
>> that I want to see changes in the way C++ is used in general, yes, that
>> is correct.
>
> Do you suggest changes to the standard? I am not concerned with any effort
> of yours to change "the way C++ is used in general" (a cultural change),
> because it would be up to me to follow the crowd or not. Changes to the
> standard are what I am concerned about. If you do not talk about those,
> then I apologize for the misunderstanding.

Yes, I am suggesting changes to the standard be considered. I've already
suggested the exception mechanism be changed.

>>> Maybe, if cpp was even more powerful and more convenient, a superior
>>> library management could be implemented using macros.
>>
>> I actually have a wishlist item in the KDevelop bug database that
>> suggests attempting such a thing using #pragma and a strategy of
>> filename, resource name consonance.
>>
>
> Sounds interesting and cool.

I recently found there is more infrastructure in the code base for KDevelop
which might facilitate this than I previously thought. This in particular:
http://www.kdevelop.org/HEAD/doc/api/html/classAST.html

I have the sense that Roberto Raggi's AST might provide a good foundation
for an entire C++ compiler. In comparison to what I saw in the gcc source
code, Roberto's seems a lot cleaner.

-- 
"[M]y dislike for the preprocessor is well known. Cpp is essential in C
programming, and still important in conventional C++ implementations, but
it is a hack, and so are most of the techniques that rely on it. ...I think
the time has come to be serious about macro-free C++ programming." - B. S.

