Re: Cpp Considered Harmful

From: Steven T. Hatton (susudata_at_setidava.kushan.aa)
Date: 08/31/04


Date: Tue, 31 Aug 2004 04:06:58 -0400

Paul Mensonides wrote:

> Steven T. Hatton wrote:
>
>>> It buys you meaningful assertions without replicating code among other
>>> things.
>>
>> Perhaps this is useful. I have never tried using assertions. When I
>> read about them in TC++PL(SE) it basically went: 'Check this
>> assertion thing out. Pretty cool, eh? They're macros. They suck!'
>
> Assertions are invaluable tools.

Some people seem to think so. I read up on them in both Java and C++, and
was also aware of them in C. They never seemed to be of much use. I'll
grant you, with the weak exception handling of C++ such a thing might be a
bit handy. Too bad C++ doesn't have printStackTrace. I can't even think of
a problem I've had where such a mechanism would be of much use.
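
For readers who have not used assertions, here is a minimal sketch of why
they are normally written as macros. The names describe_failure,
CHECK_MESSAGE and check_positive are hypothetical and chosen only for
illustration; they are not taken from either poster or from any library.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: formats the message an assertion failure would report.
inline std::string describe_failure(const char* expr, const char* file, int line) {
    std::ostringstream out;
    out << file << ':' << line << ": assertion failed: " << expr;
    return out.str();
}

// Only a macro can turn the tested expression into text (#expr) and pick up
// __FILE__ and __LINE__ at the point of use.
// Yields an empty string when the condition holds.
#define CHECK_MESSAGE(expr) \
    ((expr) ? std::string() : describe_failure(#expr, __FILE__, __LINE__))

// Example use: reports a message only when x is not positive.
inline std::string check_positive(int x) {
    return CHECK_MESSAGE(x > 0);
}
```

Calling check_positive(-1) yields a message that names the file, the line
and the text "x > 0" without the caller having to repeat any of them by
hand, which is presumably the "without replicating code" benefit mentioned
above.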

>> There are places where retaining some of the Cpp for the
>> implementation to use seems to make sense. For example the various
>> __LINE__, __FILE__, __DATE__, etc. are clearly worth having around so
>> debuggers and other tools can use them.
>
> Debuggers? Macros don't exist after compilation, so I'm not sure that
> __DATE__ and __TIME__ would have any useful meaning at all, and debuggers
> (if source information is available) already know the line and file.

Oh well, one less argument in favor of keeping the preprocessor.

>> They should not be
>> considered part of the core language to be used by application
>> programmers.
>
> An immediate example comes to mind. What if I write a program that, when
> executed with a --version option, outputs a copyright notice, version
> number, and the build date and time? (This is a fairly common scenario,
> BTW.)

This kind of thing?

"GNU Emacs 21.3.50.2 (i686-pc-linux-gnu, X toolkit, Xaw3d scroll bars) of
2004-08-30 on ljosalfr"

Nothin' a bit of sed and date in the Makefile won't do for you.
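
For comparison, a minimal sketch of the macro-based version being
discussed. The program name "example" and the version "1.0" are made up
for illustration; only __DATE__ and __TIME__ are standard.

```cpp
#include <string>

// Hypothetical banner for a --version option. __DATE__ and __TIME__ are
// replaced with string literals when this translation unit is compiled.
inline std::string version_banner() {
    return std::string("example 1.0 (built ") + __DATE__ + " " + __TIME__ + ")";
}
```

The Makefile approach would produce the same kind of string, with the
build system generating the date text instead of the preprocessor.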

>> That seems like an ambiguity that could be resolved by examining the
>> parameter when necessary.
>
> Yes, but that means that the IDE has to be able to parse and semantically
> analyze the entire source code--including doing overload resolution,
> partial ordering, template declaration instantiation, etc..

No it doesn't. Even if it were necessary to do all you said, I don't
believe it is beyond the capabilities of existing technology. But, as I've
already explained, it isn't necessary to parse everything at edit time in
order for such a tool to be useful. They can often rely on the results of
a previous compilation.
 
Oh, and I just checked. KDevelop is giving me code completion on templates.

>> I agree. However, the presence of the CPP complicates this issue to
>> the point where it seems to degenerate into an exercise in absurdity.
>
> I think you have a serious misunderstanding of the preprocessor. How does
> CPP complicate this issue?

Because it supports the antiquated technique of pasting together a
translation unit out of a bunch of different files, and it modifies the
source code between the time the program is edited and the time it is
actually compiled.
 
>> I will observe that many Java IDEs do this rather successfully.
>
> Parsing Java is quite a bit simpler than parsing C++.

Some of that is due to a simpler grammar, and some (much) of it is due to
the fact that Java uses a superior mechanism for locating resources
external to the actual file containing the source under development.

> It doesn't know all of the specializations. As a general rule, it only
> knows about a few general specializations.

What is "it"? I'm not following you here. The specializations are defined
somewhere in the code base, so they can be cached like anything else.
 
>>> The answer is "nothing" because it cannot know what T is, and
>>> therefore cannot tell what specialization of X is chosen, nor can it
>>> even tell what X specializations there might be.
>>
>> Why can't it tell what specializations of X exist?
>
> Because they might not exist yet.

What are you talking about? Either they do or they don't exist. It would
be damn hard for any tool to provide code completion based on code you have
yet to write.
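
A minimal sketch of the situation under dispute, using the X and T from
the quoted text. The member names size and count and the function measure
are hypothetical.

```cpp
// Primary template: a generic X<T> offers size().
template <typename T>
struct X {
    int size() const { return 1; }
};

// Explicit specialization with a completely different set of members.
template <>
struct X<bool> {
    int count() const { return 2; }
};

// Inside this template T is unknown, so a completion tool cannot tell
// whether X<T> offers size(), count(), or something declared in a
// specialization that has not been written yet.
template <typename T>
int measure(const X<T>& x) {
    return x.size();  // fine for X<int>; would not compile for X<bool>
}
```

Both positions show up here: the X<bool> specialization can be indexed
once it exists, but inside measure the set of members still depends on
which T is eventually supplied.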

> That's true, but (as a general rule) well-designed template code is also
> the most complex code. Code completion outside of template code, while
> useful, is only a small benefit.

There are a lot of darn simple templates in the Standard Library.
 
> Look, if a tool author is willing to fully parse the underlying language,
> preprocessing the source as it does so is trivial in comparison. If you
> disagree, tell me why.

The problem is knowing what needs to be preprocessed. There is also the
problem that the preprocessor does not adhere to scoping rules, so the tool
cannot limit the scope under consideration without ignoring the potential
impact of the preprocessor.
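
A minimal sketch of the scoping point. The namespaces config and other and
the macro BUFFER_SIZE are hypothetical names used only for illustration.

```cpp
namespace config {
    // Written inside the namespace, but macros know nothing about namespaces.
    #define BUFFER_SIZE 16
    inline int size() { return BUFFER_SIZE; }
}

namespace other {
    // BUFFER_SIZE is still replaced here, although nothing named
    // BUFFER_SIZE was ever declared in this namespace.
    inline int doubled() { return BUFFER_SIZE * 2; }
}
```

A tool that looked only at the declarations in namespace other would find
no BUFFER_SIZE at all, yet other::doubled() compiles and returns 32.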

>> But now you are talking about something editing your code at the same
>> time you are, but not displaying the results. What I should have
>> said is that the result of using the Cpp is far less structured than
>> the result of using a programming language.
>
> No, I'm not. C and C++ code can be preprocessed as it is parsed into a
> syntax tree in a single pass. It's not like when you type something, the
> IDE goes and tries to find it in the source code--that would be
> *incredibly* inefficient. In essence, it is already rewriting the code
> into an internal format that is designed for algorithmic navigation that
> also discards information that it doesn't care about.

Nonetheless, the AST is not going to directly coincide with what is in the
edit buffer. Adding one preprocessor directive can add dozens of source
files to the translation unit. That is unstructured and unpredictable.

>> I see no reason to try to support a compiler that doesn't understand
>> namespaces. What significant platform is restricted to using such a
>> compiler?
>
> I wish it was that simple. In many companies, there is a lot of inertia
> from a compiler version. I.e. it often takes years to upgrade to new
> compilers (if at all)--simply because the time required to make older code
> compatible with the new compiler can be massively expensive. Thus, new
> code gets written for older compilers all the time.

I don't believe there is a compelling reason to fill new libraries,
intended for general use in forward-looking technology, with ugly kludges
in order to stay compatible with the lowest common denominator.
There are better ways of dealing with such issues. By bending over
backward to try to remain compatible with obsolete technology, you
compromise your own product and encourage the survival of technology that
was outdated for a reason.
 
>> As I said, there are historical reasons for the code to be that way.
>
> There are current reasons as well.

And the result is that people don't use the product nearly as much as they
otherwise would.
 

> The preprocessor does not 'rewrite' code--it expands macros which are
> *part* of the code. In doing so, it can reap major benefits in
> readability and maintenance.

I've more often seen the opposite effect. Most use of macros makes code
less comprehensible, and trying to track down the point of definition can
be exasperating.
 
>>> What does this have to do with the C or C++ preprocessor? C++ does
>>> not have a module system. You're talking about fundamental changes
>>> to the language, not problems with the preprocessor.
>>
>> "I suspect my dislike for the preprocessor is well known. Cpp is
>> essential in C programming, and still important in conventional C++
>> implementations, but it is a hack, and so are most of the techniques
>> that rely on it.
>> *_It_has_been_my_long-term_aim_to_make_Cpp_redundant_*."
>
> More than anything else, Bjarne hates #define, not #include, BTW. The
> preprocessor is not a hack, it is an incredibly useful tool that can, like
> every other tool, be misused.

I think its primary use is as a crutch that C++ can't do without, because
doing without it was never attempted.
 

>> The separation of interface and implementation is textbook proper
>> C++. I still find the added level of complexity involved in using
>> headers unnecessary, and a significant burden.
>
> What complexity are you talking about exactly? Separation of interface
> and implementation is a cornerstone of separate compilation--which is one
> of the fundamental reasons that C++ is able to scale to very large
> projects without dropping efficiency.

Headers should not be the means of achieving the separation of
implementation and interface. The only place ISO/IEC 14882:2003 even
mentions a header file is in the C compatibility appendix. The language
specification /should/ address this issue by providing a better solution
than that which currently exists.
 
>> I'm currently working
>> on a tool that will mitigate this drudgery by treating the class as a
>> unit, rather than a composite.
>
> What do you mean by 'composite', separation of interface and
> implementation?

My actually having to maintain redundant constructs between these files.
Changing one member in a class can result in having to edit the location
where the member is defined, the parameter list in the constructor
declaration, the parameter list in the constructor definition, the member
initialization list, and perhaps parameter lists in both the header and
source file of any functions involved. Additionally, I am likely to have
to remove a forward declaration and a #include. There may also be a
requirement to modify the destructor.
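
A minimal sketch of the duplication described above. The class Account
and the file names account.hh and account.cc are hypothetical; both halves
are shown in one block so that it compiles as a single translation unit.

```cpp
#include <string>

// --- what would live in a hypothetical account.hh ---
class Account {
public:
    Account(const std::string& owner, int balance);  // parameter list, first copy
    const std::string& owner() const;
    int balance() const;
private:
    std::string owner_;   // the members themselves
    int balance_;
};

// --- what would live in a hypothetical account.cc ---
Account::Account(const std::string& owner, int balance)  // parameter list, second copy
    : owner_(owner), balance_(balance) {}                // member initialization list

const std::string& Account::owner() const { return owner_; }
int Account::balance() const { return balance_; }
```

Changing the type of balance_ in this sketch touches the member
declaration, both constructor parameter lists, and the accessor in both
places.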

A header doesn't really give you a genuine interface, and the tradeoff in
trying to make a header devoid of anything but pointer definitions in order
to prevent dependencies can be a nuisance as well. Some of this is
inevitable no matter what. Some of it is beyond repair; it could have,
perhaps should have, been done differently at the outset. Some of the
relative complexity compared to Java is due to the fact that C++ has
pointers, references, and (regular) variables. So my tool, if I ever get
it completed, is intended to mitigate more than could be solved by
eliminating the #include. I favor the falsely advertised feature described
here:

http://gcc.gnu.org/onlinedocs/gcc-3.4.1/gcc/C---Interface.html#C++%20Interface

It don't work!

> Java (and it isn't just Java)
> is smack in the middle of the two approaches, which ends up providing only
> a small fraction of the benefits of either. If I want control of details
> (for whatever reason) I'll use a language (like C++) that doesn't actively
> work against me controlling those details. If I want a higher-level
> language where I can largely ignore many of those details, I'll use a real
> higher-level language (like Haskell).

It's not the safety that makes Java useful. It is the fact that it
facilitates the location and leveraging of resources. Much of that has to
do with the superior mechanism of importing declarations. Java more
effectively separates interface from implementation than does C++. I
believe some of what Java does could be done in C++ without negatively
impacting the language.
 
>>> Regarding C++ (in general)... C++ is more powerful--that is
>>> unquestionable.
>>
>> What do you mean by powerful?
>
> I mean that it gives you access to lower-level details while
> simultaneously giving you tools to create higher-level abstractions.

I agree that C++ does that. What it doesn't do well is facilitate the
location of resources, and the isolation of components.

>> It seems clear that major players in
>> the industry do not consider C++ to be appropriate for many major
>> applications.
>
> Ha. The opposite is true.

I'm not saying no one is using C++. I am saying that there are major areas
where C++ is not the language of choice, and it is due to the problems I've
been discussing. It's a combination of many issues which individually seem
trivial, but when they are combined become genuine obstacles to making
progress. Sure, with a few years experience people can learn to compensate
for these defects. But a lot of people won't get the luxury of being only
moderately productive for that amount of time.
 

>> He may not agree with me here, and even if he does, he may be
>> unwilling to say as much for the sake of polity. The way the standard
>> headers are used is a logical mess. It causes the programmer to
>> waste valuable time searching for things that should be immediately
>> accessible by name.
>
> That is what documentation is for, which you need anyway.

No, /that/ is what _interfaces_ are for! The programming language should be
the primary means of communication between the author and the reader. API
documentation can be very useful, and much of it can be generated by tools
such as JavaDoc and Doxygen.

> (Further, good
> documentation is significantly more involved than what can be trivially
> put in header file comments.)

Sometimes it's nice to have more than just the autogenerated html, but even
that can be quite useful:
http://www.kdevelop.org/HEAD/doc/api/html/classTemplateParameterAST.html

Trolltech generates all their API documentation from the source code:

http://doc.trolltech.com/3.3/qstring.html

And of course the highly successful Java API documentation is likewise
generated directly from the source files:
http://java.sun.com/j2ee/1.4/docs/api/index.html

> As an aside, the standard library header structure
> is not that well organized--mostly because some headers contain too many
> things (e.g. <algorithm>).

And the namespace is flat. There should be one mechanism for determining
the subset of the library responsible for any particular area of
functionality. As it stands you take the intersection of the namespace and
the header name. Since some headers include other headers, you often end
up with more than you specified. That is bad. It can introduce hidden
dependencies.
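
A minimal sketch of guarding against the hidden dependency described
above: each header is included for the names actually used, rather than
relying on whatever another header happens to pull in. Which standard
headers include which others varies between implementations; the function
total_length is a hypothetical example.

```cpp
#include <cstddef>  // std::size_t
#include <string>   // std::string, named explicitly
#include <vector>   // std::vector, named explicitly

// Uses std::string and std::vector, so it includes both headers itself
// instead of hoping some other header drags them in.
inline std::size_t total_length(const std::vector<std::string>& words) {
    std::size_t n = 0;
    for (std::vector<std::string>::const_iterator it = words.begin();
         it != words.end(); ++it) {
        n += it->size();
    }
    return n;
}
```

If the #include <string> line were dropped, this might still compile on an
implementation whose <vector> happens to include it and fail on another,
which is the kind of hidden dependency in question.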

>> This is true of most other libraries as well.
>> There is no need for this lack of structure other than the simple
>> fact that no one has been able to move the mindset of some C++
>> programmers out of the Nixon era.
>
> An interface is specified in some file, which you include to access the
> interface. The documentation tells you what file you need to include and
> what you need to link to (if anything) to use the interface. That's pretty
> simple and well-structured.

An interface should be self-describing, and should not introduce more into
an environment than is essential to serve the immediate purpose.

>> In the case of the CPP, it isn't so much me I want to be protected
>> from. It's the people who think it's a good idea to use it for
>> anything beyond simple conditional compilation, some debugging, and
>> perhaps for #including header files.
>
> Give an example of why this protection is necessary. There is only one:
> name conflicts introduced by unprefixed or lowercase macro definitions.
> That is just bad design, and you need a whole lot more changes to the
> language to prevent someone else's bad design from affecting you.

I've already provided an example in the Xerces code. That's a friggin slap
in the face to a person who wants to read that code.
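
The Xerces code itself is not reproduced in this message. As a separate
illustration of the name-conflict case named in the quoted text, here is a
minimal sketch with a hypothetical lowercase macro called max; the
function larger is likewise made up.

```cpp
#include <algorithm>

// Hypothetical careless library header: an unprefixed, lowercase macro.
#define max(a, b) (((a) > (b)) ? (a) : (b))

inline int larger(int a, int b) {
    // Writing std::max(a, b) here would be rewritten by the macro into
    // std::(((a) > (b)) ? (a) : (b)) and fail to compile. The extra
    // parentheses keep the function-like macro from expanding.
    return (std::max)(a, b);
}

#undef max  // stop the macro from affecting any code that follows
```

The parenthesized call is a defensive workaround on the user's side; it
does nothing about the macro itself.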
 
>> I will admit that I still find
>> the use of header files bothersome. They represented one of the
>> biggest obstacles I encountered when first learning C++.
>
> They are different than Java, but they aren't fundamentally complex.

No, but they can combine to create very unstructured complexity.

>>> C++ doesn't make such decisions for us, instead it gives us the tools
>>> to make abstractions that do it for us. In other words, it isn't the
>>> result of arrogance propelled by limited vision.
>>
>> Leaving some things to choice serves no useful purpose beyond pleasing
>> multiple constituents.
>
> I'm not talking about providing two or more near identical features that
> all have the same tradeoffs. Fundamentally, it's about being able to pick
> which tradeoffs are worthwhile for a particular thing--and that serves an
> incredibly useful purpose.

I'm talking about the simple things like not specifying a file name
extension. Sure, it seems trivial, but it can be a PITA when switching
between tools which default to different conventions, or mixing libraries
that use different conventions. Some tools think .c files are C files and
go into C mode, not C++ mode unless you punch it a couple of times. And
worse is the .h file, because more people are likely to name their C++
header files that way. Others want to call everything .cpp which I find
annoying, but quite common. Others correctly prefer the .cc and .hh
extensions.

>> There are places where a lack of rule is not
>> empowering, it is restricting.
>
> Example?
>
>> I hear driving in Rome is not what one
>> could properly call a civilized affair. Personally, I like the idea
>> that people stop at stoplights, use turn signals appropriately, stay in
>> one lane under normal circumstances, etc.

> ... However, as a generalization, it is true. C++ gives
> you more flexibility by moving the system from the language to convention
> moreso than Java does.)

As I see it, the lack of specification of certain things, such as name
resolution based on fully qualified identifiers rather than reliance on the
programmer to #include the file containing the declaration, is a major
structural deficiency in C++. I wish the standard simply said
'given a fully qualified identifier the implementation shall resolve that
name and make the declaration and/or definition available to the compiler
as needed'.

-- 
"[M]y dislike for the preprocessor is well known. Cpp is essential in C
programming, and still important in conventional C++ implementations, but
it is a hack, and so are most of the techniques that rely on it. ...I think
the time has come to be serious about macro-free C++ programming." - B. S.

