Re: [OT] Re: How do you read source for big programs?

From: Joe Wright (joewwright_at_comcast.net)
Date: 11/30/04


Date: Mon, 29 Nov 2004 19:22:24 -0500

Eric Sosman wrote:
> kj wrote:
>
>>I consider myself quite proficient in C and a few other programming
>>languages, but I have never succeeded in understanding a largish
>>program (such as zsh or ncurses) at the source level. Basically,
>>I quickly become disoriented, losing sight of the forest for the
>>trees.
>>
>>What's your approach for understanding a large program at the source
>>level? By "understanding a program" I mean more than just figuring
>>out where to zero in to make a small change (e.g. change the value
>>of a global variable), but rather to digest as much of the source
>>as necessary to know the program's structure in detail, know where
>>in the source to go for any customization you'd want to make, know
>>what you'd need to do to port the program to a different OS from
>>the one it was written for, know what you'd need to do to abstract
>>some of the program's functionality into a smaller subprogram that
>>you could embed in a program of your own, etc. Bottom line: the
>>goal is to know the program's source inside out.
>>
>>I realize that this is a task that could take days, if not weeks.
>>I'm willing to put in the effort, but I'm really at a loss as to
>>how to proceed. (My current interest is reading the source for
>>zsh, but some day I'd like to read the source codes for Perl, Emacs,
>>Firefox, Apache, you name it.)
>
>
> Even a relatively small program of a few hundred
> thousand lines is too complex to grasp "in detail," and
> understanding the entirety of medium and large programs
> requires shortcuts. The largest single program I ever
> personally worked on had grown to about three million
> lines by the end of my eleven years on it, and although
> I was expert on certain parts of it and had a rough idea
> what the other parts were about, there is no way that I
> could ever pretend to understand the entire thing "in
> detail." And three million lines really isn't that large;
> the program I mention had its origins in the early 1980s,
> and bloat -- er, I mean, "scale" -- has grown in the last
> quarter century.
>
> Here's another matter: If the program is "interesting,"
> somebody out there is making changes to it. If it's both
> interesting and large, there'll be several such somebodies.
> By the time you finish your study of the code (assuming
> you can do so), those somebodies will have added two
> brand-new subsystems while ripping out and re-implementing three
> existing subsystems from scratch. Your knowledge will be
> obsolete before you can finish acquiring it.
>
> What to do? I think one must recognize the futility of
> the pursuit of perfect knowledge; very few programmers are
> actually God (despite what they think of themselves). One
> must instead seek to accomplish a goal -- fix a bug, add
> a feature, whatever -- *without* needing to gather complete
> knowledge as a prerequisite. When you are dropped in the
> middle of the Pacific, you don't need a detailed cartography
> of the Earth's entire land mass, but you do need to find
> yourself an island within swimming distance. That's the
> skill a programmer should cultivate: When dropped into the
> middle of a huge sea of mysterious code, to find a few
> islands and start building bridges between them. Aim for
> a network of knowledge, not for a blanket.
>
> As a practical means to start developing your network,
> your archipelago of islets in a sea of confusion, I can
> recommend that you port the program to a new environment.
> Port Perl to the Palm Pilot, get Emacs running on your Tivo
> box, re-target gcc to the Analytical Engine, whatever you
> like. The exercise will teach you a tremendous amount, not
> the least of which will be a sense for what kinds of practices
> help or hinder the porting, make the program more or less
> robust in the face of other incompletely-aware programmers'
> changes, and so on.
>
> IMHO, the education of programmers concentrates entirely
> too much on the design and generation of programs, and not
> enough on the analysis and understanding of programs already
> written. If you're going to be an effective programmer, you
> must learn these skills for yourself.
>

If we could give awards here for quality responses, Eric would be my
nominee hands down, especially for this one. I couldn't have said it
better and I won't try.

-- 
Joe Wright                            mailto:joewwright@comcast.net
"Everything should be made as simple as possible, but not simpler."
                     --- Albert Einstein ---

