[OT] Re: How do you read source for big programs?
From: Eric Sosman (eric.sosman_at_sun.com)
Date: 11/29/04
- Next message: Jack Blue: "Effective Email Marketing"
- Previous message: Chris Croughton: "Re: Unix C programming for finding file"
- In reply to: kj: "How do you read source for big programs?"
- Next in thread: Joe Wright: "Re: [OT] Re: How do you read source for big programs?"
- Reply: Joe Wright: "Re: [OT] Re: How do you read source for big programs?"
Date: Mon, 29 Nov 2004 15:11:44 -0500
kj wrote:
> I consider myself quite proficient in C and a few other programming
> languages, but I have never succeeded in understanding a largish
> program (such as zsh or ncurses) at the source level. Basically,
> I quickly become disoriented, losing sight of the forest for the
> trees.
>
> What's your approach for understanding a large program at the source
> level? By "understanding a program" I mean more than just figuring
> out where to zero in to make a small change (e.g. change the value
> of a global variable), but rather to digest as much of the source
> as necessary to know the program's structure in detail, know where
> in the source to go for any customization you'd want to make, know
> what you'd need to do to port the program to a different OS from
> the one it was written for, know what you'd need to do to abstract
> some of the program's functionality into a smaller subprogram that
> you could embed in a program of your own, etc. Bottom line: the
> goal is to know the program's source inside out.
>
> I realize that this is a task that could take days, if not weeks.
> I'm willing to put in the effort, but I'm really at a loss as to
> how to proceed. (My current interest is reading the source for
> zsh, but some day I'd like to read the source codes for Perl, Emacs,
> Firefox, Apache, you name it.)

Even a relatively small program of a few hundred
thousand lines is too complex to grasp "in detail," and
understanding the entirety of medium and large programs
requires shortcuts. The largest single program I ever
personally worked on had grown to about three million
lines by the end of my eleven years on it, and although
I was expert on certain parts of it and had a rough idea
what the other parts were about, there is no way that I
could ever pretend to understand the entire thing "in
detail." And three million lines really isn't that large;
the program I mention had its origins in the early 1980s,
and bloat -- er, I mean, "scale" -- has grown in the last
quarter century.

Here's another matter: If the program is "interesting,"
somebody out there is making changes to it. If it's both
interesting and large, there'll be several such somebodies.
By the time you finish your study of the code (assuming
you can do so), those somebodies will have added two brand-
new subsystems while ripping out and re-implementing three
existing subsystems from scratch. Your knowledge will be
obsolete before you can finish acquiring it.

What to do? I think one must recognize the futility of
the pursuit of perfect knowledge; very few programmers are
actually God (despite what they think of themselves). One
must instead seek to accomplish a goal -- fix a bug, add
a feature, whatever -- *without* needing to gather complete
knowledge as a prerequisite. When you are dropped in the
middle of the Pacific, you don't need a detailed cartography
of the Earth's entire land mass, but you do need to find
yourself an island within swimming distance. That's the
skill a programmer should cultivate: When dropped into the
middle of a huge sea of mysterious code, to find a few
islands and start building bridges between them. Aim for
a network of knowledge, not for a blanket.

As a practical means to start developing your network,
your archipelago of islets in a sea of confusion, I can
recommend that you port the program to a new environment.
Port Perl to the Palm Pilot, get Emacs running on your TiVo
box, re-target gcc to the Analytical Engine, whatever you
like. The exercise will teach you a tremendous amount, not
the least of which will be a sense for what kinds of practices
help or hinder the porting, make the program more or less
robust in the face of other incompletely-aware programmers'
changes, and so on.

IMHO, the education of programmers concentrates entirely
too much on the design and generation of programs, and not
enough on the analysis and understanding of programs already
written. If you're going to be an effective programmer, you
must learn these skills for yourself.

--
Eric.Sosman@sun.com