Re: dependency-detection in java - Take 2



Andreas Leitgeb wrote:
Michael Jung <miju@xxxxxxxxxxxxx> wrote:
This isn't only about SFFs, but also about classes changing
their interface incompatibly (could be a typo of the developer,
but without compiling the dependents, it may go unnoticed for
a while!)

SFFs should only be used locally,

I'm dreaming of reliable, non-full rebuilds. They shouldn't
depend on developers following guidelines (which all have their
"accepted exceptions", anyway), and shouldn't even depend on
developers contributing correct code (but detect all errors,
even those caused only in dependent classes.)

I'm viewing this problem from a CM point of view (although I'm
more of a developer than CM). Any change that gets into the repository
might have been checked in by a monkey, as well as by a senior expert.
The build process shouldn't care. It should in the end say: "yes, the
project has been built", or "it could not be built due to these
errors", and furthermore the build process should do this with minimal
use of processor resources (on whatever machine it is applied, be it
a developer's workstation or a dedicated compile-server).

I'm aware that such a build tool chain either does not exist now,
or at least is not known to anyone participating in this thread.

I'd like to discuss how this could be done. First, what is
possible in principle - where are the theoretical limits?
Would dependency-management be necessarily more expensive than
the unconditional full compile?

I've thought about this a bit, though not to the point of creating a design,
much less building prototypes. It seems to me that this approach is worth
investigating:

1. The interface of each class C in the system needs to be captured and
stored persistently, where "interface" means method signatures, field
definitions, and constant values. Superclass name too, to cover the changes
that can occur if what C inherits changes.

2. Dependencies also need to be captured and stored persistently. This will
be information of the form:

. Class D depends on (some feature of) the interface of class C

3. Whenever a class is compiled successfully:

A. Its new interface is constructed.
B. Changes to the previous interface are computed, and all classes dependent
upon something that changed are marked invalid (except for other classes
compiled by the same invocation of javac, of course -- they're up to date
and will be marked valid).
C. Its new dependencies are calculated.
D. Its stored interface and dependencies are updated.
E. It is marked valid.

4. Whenever a group of Java files is to be rebuilt, e.g. by Ant's <javac>
task, any classes marked invalid in step 3 are recompiled, in addition to
classes that are not up-to-date with their source files.

Notes:
"Class" is used a bit ambiguously above, sometimes to mean a .class file,
sometimes to mean all of the .class files built from a single source file.
Clearly if A$Inner depends upon B, it's A that's marked invalid when B
changes.
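
Steps 1 through 4 can be sketched as a small bookkeeping class. This is only an illustration under stated assumptions: the class name InvalidationTracker, the feature strings such as "method:size()I", and the "Owner#feature" dependency keys are all hypothetical, and persistence is left out entirely (the maps would have to be saved between builds).

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical bookkeeping for steps 1-4; not taken from any existing build tool. */
final class InvalidationTracker {
    // Step 1: class name -> interface features, e.g. "method:size()I" or "super:Base".
    private final Map<String, Set<String>> interfaces = new HashMap<>();
    // Step 2: dependent class -> features it uses, written as "Owner#feature".
    private final Map<String, Set<String>> dependencies = new HashMap<>();
    private final Set<String> invalid = new TreeSet<>();

    /**
     * Step 3, to be called after `name` has compiled successfully.
     * compiledTogether lists every class built by the same compiler invocation.
     */
    void onCompiled(String name, Set<String> newInterface, Set<String> newDependencies,
                    Set<String> compiledTogether) {
        Set<String> old = interfaces.getOrDefault(name, new HashSet<String>());
        // 3B: a feature changed if it appears in exactly one of the two interfaces.
        Set<String> changed = new HashSet<>(old);
        changed.addAll(newInterface);
        Set<String> unchanged = new HashSet<>(old);
        unchanged.retainAll(newInterface);
        changed.removeAll(unchanged);

        for (Map.Entry<String, Set<String>> entry : dependencies.entrySet()) {
            String dependent = entry.getKey();
            if (dependent.equals(name) || compiledTogether.contains(dependent)) {
                continue; // built by this invocation, so already up to date
            }
            for (String feature : changed) {
                if (entry.getValue().contains(name + "#" + feature)) {
                    invalid.add(dependent);
                    break;
                }
            }
        }
        // 3C-3E: store the new interface and dependencies, then mark the class valid.
        interfaces.put(name, new HashSet<>(newInterface));
        dependencies.put(name, new HashSet<>(newDependencies));
        invalid.remove(name);
    }

    /** Step 4: classes to recompile in addition to those older than their sources. */
    Set<String> invalidClasses() {
        return new TreeSet<>(invalid);
    }
}
```

With this shape, changing the return type of C.size() replaces one feature string with another, so only classes that recorded "C#method:size()I" are marked invalid. How fine-grained the feature strings should be is exactly the open question A below.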

The obvious outstanding problems with the above are:

A. How granular should the dependency information be? If it's simply "A
depends on B", there will be a lot of unnecessary recompilations. If it's
"A depends separately on the following 20 method calls it makes to B", there
will be a vast amount of dependency information stored, updated, and
checked. I have no intuition for where the sweet spot lies.

B. How to generate the dependency information. I presume it can be
calculated from .class file analysis, but I haven't verified this in detail,
nor do I know how expensive that would be. It would be awfully nice if
javac would generate it for us, but it doesn't.

C. How to represent the dependency "C calls a method on an instance of D
that's inherited from E". I think all of this information is required: if
the method definition changes, it will change in E, but we also need to mark
C invalid if D is reparented.
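
For problem B, one possible starting point is the constant pool of a .class file, which names the classes the code refers to. The sketch below is an assumption about how such an analysis might begin, not a verified solution: ClassFileDeps is a hypothetical helper, it yields only the coarse "A depends on B" level of detail, and it reports internal names such as java/lang/Object (array types show up as descriptors).

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: lists the classes named in a .class file's constant pool. */
final class ClassFileDeps {
    static Set<String> referencedClasses(InputStream classFile) {
        try (DataInputStream in = new DataInputStream(classFile)) {
            if (in.readInt() != 0xCAFEBABE) {
                throw new IllegalArgumentException("not a class file");
            }
            in.readUnsignedShort(); // minor version
            in.readUnsignedShort(); // major version
            int count = in.readUnsignedShort();
            String[] utf8 = new String[count];
            // A Class entry may point at a Utf8 entry that comes later, so resolve at the end.
            List<Integer> classNameIndexes = new ArrayList<>();
            for (int i = 1; i < count; i++) { // slot 0 is unused
                int tag = in.readUnsignedByte();
                switch (tag) {
                    case 1: utf8[i] = in.readUTF(); break;                         // Utf8
                    case 7: classNameIndexes.add(in.readUnsignedShort()); break;  // Class
                    case 8: case 16: case 19: case 20: skip(in, 2); break;
                    case 15: skip(in, 3); break;
                    case 3: case 4: case 9: case 10: case 11: case 12:
                    case 17: case 18: skip(in, 4); break;
                    case 5: case 6: skip(in, 8); i++; break; // long/double take two slots
                    default: throw new IllegalArgumentException("unknown constant tag " + tag);
                }
            }
            Set<String> names = new TreeSet<>();
            for (int index : classNameIndexes) {
                names.add(utf8[index]);
            }
            return names;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void skip(DataInputStream in, int n) throws IOException {
        in.readFully(new byte[n]); // readFully, because skipBytes may skip fewer bytes
    }
}
```

One known gap: javac inlines compile-time constants, so a class that only uses another class's static final constant may carry no reference to it in its constant pool at all. Types that appear only inside method or field descriptors are also not Class entries. A complete tool would need more than this.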

Inheritance adds some wrinkles. If Sub overrides a method it inherits from
Super, that doesn't really change its interface. Classes which previously
called Sub.meth() don't have to be recompiled. On the other hand, if Sub
defines a field that hides a field defined in Super, classes that accessed
Super.field should be marked invalid.
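
The asymmetry between overriding and hiding can be seen in a small, self-contained example (HidingDemo and its nested classes are made up for illustration). Method calls are dispatched on the runtime class, while a field access is chosen from the static type of the expression, so introducing a hiding field changes which declaration some existing source code refers to.

```java
/** Illustrative only: shows that fields are hidden while methods are overridden. */
final class HidingDemo {
    static class Super {
        String field = "Super.field";
        String meth() { return "Super.meth"; }
    }

    static class Sub extends Super {
        String field = "Sub.field";                      // hides Super.field
        @Override String meth() { return "Sub.meth"; }   // overrides Super.meth
    }

    /** Field access through a Super-typed reference: picked by the static type. */
    static String fieldViaSuperReference() {
        Super s = new Sub();
        return s.field;
    }

    /** Field access through a Sub-typed reference to the same kind of object. */
    static String fieldViaSubReference() {
        Sub s = new Sub();
        return s.field;
    }

    /** Method call through a Super-typed reference: picked by the runtime class. */
    static String methViaSuperReference() {
        Super s = new Sub();
        return s.meth();
    }

    public static void main(String[] args) {
        System.out.println(fieldViaSuperReference());
        System.out.println(fieldViaSubReference());
        System.out.println(methViaSuperReference());
    }
}
```

Code that reads the field through a Sub-typed reference is the case that needs a recheck when Sub gains the hiding field: the same source text now names a different declaration, possibly with a different type.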

Overloads add some more wrinkles. If a new overload is added, methods that
called the existing overloads and might now call the new one need to be
recompiled. For practical purposes, it should be fine to recompile callers
of any of the previous overloads, even if they're wholly disjoint.
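
The conservative overload rule could look roughly like this. It is a sketch under assumptions: signatures are plain strings such as "add(int)", and OverloadRule is a hypothetical name. It reports every method name that existed before and has gained a signature; callers of any method with such a name would then be marked invalid.

```java
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper for the "recompile all callers of an overloaded name" rule. */
final class OverloadRule {
    /** Extracts the method name from a signature string such as "add(int)". */
    static String nameOf(String signature) {
        int paren = signature.indexOf('(');
        return paren < 0 ? signature : signature.substring(0, paren);
    }

    /** Returns the names that already existed and gained at least one new signature. */
    static Set<String> namesWithNewOverloads(Set<String> oldSignatures,
                                             Set<String> newSignatures) {
        Set<String> oldNames = new TreeSet<>();
        for (String signature : oldSignatures) {
            oldNames.add(nameOf(signature));
        }
        Set<String> result = new TreeSet<>();
        for (String signature : newSignatures) {
            boolean isNew = !oldSignatures.contains(signature);
            if (isNew && oldNames.contains(nameOf(signature))) {
                result.add(nameOf(signature));
            }
        }
        return result;
    }
}
```

A brand-new method name is deliberately not reported, since no existing caller can have been bound to it. The rule matters for cases like a caller of add(Object) passing a String: once add(String) exists, recompiling that caller selects the new overload.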

When a class is reparented, it probably makes more sense to mark all of its
dependents invalid, rather than to try to calculate exactly what changed.
Note that changing the interfaces an abstract class implements is a kind of
reparenting, since it can change the set of methods that the class defines.
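
One way to record the dependency from problem C ("C calls a method on an instance of D that's inherited from E") together with the reparenting rule is to keep both the receiver type and the declaring type. The class below is a hypothetical sketch; its names and the two checks are assumptions. It also only looks at D itself, whereas a real tool would have to consider every class on the inheritance path between D and E.

```java
/** Hypothetical record: dependent uses `member` on receiverType, declared in declaringType. */
final class MemberDependency {
    final String dependent;     // C, the class that must be recompiled
    final String receiverType;  // D, the static type the member is accessed on
    final String declaringType; // E, the class that actually declares the member
    final String member;        // e.g. "meth()V"

    MemberDependency(String dependent, String receiverType,
                     String declaringType, String member) {
        this.dependent = dependent;
        this.receiverType = receiverType;
        this.declaringType = declaringType;
        this.member = member;
    }

    /** A change to the member's definition shows up in the declaring class. */
    boolean invalidatedByMemberChange(String changedClass, String changedMember) {
        return declaringType.equals(changedClass) && member.equals(changedMember);
    }

    /** Reparenting the receiver matters when the member was inherited by it. */
    boolean invalidatedByReparenting(String reparentedClass) {
        return receiverType.equals(reparentedClass)
                && !receiverType.equals(declaringType);
    }
}
```

Marking all dependents of a reparented class invalid, as suggested above, amounts to dropping the second condition in invalidatedByReparenting and accepting some unnecessary recompiles.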

I'm sure there are many more of these which further analysis would reveal.
One more note: this is an ideal open source project, since it could be
greatly useful to the development community and there is no money to be made
by solving it.

