Inventor's Paradox (IP)


Simplify by solving a more general problem. Applied to programming this means to simplify by writing a more general program [ Adaptive Programming Book ]. Split the program into several loosely-coupled building blocks or aspects. Avoid unnecessary spreading or duplication of information in program building blocks to improve understandability and to reduce size of programs. Use appropriate filters to select what is relevant to one building block from other building blocks. (This pattern is not about any kind of abstraction mechanism in programming but it focusses on those abstraction mechanisms which significantly simplify the solution by solving a much more general problem.)

The Inventor's Paradox in Mathematics. I find the Octahedron example illuminating. Feedback from Gregor Kiczales


Solve a concrete problem by solving a more general problem. The general problem has paradoxically a simpler solution. But you have to invent an appropriate general problem which covers your concrete problem.

The generalization is done by splitting the concrete problem into (at least) two collaborating problem aspects PA1 and PA2 with matching solution aspects SA1 and SA2 so that the solution to the original problem can be stated as a pair of collaborating partial solutions (s1,s2), where s1 in SA1 and s2 in SA2. The generalization of the original problem is done by asking for a solution s1 in SA1 which works with a family fSA2 of elements in SA2 such that s2 is in the family fSA2.

The two solution aspects SA1 and SA2 refer to common information in such a way that for a solution (s1,s2), element s1 does not reveal too much information about s2. This property allows one s1 to work with a subset of SA2. The information hiding between the aspects leads to a loose coupling between them.

The loose coupling is achieved by using filters in s1 which define what is important about elements in SA2. The solution in s1 is then formulated using this important information only. The slogan is: focus on what is essential from the other aspect and filter out the noise. Solution s1 is expressed in terms of constraints which define a family of elements in SA2 with which s1 results in a solution of the generalized problem.

The formulation above is for problem solving in general. Applied to programming, it means that we write highly generic programs by splitting them into at least two parts, each one addressing one aspect. The partial program for one aspect will solve the problem for a family of elements of the other aspect; that is why it is highly generic.

"Avoid overspecification in your programs" is another way to summarize the purpose. Programs are decomposed into suitable cross-cutting building blocks, i.e. a building block affects many parts of the program. Building blocks which contain redundant information are overspecified and need to be reduced to a description which removes the overspecification. This leads to a solution of a more general problem while at the same time removing the redundancy from the program. The reduced building blocks need to be compiled together into the original or an improved form of the original program. An important issue is how the reduced descriptions collaborate. They need to refer to a common vocabulary so that the compiler can regenerate the original program.

It should be noted that IP is not only about polymorphic programs of the usual kind but about a much larger class of polymorphism which allows wider variation.

Also known as

Polya's Inventors Paradox (named after the mathematician George Polya), Aspect-Oriented Programming , Separation-of-concerns (Mehmet Aksit).


The IP pattern (applied to programming) can be summarized by [see also: OOPSLA-ADAPT-workshop]:
                ABSTRACT PROGRAM
		aspect element s1 in language of aspect SA1 
		aspect element s2 in language of aspect SA2
		program s1 is expressed in terms of constraints
		which define the family of elements of SA2
		with which s1 can work

     ^                                      |
     |                                      |
     |                                      |
generalize				customize
     |                                      |
     |                                      |
     |                                      V

		P = combine(s1, s2)
Each aspect deals with one issue of the application. Each si plays the role of a customizer for the other concern. Customization is done by combining the programs for the different aspects. The generalization should be based on some regularity or redundancy in the concrete solution. The pattern is more powerful if a customizer si is reused many times during the customization process. This leads to elimination of redundancy. But also when the information in one aspect element s1 is spread into non-contiguous parts of the concrete program P, the pattern is useful.

An aspectual decomposition should satisfy the following property: GENERALIZATION FACTOR: the spreading and/or duplication factor of an aspect after customization is "significant". The spreading factor measures how far apart the information in an aspect element is scattered in the customized program and the duplication factor measures how often information in an aspect element is reused in the customized program.

A high-level theme behind IP seems to be the issue of intension versus extension known from logic. We need to deliver a program which requires a set S in its construction. We can explicitly enumerate the elements of S to define the set S and build the program manually. In other words, we define S by extension.

n alternative is to define the set S by intension. This requires that we provide some function f with domain U so that for some u in U: f(u)=S. Function f is selected based on the expected evolution path of the program which uses S. We hope to choose f such that for the next S' there is a u' so that f(u')=S'. In other words, f captures the persistent parts of the program which are unlikely to change. We also select f such that the description of f is much shorter than the size of set S. This way we can leverage a short description of the persistent parts of the program many times as we supply different arguments to create the transient versions.

Above we talk about a set S. But we can view S to be any mathematical structure which is needed for program construction. Intension-oriented programmming leads naturally to at least two different aspects: the domain of f is one set of aspect elements and the range of f is another set of elements of a different aspect.

ntensional definitions are often shorter than their extensional counterparts. The extensional description of the set of natural numbers 3,4,5,6, ... 100 is much longer than the intensional description: f(n) = the set of numbers x greater than 2 and x less than n. f(101) defines the desired set. The same happens in programming: A set of 100 classes can often be captured by a much shorter traversal specification. Or an object graph containing 100 nodes can sometimes be captured by a short sentence.

o we know at least four kinds of successful intensional definitions in programming: succinct subgraph specifications, parsing, context objects, and synchronization patterns which we discuss in turn below: define subgraphs by traversal specifications (class graphs are arguments) and define object graphs by sentences (grammars are arguments), define graph decorations (code) by listing changed vertices and edges (class graphs are arguments), define program decorations with synchronization code attached to relevant classes and edges (sequential programs are arguments).

But IP is about many other kinds of intensional definitions.


Ideally we would like:
|s1|+|s2| << |P|
but the generalization can be useful in other cases since it leads to better program understanding and maintainability.

We can define the reuse/duplication factor of s1: How much of s1 is duplicated in P = combine(s1, s2)?

We can define a spreading factor for s1: How far is the information in s1 spread out in P?


Below we use the word "aspect" as a synonym for building block. It is often suitable to split an aspect into several aspect slices. For example, the behavioral aspect we may slice into the subtasks which collectively make up the behavior. We find that issues of noise filtering are fundamental to good programming using IP. It is often the case that much of the information in one aspect is just noise to the aspect slices of another aspect. We need filters in each aspect slice to extract what is relevant from the other aspects. Each aspect slice usually has an individual view of the other aspects and therefore needs its own filters to focus on what is important.

In other words, aspectual programming abstracts in two ways: it not only abstracts the aspects from the target program but it also abstracts the aspects from each other (using filters).

Focussing one aspect slice only on the essential parts of another aspect leads to loose coupling between aspects and has several reuse benefits. Consider two aspects A1 and A2 and assume that A1 is partitioned intos slices A11, ... ,A1n and that each slice uses one filter on A2 (this is a simplification).

A11		A12		...  A1n
filter1(A2) 	filter2(A2)	...  filtern(A2)
For a transformation T2(A2) it is often the case that combine(A1,A2) = combine(A1,T2(A2)) for inputs which are legal for both programs. In other words, the change in aspect A2 has no impact on aspect A1. The reason is that the filters still work properly also for the modified A2. In other situations, some of the filters in aspect A1 might need a small update.


Examples of good uses of IP

Adaptive programs using succinct traversal specifications [ MOP for Evolution ], adaptive parameter passing: the same customizer is used many times: for each traversal specification. This eliminates a lot of redundancy. In other words:

aspect                          = behavior
slice of behavior aspect        = subtask
filters                         = traversal patterns
apsect containing noise         = class graph

Context objects avoid the spreading of related behavior-modifying-information over many program parts [ Collaborative Behavior Modification ]. If context objects are reused, they also avoid duplication.

Adaptive programming example with succinct traversals [ Adaptive Programming Book ]:

                ABSTRACT PROGRAM
		P1 = AdaptiveProgram, P2 = ClassGraph, P3 = Syntax
     ^                                      |
     |                                      |
     |                                      |
generalize				customize
     |                                      |
     |                                      |
     |                                      V

		P = combine(P1, P2, P3, W1, W2)
		W = combination instructions
		W1= rename classes used in P1 to
			   classes used in P2
                W2= define where Syntax fits into class graph

		combining is done by propagate plus renaming plus
		syntax insertion.

We have: combine(ClassGraph,Syntax,W2) = ClassDictionary
	 P = combine(AdaptiveProgram, ClassDictionary, W1)
i.e., the combining can be done incrementally.
Is combine(ClassGraph,Syntax,W2) = ClassDictionary a good use of IP?


A = "a" B "," C.
B = "b".
C = "c".


A = B C.
B = .
C = .

Combining instructions W2:

A start "a"
  B after ",",
B start "b",
C start "c"

Here |P1| + |P2| + |W2| > |P|
Is this useful? No, since there is no spreading and no duplication. Separation of behavior and structure is good. But separation of structure and syntax does not have enough benefits, unless a good graphical interface is used.

Example of a bad use of IP

The concrete program is a Pascal program without procedures, the abstract program contains procedure calls and procedure definitions. Combining means inlining (of non-recursive procedures). This is too simple a use of the IP pattern since no generalization of the original problem takes place. The procedure code is not spread into non-contiguous parts of the original program; it is essentially copied into the original program. Other uses of IP build on more sophisticated and powerful abstractions. This example shows a good generalization but not a good use of IP since no filters are used.

Compiler pragmas, written in a separate file (e.g. with line number) is a bad "generalization" since it is easier to put the pragmas directly into the source code.


Proper use of the pattern leads to more understandable and maintainable programs. Information in P1 (for example) may be spread into many program parts after combining which would make the concrete program harder to understand. Even better, information in P1 may be duplicated many times in the concrete program. A consequence of the pattern is that related information is kept together and not duplicated.


There are many different ways to implement the pattern. Since the pattern is rather general, the implementation space consists of many techniques:
expressing inventions:
succinct subgraph specifications
context classes and objects

combining building blocks:
computing propagation graphs
finite automata construction and minimization
virtual context tables
renaming, substitution

An important issue in the implementation is whether the combination of the building blocks creates unintended behavior. There is a tradeoff between the complexity of the combination algorithm and the expressiveness of the abstraction/customization mechanism [ Efficient Implementation of AP ].

Related Patterns

Structure-shy Object

Known Uses

Critique: Is IP too general?

We can apply any concept to simple as well as more complex situations. IP is about mega-templates but of course an ordinary template is also a mega-template.

The contribution of IP is to enlarge the granularity of template parameters. For example, we can not only have variables or classes as parameters but even the class graph of the application as actual parameter to a program. Or we can have synchronization information as actual parameter to a sequential program (see synchronization patterns).

The IP slogan "generalize-solve-customize" is certainly widely used. I think that IP proposes specific invention techniques within the invention process promoted by Polya's Inventor's Paradox. IP proposes to 1. invent aspects containing "complex" elements; 2. write elements of one aspect not for specific other aspect elements but for constraints/filters which define families of aspect elements; 3. measure the spreading/duplication factor of an aspect after customization and favor aspectual decompositions with large spreading/duplication factor. To summarize: IP = Polya's Inventor Paradox applied to programming + guidelines to do the inventions.


AUTHOR = "Karl Lieberherr",
TITLE = "Workshop on Adaptable and Adaptive Software",
BOOKTITLE = "OOPSLA '95, Addendum to the proceedings",
YEAR = "1995",
ADDRESS = "Austin, Texas",
PAGES = "149-154",
EDITOR = "S. Bilow and P. Bilow",
OOPSLA '95 workshop (appears in the addendum to the OOPSLA '96 proceedings)
Aspect-oriented programming (AOP), Xerox PARC, Gregor Kiczales, John Lamping,
Crista Lopes et al. We are developing a joint definition for AOP
or whatever we call it.


AUTHOR = "Cristina Videira Lopes and Karl J. Lieberherr",
TITLE = "Abstracting Process-to-Function Relations in Concurrent
Object-Oriented Applications",
BOOKTITLE = ecoop,
YEAR = "1994",
ADDRESS = "Bologna, Italy",
PAGES = "81-99",
EDITOR = "Remo Pareschi and Mario Tokoro",
ECOOP '94 paper by Lopes et. al. 

AUTHOR = "U. Kastens and W. M. Waite",
TITLE = "Modularity and reusability in attribute grammars",
JOURNAL = "Acta Informatica",
YEAR = 1994,
PAGES = "601-627",
MONTH = "",
VOLUME = 31,

Back to AP home page

Karl J. Lieberherr
College of Computer Science, Northeastern University
Cullinane Hall, Boston, MA 02115
Fax: (617) 373 5121