Johan Ovlinger
Module systems are considered de-rigeuer for many functional and procedural languages. Object-Oriented languages are less dependent on modules, as they offer facilities - abstract classes, virtual methods - that can be used to simulate much of what a module can offer.
I argue that Object-Oriented languages indeed do require module systems. The plan of study is to research what exact requirements are for such a system and how best to fulfill them.
Programs, as they grow in size, need to be split up into sections if they are to remain manageable. The different sections need to have clearly defined interfaces that abstract away from details of their implementation. The problem this solves is mainly one of assumptions. By limiting to what is described in the interface the assumptions other parts of the program can make about the section, the section is free to vary as long as it maintains the same interface. In a large program without interfaces, it is almost certain that any change to a section would break some assumption some other section of the program has made.
These sections, together with their interfaces, are commonly referred to as modules. At a minimum, modules should support separate compilation of modules and allow a module to be imported (used) by several other program sections simultaneously.
Module systems are important parts of many procedural and functional languages, but seem not to have gained as large a stature in the Object-Oriented world.
A reason for this may be that classes are seen as light-weight modules in their own right, thus reducing the need for a module system. I think this is a fallacious conclusion. Experience shows that while classes are useful for structural decomposition, they are less useful as mechanisms of functional decomposition. Functional aspects often include several cooperating classes, so while parameterized classes are useful for some tasks, they are not applicable to system construction.
What we would like to find is a module system for an Object-Oriented language that allows an interface to be declared separately from its implementation. Several different implementations should be allowed, and the programmer should be able to specify which implementation is desired. To support a more dynamic style of programming, modules should be first class values, so they can be manipulated like all other objects in the system.
The aim of the research is to arrive at a prototype system that combines ease of code reuse with robustness and flexibility. These goals will be attained by making assumptions explicit and allowing widely varying implementations to present the same interface and to be able to link them at runtime.
The Object-Oriented aspect is captured by exporting classes and methods on these classes in the interface. Thus, interfaces capture both data-structures and behavior written over them, allowing both to be loosely coupled to their implementation. This is in contrast to module systems for functional and procedural languages, which typically only capture behavior, and at best allow the structure of data to be abstracted or partially revealed.
As I understand it, the purpose of a this section is to show that I have figured out which area I want to research and have absorbed some of the relevant previous work in the area. During the coming year, I am supposed to flesh this out and find a niche that has not yet been explored.
Much of the work below is not related to modules per-se, but rather aims to provide the same sort of functionality at a different granularity.
One of the more advanced module systems in practical use is that of ML [22,27]. ML's name space construct is the structure. Structures have interface types, signatures, that determine visibility by constraining the structure to be of the type specified by the signature. Finally, functors allow structures to be abstracted over their imports.
Structures can be linked together using functors to build a variety of runtime modules, enabling a powerful form of genericity. However, structures are not first class values. Thus, the programmer cannot use functors to program dynamic systems that adapt to their usage by linking in different implementations.
Given the etymology of its name, Dylan's [6] module system is surprisingly static. Modules form the unit of import and name-space, packaged up into libraries. Libraries pull in several modules' definitions into one scope, and also specify which of the modules are to be exported. There is no support for higher level module operations or dynamic linking.
Namespaces in C++ [5] are mainly meant to avoid name collisions. Because they don't enforce access privileges [6], they are not really useful for program structuring or genericity.
Modula 3's module system [9] has a flexible system for partially specifying types in module interfaces. However, the types revealed must be generalizations of the true type of an object. The module system can only directly be used to omit parts and classes, and retype parts to more general types. This facility is called partial revelation. Other features include allowing modules to satisfy several interfaces at once and an external linking language that allows generic behavior to be programmed.
Eiffel [17] takes a novel view of modularity. Instead of defining named interfaces to classes, each class defines a set of features. Depending on what the client of the class is, a different set of features is presented. Each feature contains methods and instance variables that the client can access.
While useful in the contractual programming style championed by Meyer [16], features cannot feasibly replace modules. Since the visibility of a feature is specified in terms of the importer, the class would have to be updated for each new import.
The focus on individual rather than groups of classes is evident also in Eiffel's generic features; classes can be parameterized over other classes. I argued in the introduction that the class is too small a unit for import; I feel the same way about generics.
Java [8] provides packages as a mechanism to partition the space of class names. Although it appears to be hierarchical- package names use dot notation- this is not the case. No extra abilities are conferred to packages whose names share a common prefix.
Classes in Java hardwire their imports, so link-time genericity is not an option.
Furthermore, while several systems have been proposed [2,19,20,26] Java lacks even class valued parametericity.
For a systems language, Java's module system would appear to be inadequate, but with some tool support it it apparently acceptable [13]
.
Views were transplanted from relational databases to provide uniform access to OO databases with different schemas [1,11,3]. As such views technology focuses on how to dynamically reclassify existing objects. The approach of course assumes that the objects themselves will be unaware of the process.
Due to their background, the OODB concept of views have never been proposed for programming languages (although [25] uses the term, they don't mean the same thing).
There has been significant research into role dependent behavior [14,28,4,25]. Like any good term, role has come to be applied to a variety of concepts. A particularly good overview is given in [28]. The common theme is that the type and the behavior of an object should vary not only statically by its class but also dynamically by its context. The scope of a role is an object, rather than a set of objects.
Subject Oriented Programming [10] uses a compile time program weaver that takes independent units of behavior and weaves them together according to a specification language. Name collisions and other interactions are resolved by specifications in the weaving language. The result of the weaver is a group of classes that transparently include all of the separate behavior.
Each unit of behavior is a separate name-space, with exports and imports controlled by the weaving language. Obviously, being able to specify each [multi-class] behavioral aspect in a single name-space is a big win (hence C++ namespaces) over the standard organization of a class based system, where structural organization dominates, and behavioral units must be pieced together by the reader.
APPCs [18]s were recently introduced by Lieberherr and Mezini to improve on Ian Holland's contracts. Their main contribution is to allow a more adaptive specification of the component's constituent classes. APPCs are composed using a construct modeled on inheritance. Their exposition implies that a renaming translation like that of SOP is intended.
Ian Holland's Contracts and Lens [12] approach is close to SOP's. A contract is a behavior, possibly spread over a group of classes. Runtime lens objects mediate between contracts to make sure that assumptions are maintained and name collisions resolved. However, the technology assumes a very simple use model that severely restricts its usefulness.
In his Ph.D. thesis Mark Lillibridge [15] explores how first class modules interact with partially abstract types. He derives typing rules and proves soundness results.
The system he proposes supports a very dynamic programming style, allowing module implementations to be swapped at runtime. This is very close to the system I am aiming at, where type safety is guaranteed by module interfaces, and the full power of the core programming language is available to the link system.
Linda Seiter [23,24] introduces the Context Pattern, a multi-class variant of the strategy pattern [7] that allows a set of classes to offer context dependent behavior. To ensure type safety, only the implementation of specially marked methods can be varied dynamically. The operational semantics of context objects suggest that dynamic variations are globally visible. If an object is used simultaneously (i.e. in a multi-threaded environment) in different contexts, it is impossible to tell statically which method definition is active.
Context Patterns do not offer the ability to present different views of a class, but rather the ability to have a number of method definitions be updated in a unit. All of the methods that are updated in a group can share common state, which partially relieves the restriction that only method definitions can be updated.
Existing module systems for procedural and functional languages have focussed on providing abstraction of behavior, and have relatively weak support for abstracting over data structures and types. Workarounds exist, in the form of Abstract Data Types, but ADTs require a style of programming that is not always convenient. For example, one cannot use pattern matching on an ADT in ML, which significantly hinders expressibility.
The workarounds necessary to use ADTs are acceptable in the realms of procedural and [to a lesser extent] functional languages, as they tend to model their data in a value-oriented fashion. In a value-oriented model of data, the values stored are more important than the structure they are stored in. Taking this to the logical extreme, one arrives at an ADT, where only accessor-functions manipulate the values, completely hiding the data representation. Thus, it is not surprising that procedural programmers view the restricted interface provided by an ADT as good programming style rather than a restriction.
However, OO languages tend not to model data in a value-oriented manner. Rather, a structure-oriented data model is the natural choice for OO programmers. Significant amounts of information are stored in the interconnections between - and in the types of - the objects in the domain. Thus, the programming style required by the interface to an ADT conflicts with good OO style. OO programs need access to the data representation - or more precisely - a data representation.
Building on recent work [21], I plan to explore the area of OO module systems. In my opinion, module systems are a necessity as classes are too small to be effective units of reuse.
To promote a clean separation of interface and implementation, a flexible functor system will be provided. This system will improve on ML's functor system by providing interfaces to not only behavior but also data-structures. Automatic translation will ensure that each module interface defines its own type kind and that all type kinds are kept distinct.
[21] implies one specific implementation strategy, which although primitive sketches the general direction intended. Several other strategies are possible, the implications of each will be explored.
A major area of research will necessarily be the correctness and safety of the translation layer.
Smaller areas of research will be in improving the functor system to capture common usage patterns. Results from Karl's Demeter project will most likely be applied towards this end.
Earlier module systems for mainstream languages have allowed only a limited form of abstraction, where a data member is either exported or abstract. Research efforts [10,12,18] have concentrated on mainly behavioral aspects of program modules. To the extent that structural issues have been examined, the mechanisms proposed have been unsuitable for composition.
The contribution of this research will be to provide mechanisms to allow modules to export and abstract over data structures. By allowing a module to export a view of its internal class graph that is not limited to a partial revelation of the internal class graph allows the implementor much greater flexibility to fulfill the interface than prior systems have.
Secondary effects of this flexibility - robustness and code reuse - will be examined in greater detail.
I plan to take a minor in systems. To this goal I am applying the following courses:
COM 3200 Computer Architecture COM 3336 Operating Systems COM 3337 Distributed Operating Systems COM 3640 Parallel Algorithms
| Name: | Johan Ovlinger |
| Address: | 10 Magnolia St |
| Arlington 02474, MA | |
| Email: | johan@ccs.neu.edu |
| Undergraduate Degree: | Bachelor of Science |
| College: | Vesalius College, V.U.B. |
| Major: | Computer Science |
| Date: | May 1995 |
| Date Admitted to NU, PhD: | Sept 1996 |
| CCS assistantships held: | Since 1997 |
| Date Qualifier Passed: | Spring 1997 |
| Major Professor: | Karl Lieberherr |
| Major Field of Study: | Programming Languages |
| Thesis Area: | OO Module systems |
| Minor Field: | Systems |
| Date of Study Plan: | Sept 10, 1998 |
| Proposed Comprehensive Plan: | Summer 1999 |
| Projected Thesis Completion Date: | Late 2000 |
Approved by Advisor(signature and date):
Approved by Graduate Committee(date):
This document was generated using the LaTeX2HTML translator Version 97.1 (release) (July 13th, 1997)
Copyright © 1993, 1994, 1995, 1996, 1997, Nikos Drakos, Computer Based Learning Unit, University of Leeds.
The command line arguments were:
latex2html -split 1 planofstudy.tex.
The translation was initiated by Johan Ovlinger on 11/3/1998