The following is a summary prepared in conjunction with the meeting "Thinking with Diagrams '98".

TwD98 Summary: Understanding Diagrams

Robert P. Futrelle, Northeastern University

 
Type of Diagram(s) Studied
We focus on diagrams from scientific research papers, e.g., x,y data plots. Our goal is to automate the building of structured representations for these diagrams to allow them to be indexed, retrieved, and reasoned about.
 
Representation Medium
We assume that diagrams are available in vector format, i.e., as collections of graphics objects such as lines, polygons, curves, and positioned text. Typical published diagrams contain 100 to 200 objects. Although the World Wide Web has so far concentrated on raster formats such as GIF, this year (1998) has seen a flurry of proposals for vector formats for the Web, e.g., PGML. Once parsed, a diagram is represented by a generalization of the familiar natural language parse tree.
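To make the idea of a vector diagram and its parse tree concrete, here is a minimal sketch in Python. It is an illustration only, not the representation used in our system: the class names (Line, Polygon, Text, ParseNode), the constituent labels, and the coordinates are all hypothetical, and curves are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, float]


@dataclass(frozen=True)
class Line:
    start: Point
    end: Point


@dataclass(frozen=True)
class Polygon:
    vertices: Tuple[Point, ...]


@dataclass(frozen=True)
class Text:
    position: Point
    content: str


# A primitive graphics object as found in a vector-format diagram.
GraphicsObject = Union[Line, Polygon, Text]


@dataclass
class ParseNode:
    """A labelled constituent whose children are sub-constituents
    or primitive graphics objects (hypothetical structure)."""
    label: str
    children: List[Union["ParseNode", GraphicsObject]] = field(default_factory=list)

    def leaves(self) -> List[GraphicsObject]:
        """Collect the primitive objects under this node, depth first."""
        found: List[GraphicsObject] = []
        for child in self.children:
            if isinstance(child, ParseNode):
                found.extend(child.leaves())
            else:
                found.append(child)
        return found


# A tiny x,y plot: two axes and one axis label.
x_axis = Line((0.0, 0.0), (10.0, 0.0))
y_axis = Line((0.0, 0.0), (0.0, 8.0))
x_label = Text((5.0, -1.0), "Time (s)")

plot = ParseNode("xy_plot", [
    ParseNode("x_axis", [x_axis, x_label]),
    ParseNode("y_axis", [y_axis]),
])
```

Calling plot.leaves() walks the tree and returns the three primitive objects in depth-first order. A structure of this kind is what would let a diagram be indexed or queried by constituent (for example, "the label of the x axis") rather than by raw coordinates.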
 
Accompanying Non-Diagrammatic Representations
The major non-diagrammatic information that accompanies virtually all technical diagrams is text. There is text within diagrams, in captions, and in running text.
 
Context of Diagram Use / Application Domain
Diagrams in science are important for capturing and communicating information. See Bruno Latour's excellent discussion of the role of diagrams in the development of science and technology (Latour, B. (1990). Drawing things together. In M. Lynch & S. Woolgar (Eds.), Representation in Scientific Practice (pp. 19-68). Cambridge, MA: MIT Press.)
 
Diagrammatic Research Question / Issue
We are continuing to refine our successful work on diagram parsing. So far, we have grammars for x,y data plots, gene diagrams, and state machines. We use context-based constraint grammars (Futrelle, R. P. and Nikolakis, N. (1995). Efficient Analysis of Complex Diagrams using Constraint-Based Parsing. In ICDAR-95 (Intl. Conf. on Document Analysis & Recognition) (pp. 782-790). Montreal, Canada.) For a recent review of visual language parsing and related topics, see Marriott, K. and Meyer, B. E. (Eds.). (1998). Visual Language Theory. Springer-Verlag.
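The general idea of a grammar rule whose constituents must satisfy geometric constraints can be sketched as follows. This is a deliberately simplified illustration, not the formalism or algorithm of the cited paper: the rule ("an x axis is a long horizontal segment with short vertical ticks touching it"), the function names, and the threshold max_tick_ratio are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Segment:
    start: Point
    end: Point

    def length(self) -> float:
        dx = self.end[0] - self.start[0]
        dy = self.end[1] - self.start[1]
        return (dx * dx + dy * dy) ** 0.5


def is_horizontal(s: Segment, tol: float = 1e-6) -> bool:
    return abs(s.start[1] - s.end[1]) <= tol


def is_vertical(s: Segment, tol: float = 1e-6) -> bool:
    return abs(s.start[0] - s.end[0]) <= tol


def touches(tick: Segment, axis: Segment, tol: float = 1e-6) -> bool:
    """True if an endpoint of the tick lies on the horizontal axis."""
    axis_y = axis.start[1]
    x_lo, x_hi = sorted((axis.start[0], axis.end[0]))
    return any(
        abs(y - axis_y) <= tol and x_lo - tol <= x <= x_hi + tol
        for x, y in (tick.start, tick.end)
    )


def find_x_axis(
    segments: List[Segment], max_tick_ratio: float = 0.1
) -> Optional[Tuple[Segment, List[Segment]]]:
    """Hypothetical rule: X_AXIS -> AXIS_LINE TICK+, where each tick is
    vertical, short relative to the axis, and touches the axis."""
    horizontals = sorted(
        (s for s in segments if is_horizontal(s)),
        key=Segment.length,
        reverse=True,
    )
    for axis in horizontals:  # try the longest candidates first
        ticks = [
            s for s in segments
            if s is not axis
            and is_vertical(s)
            and s.length() <= max_tick_ratio * axis.length()
            and touches(s, axis)
        ]
        if ticks:
            return axis, ticks
    return None


segments = [
    Segment((0.0, 0.0), (10.0, 0.0)),   # candidate x axis
    Segment((2.0, 0.0), (2.0, -0.3)),   # tick
    Segment((5.0, 0.0), (5.0, -0.3)),   # tick
    Segment((0.0, 0.0), (0.0, 8.0)),    # y axis: vertical but too long to be a tick
    Segment((1.0, 3.0), (4.0, 3.0)),    # unrelated horizontal stroke
]
result = find_x_axis(segments)
```

Here the constraints (orientation, relative length, contact) prune the candidate objects before any grouping is attempted, which is the intuition behind using constraints to keep parsing efficient on diagrams with a hundred or more objects.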
 
Discipline(s)
All this work is done in the author's Biological Knowledge Laboratory, College of Computer Science, Northeastern University, Boston, MA, which is devoted to artificial intelligence research on the foundations of scientific knowledge.
 
Research Approach / Methodology
We have yet to do experimental cognitive studies.
 
Main Diagrammatic Outcomes: Findings / Theories / Principles
All automated reasoning about diagrams requires some symbolic representation of diagram structure, which is why we focus on diagram parsing and understanding. Our parsing system is retargetable and efficient.
 
Links and References
My home page: http://www.ccs.neu.edu/home/futrelle/. Email me: futrelle@ccs.neu.edu. My doctoral student, Nikos Nikolakis, has posted a number of our papers: http://www.ccs.neu.edu/home/nikos/papers-index.html
 
Other Information
A forthcoming chapter explores a new area that is a challenge for reasoning about diagrams: How could we develop automated approaches to reducing a diagram to its essence, producing a summary or abstract that is simpler than the original? See Futrelle, R. P. (1999). Summarization of Diagrams in Documents. In I. Mani & M. Maybury (Eds.), Advances in Automatic Text Summarization. Cambridge, MA: MIT Press.