Thinking with Diagrams 98 - Summary of RPF's work

The following is a summary prepared in conjunction with the meeting "Thinking with Diagrams '98".

TwD98 Summary: Understanding Diagrams

Robert P. Futrelle, Northeastern University


Type of Diagram(s) Studied: We focus on diagrams from scientific research papers, e.g., x,y data plots. Our goal is to automate the building of structured representations for these diagrams to allow them to be indexed, retrieved, and reasoned about.

Representation Medium: We assume that diagrams are available in vector format, i.e., as collections of graphics objects such as lines, polygons, curves, and positioned text. Typical published diagrams contain 100 to 200 such objects. Although the World Wide Web has so far concentrated on raster formats, e.g., GIF, this year (1998) has seen a flurry of proposals for vector formats for the Web, e.g., PGML. Once a diagram is parsed, its representation is a generalization of the familiar natural language parse tree.
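To make the two representations above concrete, here is a minimal sketch in Python of a diagram as a flat collection of vector objects and of a tree built over them. Every name in it (Line, Text, ParseNode, the category labels) is an assumption made for illustration; it is not the laboratory's actual data structure, and it shows only one simple way a parse tree might be laid over graphics objects.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Point = Tuple[float, float]


@dataclass
class Line:
    """A straight line segment, one kind of vector graphics object."""
    start: Point
    end: Point


@dataclass
class Text:
    """A text string positioned at a point in the diagram."""
    position: Point
    content: str


Primitive = Union[Line, Text]


@dataclass
class ParseNode:
    """Hypothetical tree node: a named constituent that groups graphics
    objects directly and/or through sub-constituents."""
    category: str
    primitives: List[Primitive] = field(default_factory=list)
    children: List["ParseNode"] = field(default_factory=list)

    def leaves(self) -> List[Primitive]:
        """Collect every graphics object covered by this constituent."""
        found = list(self.primitives)
        for child in self.children:
            found.extend(child.leaves())
        return found


# A toy x,y data plot: two axes, each an axis line plus its label.
x_axis = ParseNode("x-axis", primitives=[
    Line((0.0, 0.0), (10.0, 0.0)),
    Text((5.0, -1.0), "Time (s)"),
])
y_axis = ParseNode("y-axis", primitives=[
    Line((0.0, 0.0), (0.0, 10.0)),
    Text((-1.0, 5.0), "Voltage (V)"),
])
plot = ParseNode("xy-data-plot", children=[x_axis, y_axis])
```

Calling plot.leaves() returns the four underlying graphics objects, so a query against the structured form can always be traced back to the original vector primitives.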

Accompanying Non-Diagrammatic Representations: The major non-diagrammatic information that accompanies virtually all technical diagrams is text. There is text within diagrams, in captions, and in running text.

Context of Diagram Use / Application Domain: Diagrams in science are important for capturing and communicating information. See Bruno Latour's excellent discussion of the role of diagrams in the development of science and technology (Latour, B. (1990). Drawing things together. In M. Lynch & S. Woolgar (Eds.), Representation in Scientific Practice (pp. 19-68). Cambridge, MA: MIT Press.)

Diagrammatic Research Question / Issue: We are continuing to refine our successful work on diagram parsing. So far, we have grammars for x,y data plots, gene diagrams, and state machines. We use context-based constraint grammars (Futrelle, R. P. and Nikolakis, N. (1995). Efficient Analysis of Complex Diagrams using Constraint-Based Parsing. In ICDAR-95 (Intl. Conf. on Document Analysis & Recognition), (pp. 782-790). Montreal, Canada.) For a recent review of visual language parsing and related topics, see Marriott, K. and Meyer, B. E. (Eds.). (1998). Visual Language Theory. Springer-Verlag.
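As a rough illustration of what a geometric constraint in a diagram grammar can look like, the Python sketch below recognizes an "x-axis" constituent as a horizontal line with short vertical tick marks touching it. This is a toy written for this summary under stated assumptions, not the algorithm of the ICDAR-95 paper: the Segment type, the function names, and the thresholds max_tick_ratio and min_ticks are all hypothetical.

```python
import math
from typing import List, NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """A straight line segment from (x1, y1) to (x2, y2)."""
    x1: float
    y1: float
    x2: float
    y2: float


def length(s: Segment) -> float:
    return math.hypot(s.x2 - s.x1, s.y2 - s.y1)


def is_horizontal(s: Segment, tol: float = 1e-6) -> bool:
    return abs(s.y1 - s.y2) <= tol


def is_vertical(s: Segment, tol: float = 1e-6) -> bool:
    return abs(s.x1 - s.x2) <= tol


def touches(tick: Segment, axis: Segment, tol: float = 1e-6) -> bool:
    """Constraint: a vertical tick has an endpoint on a horizontal axis."""
    lo, hi = sorted((axis.x1, axis.x2))
    on_span = lo - tol <= tick.x1 <= hi + tol
    meets = min(abs(tick.y1 - axis.y1), abs(tick.y2 - axis.y1)) <= tol
    return on_span and meets


def find_x_axis(
    segments: List[Segment],
    max_tick_ratio: float = 0.1,  # assumed: ticks are at most 10% of axis length
    min_ticks: int = 2,           # assumed: an axis needs at least two ticks
) -> Optional[Tuple[Segment, List[Segment]]]:
    """Return the first horizontal segment satisfying the tick constraints,
    together with its ticks, or None if no segment qualifies."""
    for axis in segments:
        if not is_horizontal(axis) or length(axis) == 0:
            continue
        ticks = [
            s for s in segments
            if s is not axis
            and is_vertical(s)
            and 0 < length(s) <= max_tick_ratio * length(axis)
            and touches(s, axis)
        ]
        if len(ticks) >= min_ticks:
            return axis, ticks
    return None


segments = [
    Segment(0, 0, 10, 0),    # candidate x-axis line
    Segment(2, 0, 2, -0.3),  # tick
    Segment(5, 0, 5, -0.3),  # tick
    Segment(8, 0, 8, -0.3),  # tick
    Segment(0, 0, 0, 10),    # long vertical line: too long to be a tick
]
result = find_x_axis(segments)
```

The point of the sketch is that constituents are defined by spatial relations (touching, relative length, orientation) rather than by left-to-right adjacency, as they would be in a string grammar.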

Discipline(s): All this work is done in the author's Biological Knowledge Laboratory, College of Computer Science, Northeastern University, Boston, MA, which is devoted to artificial intelligence research on the foundations of scientific knowledge.

Research Approach / Methodology: We have yet to do experimental cognitive studies.

Main Diagrammatic Outcomes: Findings / Theories / Principles: Any automated reasoning about diagrams requires some symbolic representation of diagram structure, which is why we focus so strongly on diagram parsing and understanding. Our parsing system is retargetable and efficient.

Links and References: My home page: http://www.ccs.neu.edu/home/futrelle/. Email me: futrelle@ccs.neu.edu. My doctoral student, Nikos Nikolakis, has posted a number of our papers: http://www.ccs.neu.edu/home/nikos/papers-index.html

Other Information: A forthcoming chapter explores a new area that is a challenge for reasoning about diagrams: how could we develop automated approaches to reducing a diagram to its essence, producing a summary or abstract of it that is simpler than the original? See Futrelle, R. P. (1999). Summarization of Diagrams in Documents. In I. Mani & M. Maybury (Eds.), Advances in Automated Text Summarization. Cambridge, MA: MIT Press.