Freshman Honors, Sp04 - Multimedia and Diagrams - Prof. Futrelle

Professor Futrelle - College of Computer and Information Sciences, Northeastern U., Boston, MA

For the February 24th class

Version of 6 March 2004

An extensive new page about the work of an Honors student

This page discusses the work of Dan Crispell, a former Honors student here who's now in the PhD program at Brown University. He worked on turning diagrams from the literature into structured objects that we can then use for analysis of the content of diagrams. (Added March 6th)

Multimedia = TV, movies, pictures, diagrams, and more

TV and the movies are the glamorous face of much that we see around us. But to be realistic, most of us will make our contributions and achieve our success in less-than-starring roles. Few of us will end up in the NBA, direct an Oscar-winning movie, or have a platinum album. Short of this, there are many challenging and important jobs to do. Take health care and biotechnology. These are multibillion-dollar industries that use computers in myriad ways. The important and life-saving results discovered and developed by biomedical scientists are typically reported in the biological research literature. Its journals are where the latest and greatest medical discoveries are announced before they are reported in the newspapers and on TV news.

Your assignment

See this page for details.

Figures and diagrams are the language of science

Though we think of the results of science and medicine as being written down in text, or natural language as we call it, much of what is presented is in the form of figures, both diagrams and photos. In fact, an analysis of papers in the major biology journals shows that a full 50% of every paper is devoted to figures! I arrived at this number by adding up the space taken by the figures themselves, the words in the caption text, and the portions of the non-caption text devoted to discussing the content of the figures. 50% is a lot. It means that major search engines such as PubMed and Google miss an enormous amount of critically important information.
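The arithmetic behind that estimate can be sketched in a few lines of Python. This is only an illustration, not the actual analysis: the function name `figure_share`, the choice to measure everything in a single unit of page area, and the sample numbers are all assumptions made up for the example.

```python
def figure_share(figure_area, caption_area, discussion_area, total_area):
    """Fraction of a paper devoted to figures.

    All four arguments are assumed to be in the same unit
    (for example, column-inches or fractions of a page).
    """
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    # Add the three figure-related parts, then divide by the whole paper.
    return (figure_area + caption_area + discussion_area) / total_area


# Made-up numbers for one hypothetical 10-page paper, for illustration only.
share = figure_share(figure_area=3.0, caption_area=1.0,
                     discussion_area=1.0, total_area=10.0)
print(f"{share:.0%} of the paper is figure-related")
```

Averaging this fraction over many papers would give a journal-wide estimate. The hard part in practice is deciding which sentences of the running text count as "discussing a figure".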

The problem of turning the important information in figures and graphics into "first-class" information that can be searched for and retrieved is the major topic of this talk. I will discuss at some length the work done by an undergraduate, Dan Crispell, to make millions of scientific diagrams available for analysis, indexing, retrieval and interaction.

Here's a page from a Stanford lab that's full of cool biology diagrams. Click on the images to get large (huge!) versions of them.

Important topics that I'll also discuss:
