Thoughts on SVP, by Bob Futrelle, 10/4/2003

Let's review the key ideas behind the Strategic Vectorization Project -- the original ideas, which are still the basic ones. Only some of them have been implemented so far, and only in rather specific ways. Recently, my tests showed that the strategies used so far need additions and changes if SVP is to achieve its full potential. (We might call what the SVP does SVA, for Strategic Vectorization Analysis, or just SV, for Strategic Vectorization.)

SV, as I originally conceived it, has some powerful notions. The two most important are:

1. Models of objects to be found, e.g., lines as black cores with white wings. (A rough sketch of such a model test appears just after this list.)

2. Searching for "clean" portions of structures and extending the searches from those portions.
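
To make idea 1 concrete, here's a rough sketch (in Python, purely for illustration -- the function name and thresholds are mine, not the project's) of testing whether a cross-section of pixels matches the black-core-with-white-wings model:

    import numpy as np

    def matches_core_with_wings(profile, core_width, black_thresh=80, white_thresh=200):
        """profile: 1-D gray-value array sampled perpendicular to a candidate line,
        centered on the candidate's axis.  True if the middle ~core_width samples
        are dark and the samples flanking them (the wings) are light."""
        profile = np.asarray(profile, dtype=float)
        c = len(profile) // 2
        half = max(core_width // 2, 1)
        core = profile[c - half : c + half + 1]
        left, right = profile[: c - half], profile[c + half + 1 :]
        if len(left) == 0 or len(right) == 0:
            return False                      # profile too short to show any wings
        return (core.mean() < black_thresh and
                left.mean() > white_thresh and right.mean() > white_thresh)

    # Example: a 9-pixel cross-section of a ~3-pixel-wide black line on white paper.
    print(matches_core_with_wings([255, 255, 250, 10, 5, 12, 248, 255, 255], core_width=3))  # True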

What we've done so far (mainly what *Dan* has done ;-) is to use PCAs to find clean portions of lines and work from there, finding trouble spots and applying specific approaches to them, including the little fix I made to the process recently.
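
For reference, here is a sketch of the local PCA step as I understand it (my reconstruction, not Dan's actual code): take the black-pixel coordinates inside a small box, compute their principal components, and treat a strongly elongated distribution as a clean piece of line, with the first eigenvector giving its orientation.

    import numpy as np

    def box_pca(black_pixels):
        """black_pixels: (N, 2) array of (row, col) coordinates of the black pixels
        inside one box.  Returns (principal direction as a unit vector, elongation),
        where elongation = lambda1/lambda2; a large value means a clean, line-like box."""
        pts = np.asarray(black_pixels, dtype=float)
        if len(pts) < 3:
            return None, 0.0                          # too few pixels to say anything
        cov = np.cov(pts - pts.mean(axis=0), rowvar=False)
        eigvals, eigvecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
        return eigvecs[:, 1], eigvals[1] / max(eigvals[0], 1e-9)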

Broader scenarios

But let's imagine some broadened scenarios that might deal more effectively with the problems we've found lately. These scenarios may initially involve more CPU time, but that's of little consequence for now.

A lot of the ideas below involve some type of adaptive computation, in which we gather information at various points and let that affect the later analyses. This is something we've discussed in the past, but never seriously pursued.

Noisy images -- Not central

First, I'll say that we should not focus too much on very degraded, noisy images. That's simply not our job. Our job is not to go into dusty attics, retrieve ancient and barely legible hardcopies of diagrams, and vectorize them. Our job is to deal with the millions of GIFs and JPEGs that are published every year. The ASM pubs, for example, have electronic journals back to 1995. That's quite enough for us to concentrate on.

That said, I will toss in a few thoughts on noisy images: Local statistics would tell us that there is a high noise level, especially salt-and-pepper noise. Once that's known, parameters for the PCAs could be adjusted, e.g., using a larger window. Once even a single segment of a noisy line is found, it could be used to shape the models of lines to be searched for elsewhere.
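
One possible local statistic, purely as an illustration: count the black pixels that have no black neighbors -- a signature of salt-and-pepper noise -- and use that fraction to decide whether to enlarge the PCA window.

    import numpy as np

    def isolated_black_fraction(img, black_thresh=128):
        """Fraction of black pixels with no black 8-neighbors: a crude
        salt-and-pepper indicator.  img is a 2-D gray image, darker = lower."""
        black = img < black_thresh
        padded = np.pad(black, 1)                     # one-pixel border of white
        neighbors = sum(np.roll(np.roll(padded, dr, 0), dc, 1)
                        for dr in (-1, 0, 1) for dc in (-1, 0, 1)
                        if (dr, dc) != (0, 0))[1:-1, 1:-1]
        isolated = black & (neighbors == 0)
        return isolated.sum() / max(black.sum(), 1)

    # If this fraction is high, use a larger PCA box (and perhaps looser model
    # thresholds) before starting the usual analysis.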

Gathering initial statistics for later guidance

In our standard corpus of not-so-noisy images, gathering statistics could be quite useful. For example, histograms could tell us whether there are gray or colored sections of the images. The image could be scanned in the horizontal, vertical, and both 45-degree directions, and the widths of the black (or gray, or whatever) runs on those scans could give us a quick estimate of line widths. In fact, the narrower of the widths found are the ones we want, because any line scanned other than perpendicular to its axis appears wider than it actually is. There are fancier ways to find width distributions, but four scan directions should do it.
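
Here is a sketch of the four-direction run-length scan; the threshold and the choice of a low percentile as the summary statistic are mine, not settled policy.

    import numpy as np

    def _run_lengths(line):
        """Lengths of consecutive True runs in a 1-D boolean array."""
        padded = np.concatenate(([False], line, [False]))
        changes = np.flatnonzero(padded[1:] != padded[:-1])
        return changes[1::2] - changes[0::2]          # run ends minus run starts

    def estimate_line_width(img, black_thresh=128):
        """Scan rows, columns, and both diagonals, collect black run lengths, and
        take a low percentile as the width estimate, since any scan that is not
        perpendicular to a line makes it look wider than it really is."""
        black = img < black_thresh
        scans = list(black) + list(black.T)           # horizontal and vertical scans
        flipped = black[:, ::-1]
        for off in range(-black.shape[0] + 1, black.shape[1]):
            scans.append(np.diagonal(black, off))     # 45-degree scans
            scans.append(np.diagonal(flipped, off))   # 135-degree scans
        widths = np.concatenate([_run_lengths(s) for s in scans])
        return float(np.percentile(widths, 10)) if widths.size else 0.0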

Doing PCAs with various box sizes

Doing PCA analyses at a set of different box sizes, and even different shapes, might be helpful. Large boxes miss small segments but overcome noise; small boxes find small segments but are more noise-sensitive.
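
A sketch of what running the local PCA at several box sizes around one seed pixel might look like; the box sizes and the elongation score are placeholders.

    import numpy as np

    def best_box_pca(img, row, col, box_sizes=(9, 17, 33), black_thresh=128):
        """Run the black-pixel PCA in boxes of several sizes centered on (row, col)
        and return (elongation, box size) for the most line-like box, or None."""
        best = None
        for size in box_sizes:
            h = size // 2
            window = img[max(row - h, 0): row + h + 1, max(col - h, 0): col + h + 1]
            ys, xs = np.nonzero(window < black_thresh)
            if len(ys) < 3:
                continue                                  # too few black pixels to fit
            lam = np.linalg.eigvalsh(np.cov(np.column_stack([ys, xs]), rowvar=False))
            elongation = lam[1] / max(lam[0], 1e-9)       # big = strongly linear
            if best is None or elongation > best[0]:
                best = (elongation, size)
        return best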

One wing and no-wing analyses up front

Using a one-wing or no-wing (core-only) model up front might help. That is, don't wait until there are problems and then reduce the model; just try, say, a black core as the first step. Then we wouldn't have to try to join trouble spots. PCAs get confused near trouble spots, so those regions have to be ignored, which "pushes back" the clean line segments away from the trouble spots and makes it necessary to bridge larger gaps. A pure-core or one-wing model could just plow through.
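
Here is a sketch of the "core only, just plow through" idea: from a clean seed, step along the estimated direction and keep going as long as the core pixels stay dark, tolerating short bright gaps (say, where another line crosses). No wing test at all. The gap tolerance and threshold are illustrative.

    def trace_core(img, start, direction, black_thresh=128, max_gap=3, max_steps=10000):
        """start: (row, col); direction: (drow, dcol) unit vector.
        Returns how many steps the dark core could be followed."""
        r, c = float(start[0]), float(start[1])
        dr, dc = direction
        gap, last_dark = 0, 0
        for step in range(1, max_steps + 1):
            r, c = r + dr, c + dc
            i, j = int(round(r)), int(round(c))
            if not (0 <= i < img.shape[0] and 0 <= j < img.shape[1]):
                break                                 # ran off the image
            if img[i, j] < black_thresh:
                gap, last_dark = 0, step              # still on the dark core
            else:
                gap += 1                              # bright pixel: a possible trouble spot
                if gap > max_gap:
                    break                             # too long a gap; the line has ended
        return last_dark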

Adjusting line orientations for optimal fits

We should not let PCAs be the sole determinant of the orientation of the line model to be fitted. This is an important point. The line models are great; they're a powerful idea. But if they're only allowed to be created at orientations determined by PCAs, there can be problems. So a (computationally more expensive) idea might be to allow not only the line width to be adjusted but the line orientation as well, doing a simultaneous search in the space of the two edges and the orientation. Given all the machinery that Dan has created so far, this is not at all difficult to do. There are two ways this might be done. One is to search off the initial orientation by fixed angles. The other is to let the PCA return some information, based on an estimate of its uncertainty, about how much variation in the orientation should be examined.
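
A sketch of the first variant -- searching fixed angular offsets around the PCA orientation, jointly with the width. The scoring function here ("dark core, light wings" along a short stretch) is a stand-in for whatever Dan's model fitter actually computes.

    import numpy as np

    def sample(img, r, c):
        """Gray value at (r, c), treating off-image points as white."""
        i, j = int(round(r)), int(round(c))
        if 0 <= i < img.shape[0] and 0 <= j < img.shape[1]:
            return float(img[i, j])
        return 255.0

    def score_line(img, center, theta, width, half_len=8):
        """Mean wing brightness minus mean core brightness for a candidate line
        through `center` at angle `theta` (radians); bigger is better."""
        axis = np.array([np.cos(theta), np.sin(theta)])
        normal = np.array([-np.sin(theta), np.cos(theta)])
        core, wings = [], []
        for t in range(-half_len, half_len + 1):
            p = np.asarray(center, dtype=float) + t * axis
            core.append(sample(img, p[0], p[1]))
            for s in (-(width / 2 + 1), width / 2 + 1):
                q = p + s * normal
                wings.append(sample(img, q[0], q[1]))
        return np.mean(wings) - np.mean(core)

    def refine_line(img, center, theta0, width0,
                    d_thetas=np.radians((-10, -5, 0, 5, 10)),
                    d_widths=(-1, 0, 1)):
        """Try each (orientation offset, width offset) pair and keep the best fit."""
        candidates = [(score_line(img, center, theta0 + dt, width0 + dw),
                       theta0 + dt, width0 + dw)
                      for dt in d_thetas for dw in d_widths if width0 + dw >= 1]
        return max(candidates)        # (score, refined theta, refined width)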

Merging PCAs -- Any changes needed?

The current approach to merging similar PCAs is just one way to do it. Refinements, variants, or alternatives to the current approach, aimed at cutting down the large number of PCAs generated, should be thought through and tried if it's determined that changes might help the process. Trying various line orientations might be enough, so that no changes are needed in PCA merging.
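
For concreteness, here is one simple similarity test of the kind such merging could use (the real criteria are in Dan's code and may well differ): merge two local PCA results when their centers are close and their orientations are nearly parallel.

    import numpy as np

    def should_merge(center_a, dir_a, center_b, dir_b,
                     max_center_dist=10.0, max_angle_deg=5.0):
        """dir_a, dir_b: unit direction vectors of two local PCA results."""
        close = np.linalg.norm(np.subtract(center_a, center_b)) <= max_center_dist
        cos_angle = abs(np.dot(dir_a, dir_b))         # sign-insensitive parallelism test
        parallel = cos_angle >= np.cos(np.radians(max_angle_deg))
        return close and parallel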

Retain uncertainties in objects created

In general, when any object is created, some indication of the uncertainty of its parameters could be retained along with the parameters themselves, allowing downstream algorithms to act on that additional information when it's helpful.
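
One lightweight way to carry the uncertainties along (a sketch; the field names are mine):

    from dataclasses import dataclass

    @dataclass
    class LineSegment:
        x0: float
        y0: float
        x1: float
        y1: float
        width: float
        width_sigma: float          # uncertainty of the fitted width, in pixels
        orientation_sigma: float    # uncertainty of the fitted orientation, in radians

    # A downstream step could, for instance, widen its angular search in proportion
    # to orientation_sigma when it tries to extend or join segments.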

OCR

When the SVP vs. full-image differences are shown, they're dominated by text and some other odds and ends. We need to face up to the fact that text is a biggy and must be dealt with. No concrete thoughts here at the moment. Clara OCR may not be an option, though it is free -- it's in C and, more to the point, Windows-centric (little-endian). And JavaOCR is commercial (I've sent them mail about an academic license). There are a bunch of grad students who want to work on image processing -- not sure if I could get them interested in the SVP. Worth a try.

PS: I need to post some of those lovely colored segment shots in our SVP webspace.

