Summary of papers
Mingyan Shao
March 17, 2004

Edge detection algorithms:

Edge Measures Using Similarity Regions
by Maneesh Kumar Singh (2003 PhD) and Narendra Ahuja (UIUC)

Location- and Density-Based Hierarchical Clustering Using Similarity Analysis  by Peter Bajcsy (1999 PhD) and Narendra Ahuja (UIUC)



Classical algorithms:


29. Feature analysis using line sweep thinning algorithm
Fu Chang; Ya-Ching Lu; Pavlidis, T.
Page(s): 145-158
A related one:

28. A Thinning Algorithm by Contour Generation

27. An Integrated Line Tracking and Vectorization Algorithm

25. A vectorization algorithm of closed regions in raster images
@misc{ fernando-vectorization,
author = "Fernando",
title = "A Vectorization Algorithm of Closed Regions in Raster Images",
url = "citeseer.ist.psu.edu/557642.html" }

Line image vectorization based on shape partitioning and merging
Ju Jia Zou; Hong Yan;
Pattern Recognition, 2000. Proceedings. 15th International Conference on , Volume: 3 , 3-7 Sept. 2000
Pages:994 - 997 vol.3

Mingyan: This paper works on line images. The authors divide skeletonization methods into two classes: direct and indirect. The former takes the image as a whole; the latter concentrates on parts of the image. This paper follows the second approach. A common problem with skeletonization is that the generated skeleton may not agree with people's perception of the original shape, e.g., an intersection is often split into two or more junction points. The paper first partitions the image into non-overlapping triangles, then merges the skeleton parts together. When dealing with intersections, the authors assume that a branch of a line will not change its direction after entering the intersection. The results look good.
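The direction-continuity assumption used at intersections can be illustrated with a small sketch (my own illustration, not the authors' code): given the unit direction vectors of the skeleton branches meeting at a junction, pair up the branches that continue each other, i.e., whose directions are nearly opposite.

    import itertools
    import math

    def pair_branches(directions, angle_tol_deg=15.0):
        """Greedily pair branch direction vectors (pointing away from the
        junction) that continue each other, i.e., are nearly opposite."""
        pairs, used = [], set()
        cos_tol = math.cos(math.radians(180.0 - angle_tol_deg))
        for i, j in itertools.combinations(range(len(directions)), 2):
            if i in used or j in used:
                continue
            (xi, yi), (xj, yj) = directions[i], directions[j]
            # a dot product near -1 means the two branches are nearly collinear
            if xi * xj + yi * yj < cos_tol:
                pairs.append((i, j))
                used.update((i, j))
        return pairs

    # Example: an X-junction with four branches; 0 pairs with 2, 1 with 3.
    print(pair_branches([(1, 0), (0, 1), (-1, 0), (0, -1)]))  # [(0, 2), (1, 3)]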


Line net global vectorization: an algorithm and its performance evaluation
Jiqiang Song; Feng Su; Jibing Chen; Chiew-Lan Tai; Shijie Cai;
Computer Vision and Pattern Recognition, 2000. Proceedings. IEEE Conference on , Volume: 1 , 13-15 June 2000
Pages:383 - 388 vol.1

Mingyan: Existing algorithms do vectorization in two steps: crude vectorization and post-processing. The first step extracts candidate line segments; the second extends and combines the line segments into real graphic entities, like intersections, arcs, etc. The authors of this paper present a global algorithm. It finds a Seed Segment (SS) and uses its direction and width to guide tracking along that direction to find more pixels. After tracking based on this SS, the raster pixels processed are deleted, and another SS is found to continue, until all pixels are processed. During Direction-Guided Tracking, intersection entities are found based on three categories of junctions: PJ (Perpendicular Junction), OJ (Oblique Junction), and UJ (Undetermined Junction). Based on the junctions' priority, each intersection is checked against these junction types. The authors show that this algorithm is the fastest compared with SPV (Sparse Pixel Vectorization) and SBV (Skeletonization-Based Vectorization).
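A much-simplified sketch of the seed-segment idea (the function and its details are my reconstruction, not the paper's algorithm): march from a seed point along the estimated line direction, absorbing and erasing ink pixels, and stop when the line ends.

    import numpy as np

    def track_from_seed(img, seed, direction, width, step=2.0, max_steps=100000):
        """Direction-guided tracking over a boolean ink image (True = ink).
        Collects points along `direction` (a unit (dy, dx) vector) and
        erases visited pixels so each is processed only once."""
        h, w = img.shape
        dy, dx = direction
        y, x = map(float, seed)
        points = []
        for _ in range(max_steps):
            yi, xi = int(round(y)), int(round(x))
            if not (0 <= yi < h and 0 <= xi < w) or not img[yi, xi]:
                break                      # ran off the line or the image
            points.append((yi, xi))
            r = max(1, int(width) // 2)    # erase a width-sized neighbourhood
            img[max(0, yi - r):yi + r + 1, max(0, xi - r):xi + r + 1] = False
            y, x = y + step * dy, x + step * dx
        return points

The real algorithm additionally classifies what it meets at the end of a tracked run as a PJ/OJ/UJ junction (in priority order) and continues tracking through it.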



Sparse pixel vectorization: an algorithm and its performance evaluation (Read again if needed)
Dori, D.; Wenyin Liu;
Pattern Analysis and Machine Intelligence, IEEE Transactions on , Volume: 21 , Issue: 3 , March 1999
Pages:202 - 215


Mingyan: In my opinion, this is a good paper: good idea, good experimental evaluation and comparisons. The authors divide the existing algorithms into two classes: thinning-based and non-thinning-based methods. For each class, they analyze the advantages and disadvantages. (Thinning: + maintains line connectivity; - loses line width information. Non-thinning: + preserves edges and contours; - leaves gaps in the vectors and needs post-processing.) This paper's algorithm belongs to the second class. SPV works in two steps: first find the medial-axis chain, then refine the result, focusing on junctions because the first step may fail at intersections.
This algorithm is familiar to me because Bob tried this before.
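The medial-axis step can be pictured with a small sketch (my reconstruction of the general idea, not Dori & Liu's code): at sparse positions along the line, probe perpendicular to the travel direction until both edges are found, and take the midpoint as a medial-axis point; the same probe yields the local width.

    import numpy as np

    def medial_point(img, y, x, normal):
        """Probe from (y, x) in +/- `normal` (a unit (dy, dx) vector
        perpendicular to the line) until leaving the ink, and return the
        midpoint and the local width. `img` is a boolean ink image."""
        def edge(sign):
            t = 0
            while True:
                yi = int(round(y + sign * t * normal[0]))
                xi = int(round(x + sign * t * normal[1]))
                if not (0 <= yi < img.shape[0] and 0 <= xi < img.shape[1]) \
                        or not img[yi, xi]:
                    return t - 1           # last offset still inside the ink
                t += 1
        up, down = edge(+1), edge(-1)
        mid = (y + (up - down) / 2.0 * normal[0],
               x + (up - down) / 2.0 * normal[1])
        return mid, up + down + 1          # medial point, local line width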


Reference: [1] 'Some Experiments in Image Vectorization', J. Jimenez and J.L. Navalon, IBM J. Res. Develop., 1982;
                   and [2] 'A Comparison of Line Thinning Algorithms from Digital Geometry Viewpoint', H. Tamura, 1978 (so old...)


Cooperative text and line-art extraction from a topographic map
Luyang Li; Nagy, G.; Samal, A.; Seth, S.; Yihong Xu;
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on , 20-22 Sept. 1999
Pages:467 - 470

(stop here)


 Arc segmentation from complex line environments-a vector-based stepwise recovery algorithm
Dori, D.; Liu Wenyin;
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on , Volume: 1 , 18-20 Aug. 1997
Pages:76 - 80 vol.1



Sparse pixel tracking: a fast vectorization algorithm applied to engineering drawings

Liu Wenyin; Dori, D.;
Pattern Recognition, 1996., Proceedings of the 13th International Conference on , Volume: 3 , 25-29 Aug. 1996
Pages:808 - 812 vol.3





Other approaches:

26. Example-driven Graphics Recognition
@misc{ evaluation-sparse,
author = "D. Dori and Liu Wenyin",
title = "Sparse Pixel Vectorization: An Algorithm and Its Performance Evaluation",
url = "citeseer.ist.psu.edu/544647.html" }

Discussion/Survey:

24. Some papers from ScienceDirect
ABI: analogy-based indexing for content image retrieval
Context-dependent segmentation and matching in image database
Image retrieval using resegmentation driven by query rectangles
A survey on image-based rendering-representation, sampling and compression




Other:


23. A list of IEEE Xplore papers (search results based on 'vectorization')

 A model-based line detection algorithm in documents
Yefeng Zheng; Huiping Li; Doermann, D.;
Document Analysis and Recognition, 2003. Proceedings. Seventh International Conference on , 3-6 Aug. 2003
Pages:44 - 48 vol.1




Form frame line detection with directional single-connected chain
Yefeng Zheng; Changsong Liu; Xiaoqing Ding; Shiyan Pan;
Document Analysis and Recognition, 2001. Proceedings. Sixth International Conference on , 10-13 Sept. 2001
Pages:699 - 703


 
 Improving arc detection in graphics recognition
Dosch, P.; Masini, G.; Tombre, K.;
Pattern Recognition, 2000. Proceedings. 15th International Conference on , Volume: 2 , 3-7 Sept 2000
Pages:243 - 246 vol.2



 


 An object-oriented progressive-simplification-based vectorization system for engineering drawings: model, algorithm, and performance
Jiqiang Song; Feng Su; Chiew-Lan Tai; Shijie Cai;
Pattern Analysis and Machine Intelligence, IEEE Transactions on , Volume: 24 , Issue: 8 , Aug. 2002
Pages:1048 - 1060

Mingyan: Inspired by the way humans read drawings, the authors use an object-oriented approach to vectorization, instead of the traditional way (first convert the raster image to raw vectors, then recognize objects from these vectors). They define a class hierarchy: line (-> bar, -> arc, or -> curve), symbol, and text. They start with the simple objects, i.e., lines, and delete them in order to recognize the more complicated objects. For each of these classes, they define corresponding algorithms. For lines, they use the 'Line Net Global Vectorization' algorithm.
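The class hierarchy and the progressive-simplification loop might look roughly like this (names and structure are my paraphrase of the paper's design, not its actual code):

    class GraphicObject:            # base class of everything recognized
        pass

    class Line(GraphicObject):      # specialized into bar, arc, and curve
        pass

    class Bar(Line): pass
    class Arc(Line): pass
    class Curve(Line): pass

    class Symbol(GraphicObject): pass
    class Text(GraphicObject): pass

    def vectorize(raster, recognizers):
        """Progressive simplification: recognize the simplest objects
        first, erase their pixels from the raster, then move on to the
        more complex object classes."""
        found = []
        for recognize in recognizers:    # e.g. [find_bars, find_arcs, ...]
            objects = recognize(raster)
            found.extend(objects)
            for obj in objects:
                obj.erase_from(raster)   # hypothetical: remove the object's pixels
        return found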



 Vectorization in graphics recognition: to thin or not to thin
Tombre, K.; Tabbone, S.;
Pattern Recognition, 2000. Proceedings. 15th International Conference on , Volume: 2 , 3-7 Sept 2000
Pages:91 - 96 vol.2

Mingyan: It's a discussion paper about skeletonization.




 Empirical performance evaluation of graphics recognition systems
Phillips, I.T.; Chhabra, A.K.;
Pattern Analysis and Machine Intelligence, IEEE Transactions on , Volume: 21 , Issue: 9 , Sept. 1999
Pages:849 - 870

Mingyan: evaluation

 




 Stepwise segmentation and interpretation of section representations in vectorized drawings
Grabowski, H.; Chenguang Liu; Michelis, A.;
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on , 20-22 Sept. 1999
Pages:677 - 680



 A Hough-based method for hatched pattern detection in maps and diagrams
Llados, J.; Marti, E.; Lopez-Krahe, J.;
Document Analysis and Recognition, 1999. ICDAR '99. Proceedings of the Fifth International Conference on , 20-22 Sept. 1999
Pages:479 - 482



A knowledge-based automated vectorizing system for geographic information system
Kyong-Ho Lee; Sung-Bae Cho; Yoon-Chul Choy;
Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on , Volume: 2 , 16-20 Aug. 1998
Pages:1546 - 1548 vol.2





 

 Variations on the analysis of architectural drawings
Ah-Soon, C.; Tombre, K.;
Document Analysis and Recognition, 1997., Proceedings of the Fourth International Conference on , Volume: 1 , 18-20 Aug. 1997
Pages:347 - 351 vol.1


 



 


 
Efficient analysis of complex diagrams using constraint-based parsing
Futrelle, R.P.; Nikolakis, N.;
Document Analysis and Recognition, 1995., Proceedings of the Third International Conference on , Volume: 2 , 14-16 Aug. 1995
Pages:782 - 790 vol.2



 

  Vision knowledge vectorization: converting raster images into vector form
Jennings, C.; Parker, J.R.;
Pattern Recognition, 1994. Vol. 1 - Conference A: Computer Vision & Image Processing., Proceedings of the 12th IAPR International Conference on , Volume: 1 , 9-13 Oct. 1994
Pages:311 - 315 vol.1


 A thresholding algorithm based on the local information
Fang, M.; Li, L.; Yu, T.;
Robotics and Automation, 1993. Proceedings., 1993 IEEE International Conference on , 2-6 May 1993
Pages:1012 vol.3


 

  Development of a map vectorization method involving a shape reforming process
Tanaka, N.; Kamimura, T.; Tsukumo, J.;
Document Analysis and Recognition, 1993., Proceedings of the Second International Conference on , 20-22 Oct. 1993
Pages:680 - 683




 


 

22.  Neural edge enhancer for supervised edge enhancement from noisy images
Suzuki, K.; Horiba, I.; Sugie, N.;
Pattern Analysis and Machine Intelligence, IEEE Transactions on  ,Volume: 25 , Issue: 12 , Dec. 2003
Pages:1582 - 1596



21. Statistical edge detection: learning and evaluating edge cues
Konishi, S.; Yuille, A.L.; Coughlan, J.M.; Song Chun Zhu;
Pattern Analysis and Machine Intelligence, IEEE Transactions on  ,Volume: 25 , Issue: 1 , Jan. 2003
Pages:57 - 74


 

20.  Segmentation of prostate boundaries from ultrasound images using statistical shape model
Dinggang Shen; Yiqiang Zhan; Davatzikos, C.;
Medical Imaging, IEEE Transactions on  ,Volume: 22 , Issue: 4 , April 2003
Pages:539 - 551

 

19.  Two Bayesian methods for junction classification
Cazorla, M.A.; Escolano, F.;
Image Processing, IEEE Transactions on  ,Volume: 12 , Issue: 3 , March 2003
Pages:317 - 327


18. An evaluation of parallel thinning algorithms for character recognition
Lam, L.; Suen, C.Y.
Page(s): 914-919
PAMI Volume: 17,   Issue: 9,   Year: Sep 1995

17. Thinning methodologies-a comprehensive survey
Pattern Analysis and Machine Intelligence, IEEE Transactions on
Volume: 14,   Issue: 9,   Year: Sep 1992
Lam, L.; Lee, S.-W.; Suen, C.Y.
Page(s): 869-885 (thinningsurvey.pdf)

16. C. Stauffer, W.E.L. Grimson, “Learning patterns of activity using real-time tracking”, IEEE TRANS. PAMI, 22(8):747-757, August 2000

15. A. Tsai, A. Yezzi, W. Wells, C. Tempany, D. Tucker, A. Fan, E. Grimson, A. Willsky, "A shape based approach to curve evolution for segmentation of medical imagery", IEEE Trans. Medical Imaging, 22(2), Feb. 2003

14. Spatial template extraction for image retrieval by region matching
Jun-Wei Hsieh; Grimson, W.E.L.
Page(s): 1404-1415 (spatialTemplate.pdf)
Image Processing, IEEE Transactions on
Volume: 12,   Issue: 11,   Year: Nov. 2003

13. Anti-geometric diffusion for adaptive thresholding and fast segmentation
Manay, S.; Yezzi, A.
Page(s): 1310-1323 (anisotropic diffusion.pdf)
Image Processing, IEEE Transactions on
Volume: 12,   Issue: 11,   Year: Nov. 2003

12. Performance assessment of feature detection algorithms: a methodology and case study on corner detectors
Rockett, P.I.
Page(s): 1668-1676 (features.pdf)
Image Processing, IEEE Transactions on
Volume: 12,   Issue: 12,   Year: Dec. 2003

--- some surveys ---
11. Thinning Methodologies - A Comprehensive Survey
10. Document Representations
9. Twenty Years of Document Image Analysis in PAMI
8. On-line Graphics Recognition: A Brief Survey
-----------------
 
7. Stable and Robust Vectorization: How to make the right choices

6. Improving the Accuracy of Skeleton-based vectorization

5. Detection and Enhancement of Line Structures in an Image by Anisotropic Diffusion

4. A survey of moment-based techniques for unoccluded object representation and recognition (not yet)

3. On Image Analysis by the Methods of Moments

2. Moment Analysis

1. Adaptable Vectorization System based on Strategic Knowledge and XML representation use
 
  

7. Stable and Robust Vectorization: How to make the right choices  
Info of Paper
    GREC'99
    Tombre, Karl
    Ah-Soon, Christian
    Dosch, Philippe
    Springer LNCS 1941, pp. 3-18, 2000
Abstract
    In this paper, we discuss the elements to be taken into account when choosing one's vectorization method. The paper is extensively based on our own implementations and tests, and concentrates on methods designed to have few, if any, parameters.

Summary
    An ideal vectorization system should be sufficiently stable and robust. One important factor for robustness is minimizing the number of parameters and thresholds needed in the vectorization process. The authors work on an approach combining several methods, each of which has no or very few parameters.

    Four steps are involved in vectorization (a pipeline sketch follows the list):
    1. First, find the lines in the original raster image. Whereas the most common approach for this is to compute the skeleton of the image, a number of other methods have been proposed.
    2. Next, approximate the lines found into a set of vectors. This is performed by some polygonal approximation method, and there are many around, with different approximation criteria.
    3. It is necessary to perform some post-processing: find better positions for the junction points, merge some vectors and remove some others, etc.
    4. Find the circular arcs. This step is not explained in this paper.
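Written as a pipeline, the four steps could look like this (a sketch only; the stage functions are placeholders passed in by the caller, not names from the paper):

    def vectorize(raster, find_lines, polygonal_approx, postprocess, find_arcs):
        """The four-step vectorization pipeline, with each stage pluggable."""
        lines = find_lines(raster)          # 1. raw lines, e.g. via the skeleton
        vectors = polygonal_approx(lines)   # 2. lines -> polygonal segments
        vectors = postprocess(vectors)      # 3. fix junctions, merge/remove vectors
        return vectors, find_arcs(vectors)  # 4. circular arcs (not detailed)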

When finding lines, several approaches are used:
   One method is to compute the medial axis, i.e., skeletonization.
   The second method is based on matching the opposite sides of the line. This method is better at positioning the junction points, but tends to rely too much on heuristics and thresholds when the drawing becomes complex.
   Some sparse-pixel approaches are also used in this paper. The general idea is not to examine all the pixels in the image, but to use appropriate sub-sampling methods which give a broader view of the line.

From lines to segments
   If simplicity of the resulting set of vectors is important, the best choice is probably an iterative method. It will give a number of segments closest to the number in the original drawing. However, it is not optimal with respect to the positioning of these segments.
   If precision is the most important criterion, Rosin & West's method seems to be a good choice. It also does not require any explicit threshold or parameter.
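Rosin & West's parameter-free method is not reproduced here; as a generic illustration of split-style polygonal approximation, here is a minimal recursive-split (Ramer-Douglas-Peucker) sketch, which, unlike Rosin & West, does need an explicit tolerance:

    import math

    def rdp(points, tol):
        """Recursive split: keep the point farthest from the chord if its
        distance exceeds tol, and recurse on both halves."""
        (x0, y0), (x1, y1) = points[0], points[-1]
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy) or 1.0
        # perpendicular distance of each interior point to the chord
        dists = [abs(dy * (x - x0) - dx * (y - y0)) / norm
                 for x, y in points[1:-1]]
        if not dists or max(dists) <= tol:
            return [points[0], points[-1]]
        k = dists.index(max(dists)) + 1
        return rdp(points[:k + 1], tol)[:-1] + rdp(points[k:], tol)

    print(rdp([(0, 0), (1, 0.1), (2, -0.1), (3, 5), (4, 6), (5, 7)], 1.0))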

Post-processing


It is difficult to provide a universal measure for assessing the performance of vectorization; the authors believe that the elements of choice given here can be complementary to statistical performance evaluation processes.




 6. Improving the Accuracy of Skeleton-Based Vectorization
Info of Paper
        GREC2002
        Hilaire, Xavier
        Tombre, Karl
        Springer LNCS 2390, pp. 273-288, 2002
Abstract
Summary


5. Detection and Enhancement of Line Structures in an Image by Anisotropic Diffusion
 

Info of Paper

             Lecture Notes in Computer Science

            Publisher: Springer-Verlag Heidelberg
            ISSN: 0302-9743
            Volume: 2059 / 2001
            Date: January 2001
            Page: 313
            In: Visual Form 2001: 4th International Workshop on Visual Form, IWVF4, Capri, Italy, May 28-30, 2001, Proceedings
            Authors: Koichiro Deguchi, Tadahiro Izumitani, Hidekata Hontani
            Editors: C. Arcelli, L.P. Cordella, G. Sanniti di Baja

Abstract:

This paper describes a method to enhance line structures in a gray-level image. For this purpose, we blur the image using anisotropic Gaussian filters along the directions of the line structures. In a line-structure region, the gradients of the image gray levels have a uniform direction. To find such line structures, we evaluate the uniformity of the directions of the local gradients. Before this evaluation, we need to smooth out small structures to obtain line directions. We first blur the given image with a set of Gaussian filters; the variance of the Gaussian filter which maximizes the uniformity of the local gradient directions is detected position by position. Then, the line directions in the image are obtained from this blurred image. Finally, we blur the image using the anisotropic filter again along those directions, and enhance every line structure.
Keywords: Line structure enhancement, multi-resolution analysis, anisotropic filter, structure tensor
Summary:
An image may have both local and global structure; this paper focuses on global structure. To enhance the global line structure of a gray-level image, one needs some technique to ignore the small local structures. A Gaussian filter is often used to blur away such small structures, but if the local lines are not isolated in the image, the result of filtering is not good, because neighbouring structures influence each other. If some way can be found to diffuse the local lines only along the direction of the line structure, the global result may be better.
Some research shows that a Gaussian filter can have different sizes along different directions, i.e., the shape of the Gaussian filter can be anisotropic. So one can smooth out the lines if the direction of the line structure is known.
This paper proposes a method to determine the proper parameters of the Gaussian filter so as to smooth out only small local structures and enhance global line structures, adaptively to a given image and to each position in the image. The approach can be described in several steps:
1. Multi-resolution image analysis
    This step is a kind of preprocessing. By viewing the image at various resolutions, they can find a critical moment at which the global line structure appears most clearly (if the purpose is to recognize the global shape of the figure). They need to find the factor t in u(x,y,t) which gives this critical moment.
2. Evaluation of line-likeness
    In the neighborhood of a line structure, the gradients of the image gray level f(x,y) have the same direction, toward the center of the line. So, if the gradients of the gray levels have equal direction in a small neighboring region, the image is defined to have a line structure at that point. Line-likeness is defined by how similar the directions of the gradients of u(x,y,t) are in the neighborhood of the image point. (A sketch of this computation follows the list.)
    Gradient space, the structure-analysis tensor, eigenvectors, and eigenvalues are introduced. The line-likeness S(x,y) at a position (x,y) can be calculated, and its value lies between 0 and 1. If S(x,y) is close to 1, the gray level around (x,y) has a line-like structure; if S(x,y) is close to 0, it does not.
3. Multi-scale evaluation
    There is a parameter p in S(x,y), and different p leads to different S(x,y). They detect the global line structure of the original image by blurring it with the p/2 that makes S(x,y) maximal.
4. Anisotropic diffusion to enhance line structure
    The previous steps help to find a global line structure, but the image is a blurred one and the detected line structure is faded. This step smooths out gray-level changes only within the line structure and enhances it with clear contour edges. To blur an image only along a specific direction, anisotropic diffusion has been proposed. This paper proposes determining a suitable diffusion tensor to enhance the line structure by using the line-structure evaluation S(x,y).
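A minimal sketch of the line-likeness evaluation via the structure tensor (my reading of step 2; the smoothing scales are arbitrary placeholders for the paper's t and p): S is computed from the eigenvalues of the window-averaged outer product of the gradient, and is close to 1 where the local gradients share one direction.

    import numpy as np
    from scipy import ndimage

    def line_likeness(img, sigma_grad=1.0, sigma_win=3.0, eps=1e-9):
        """Line-likeness S(x, y) in [0, 1] for a 2-D float image, from the
        structure-tensor eigenvalues: S = (l1 - l2) / (l1 + l2), which is
        large where the local gradients are parallel (a line structure)."""
        gy = ndimage.gaussian_filter(img, sigma_grad, order=(1, 0))
        gx = ndimage.gaussian_filter(img, sigma_grad, order=(0, 1))
        # window-averaged structure tensor components
        jxx = ndimage.gaussian_filter(gx * gx, sigma_win)
        jyy = ndimage.gaussian_filter(gy * gy, sigma_win)
        jxy = ndimage.gaussian_filter(gx * gy, sigma_win)
        trace = jxx + jyy
        diff = np.sqrt((jxx - jyy) ** 2 + 4 * jxy ** 2)  # l1 - l2
        return diff / (trace + eps)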

This paper proposed a technique to emphasize the global line structure of a gray-level image. First, by changing resolution, they obtain the proper resolution to make the global line structure clear. Then they get the direction of the line and smooth out only along this direction. The global line structures are thus enhanced.
 

4. A survey of moment-based techniques for unoccluded object representation and recognition

 

Info of Paper
            R. J. Prokop and A. P. Reeves.
            CVGIP: Graphical Models and Image Processing.
            54(5):438-460, 1992.


 

Coming soon.
 

3. On Image Analysis by the Methods of Moments
Info of paper
            On Image Analysis by the Methods of Moments
            Cho-Huak Teh and Roland T. Chin
            IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 10, No. 4, 1988

Summary
    This paper discusses several moment-analysis approaches in image processing and compares them. Over the past decades, researchers have proposed several kinds of moments, for example, regular moments (geometric), Zernike moments (orthogonal), Legendre moments (orthogonal), pseudo-Zernike moments (orthogonal), complex moments, etc. The paper discusses their sensitivity to image noise, their information redundancy, and their capability for image representation.
    For the noise analysis, it shows that higher-order moments are more vulnerable to noise, and that the number of coefficients (and hence the set of moments up to a certain order) for optimal image representation can be determined under a given noise condition. In terms of information redundancy, the orthogonal moments (i.e., Legendre, Zernike, and pseudo-Zernike) are better than the other types of moments. In terms of overall performance, Zernike and pseudo-Zernike moments outperform the others.
 

My view of this paper
    Since our diagrams do not have much noise, regular moments may be just fine for our work.
 
 

Not a paper, just something from the web about Moment Analysis
    Hu (M.K. Hu) proved that it is possible to completely represent an image by its moments, and researchers have found that a small number of moments, e.g., the first 30 or so, suffice to describe an object with useful accuracy.
    In a typical imaging application, segmentation is first performed to generate a binary image, with object (foreground) pixels labeled as such and non-object (background) pixels set to some appropriate different value, for example 0. The moments of the foreground pixels are then computed and used to characterize the object in each image. Each image can thus be represented as a FEATURE VECTOR comprising all the moments up to some order. Comparison of images, and thus of the objects they contain, is reduced to a numerical measure of distance between the corresponding feature vectors.
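A minimal sketch of that pipeline, using scale-normalized central geometric (regular) moments up to order 3 as the feature vector (the normalization and the demo shapes are my choices):

    import numpy as np

    def moment_features(binary, max_order=3):
        """Feature vector of scale-normalized central moments mu_pq of the
        foreground (True) pixels of a binary image."""
        ys, xs = np.nonzero(binary)
        m00 = float(len(xs))
        xbar, ybar = xs.mean(), ys.mean()
        feats = []
        for p in range(max_order + 1):
            for q in range(max_order + 1 - p):
                if p + q < 2:
                    continue           # mu00, mu10, mu01 carry no shape info
                mu = ((xs - xbar) ** p * (ys - ybar) ** q).sum()
                feats.append(mu / m00 ** (1 + (p + q) / 2.0))
        return np.array(feats)

    # Compare two shapes by the distance between their feature vectors.
    a = np.zeros((32, 32), bool); a[8:24, 8:24] = True    # square
    b = np.zeros((32, 32), bool); b[4:28, 10:22] = True   # tall rectangle
    print(np.linalg.norm(moment_features(a) - moment_features(b)))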
 
 
 
 
 

1. Adaptable Vectorization System Based on Strategic Knowledge and XML Representation Use

Info of this paper

Delalandre Mathieu, Saidali Youssouf, Ogier Jean-Marc, Trupin Eric

PAI lab, Univ. of Rouen, France

L3I lab, Univ. of La Rochelle, France

GREC'03

1. Summary

This paper presents a vectorization system. The system has two parts: a processing library and a graphical user interface. The processing library includes image pre-processing and vectorization tools. The pre-processing tools deal with noisy images. The vectorization tools are of high granularity level in order to use them in a strategic approach. The GUI allows constructing and executing different vectorization scenarios. This makes it easy to test different strategies according to the recognition goals, and to adapt the system to new applications. XML is used to represent data for data manipulation.

Vectorization is a complex process that may rely on many different methods. Some methods first extract object graphs and then transform the object lists into mathematical object lists. Other methods perform vectorization directly.

Vectorization systems basically use two types of knowledge: descriptive knowledge and strategic knowledge. The first concerns the objects in documents and the relations between them. The second concerns the image-processing tools used to construct the objects and the chaining relations between these tools. In this paper, a vectorization system based on strategic knowledge is implemented.

This paper explains the system in the following sequence: processing library, GUI, and XML. This summary follows that sequence.

Image processing library (image pre-processing and vectorization)

Image pre-processing

This step is designed to deal with noisy images, and there are several ways to implement it. This paper does it as follows: First, they use gray-level filtering methods, like the median filter and the mean filter, on scanned images. Second, they binarize these images. Then they reduce noise on the obtained binary images, using two methods: the first is based on a blob-coloring algorithm which uses an automatic or pre-defined user surface threshold; the second uses mathematical morphology operations like dilation, erosion, opening, and closing. Finally, they use distance-computation functions between images to test the pre-processing scenarios. See Fig. 1 for an example.
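A rough sketch of such a pre-processing chain with standard tools (the filter sizes and thresholds are arbitrary, and a connected-component size filter stands in for the paper's blob-coloring step):

    import numpy as np
    from scipy import ndimage

    def preprocess(gray, median_size=3, threshold=128, min_blob=10):
        """Gray-level filtering -> binarization -> binary noise reduction."""
        smoothed = ndimage.median_filter(gray, size=median_size)
        binary = smoothed < threshold            # ink is dark
        # drop small connected components (stand-in for blob coloring)
        labels, n = ndimage.label(binary)
        sizes = ndimage.sum(binary, labels, range(1, n + 1))
        keep = np.isin(labels, 1 + np.flatnonzero(sizes >= min_blob))
        # morphological closing to fill pinholes in the remaining strokes
        return ndimage.binary_closing(keep)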
 

Vectorization

The vectorization processing is based on various approaches (skeletonisation, contouring, region and run decomposition, direct vectorization). They decompose the classical vectorization chain into granular processings, to use them within a strategic approach, at three levels: the image level, the structured-data level, and the boundary level, which lies between the image data and the structured data.

Vectorization's Image Level

In this level, they use two classical image processing methods: contouring and skeletonisation.

Vectorization's Boundary Level

They use six different methods to extract structured data from images. Using Direct Contouring, they extract the internal and external contours of shapes and construct them into chains. By searching the contour chains, the inclusion relations are extracted. This method gives global/local descriptions of the image's shapes. The Direct Vectorization method works like this: after finding an entry point, a point element advances from this entry point along the line's middle, according to contour following; the displacement length is proportional to the line's thickness. The Run Decomposition method first divides images into runs. The runs are organized into run graphs, either horizontal or vertical, and for each of these runs the contours and skeleton are extracted. The Region Decomposition method is based on wave aggregation. It first analyzes the image to find an entry point, then searches the neighboring points and labels, aggregates, and stores these neighbors into a wave object. Successively, the previous waves are used for the new aggregation processes. Where a wave breaks or stops, the region boundaries are defined; the boundaries are then used to create entry waves for the new region search. Examples of these four methods are shown in Fig. 3.
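The run-decomposition idea, for the horizontal case, can be sketched in a few lines (my illustration, not the paper's code):

    import numpy as np

    def horizontal_runs(binary):
        """Decompose a binary image into horizontal runs: one (row, start,
        end) triple per maximal run of foreground pixels. Building the run
        graph would then link runs that overlap in adjacent rows."""
        runs = []
        for y, row in enumerate(binary):
            padded = np.concatenate(([0], row.astype(np.int8), [0]))
            d = np.diff(padded)
            starts = np.flatnonzero(d == 1)      # 0 -> 1 transitions
            ends = np.flatnonzero(d == -1)       # 1 -> 0 transitions
            runs.extend((y, s, e - 1) for s, e in zip(starts, ends))
        return runs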

Vectorization's Structured Data Level

The structured-data level is the central part of a vectorization scenario. Its goal is to add semantic information to the basic graphs obtained by the boundary-level processing. One way to do this is list processing, which can be used for interiority-degree segmentation and for mathematical approximation. For the interiority-degree segmentation, a thickness segmentation threshold is applied, based on a simple test of the thickness variation. Information on the pixels' interiority degree is obtained by successive calls to the skeletonisation tools.

GUI

The GUI is used for strategic knowledge acquisition and operation. Users can construct scenarios according to the document context, for the purpose of document image recognition. After the user defines some contexts for the analyzed images, a set of processings is proposed; each processing represents a scenario stage. The user oversees the process and can at any time return to any previous stage in order to modify the parameters, change the processing stage, seek help, or display some examples. Users can also save scenario examples in the scenario base, and can search the base with two tools: a query language or a graph-matching tool. See Fig. 7.

XML

XML is used for better knowledge representation. In the processing library, XML is used for the structured data output of the processings, while the GUI uses it to store scenarios in an XML base.
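For instance, the structured output of a vectorization stage could be serialized along these lines (the element and attribute names are my guesses, not the paper's actual schema):

    import xml.etree.ElementTree as ET

    # (p1, p2, width) per segment -- toy data
    segments = [((10, 10), (80, 12), 3), ((80, 12), (80, 60), 3)]
    root = ET.Element("vectorization")
    for (x1, y1), (x2, y2), w in segments:
        ET.SubElement(root, "segment", x1=str(x1), y1=str(y1),
                      x2=str(x2), y2=str(y2), width=str(w))
    print(ET.tostring(root, encoding="unicode"))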

2. My view of this paper

This paper does a good job on a vectorization system in which many related technologies are applied. They pre-process the image to reduce noise, decompose the vectorization chain into three levels so it can be used with semantic information and strategic approaches, and implement a GUI that makes the system easy to use.

My question about this paper concerns the strategic approach, described in part four. The authors say it is a highlight of this paper. It seems to me that it focuses on the GUI, not on the idea of how to do the vectorization. So I don't think it is technically important, if I understand the paper correctly.

3. Its possible contribution to our research

It provides some different approaches for several stages of vectorization, though many of them are standard methods. We can consult these methods when we work on SPV. They also provide good references.