Text versus non-Text Distinction in Online Handwritten Documents

Emanuel Indermühle, Horst Bunke, Faisal Shafait, Thomas Breuel
25th ACM Symposium On Applied Computing, Document Engineering Track, Sierre, Switzerland, ACM, 3/2010

Abstract:

The aim of this paper is to explore how well the task of text vs. non-text distinction can be solved in online handwritten documents using only offline information. Two systems are introduced. The first system begins by segmenting the document. For this purpose, four methods originally developed for machine-printed documents are compared: x-y cut, morphological closing, Voronoi segmentation, and whitespace analysis. A state-of-the-art classifier then distinguishes between text and non-text zones. The second system follows a bottom-up approach that classifies connected components. Experiments are performed on a new dataset of online handwritten documents containing different content types in arbitrary arrangements. The best system assigns 94.3% of the pixels to the correct class.
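
The abstract reports its result as the share of pixels assigned to the correct class. As a rough illustration of how such a pixel-level score could be computed, here is a minimal sketch. It is not the evaluation code from the paper: the integer label encoding (TEXT, NON_TEXT, BACKGROUND), the function name pixel_accuracy, and the choice to ignore background pixels are all assumptions made for this example.

```python
# Hypothetical label encoding, chosen for this sketch only.
BACKGROUND, TEXT, NON_TEXT = 0, 1, 2


def pixel_accuracy(predicted, ground_truth):
    """Return the fraction of foreground pixels whose predicted label
    matches the ground truth. Both arguments are equally sized 2D lists
    of integer labels. Background pixels in the ground truth are skipped
    (an assumption; the paper's exact protocol is not stated here)."""
    if len(predicted) != len(ground_truth):
        raise ValueError("label maps must have the same number of rows")

    correct = 0
    total = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        if len(pred_row) != len(true_row):
            raise ValueError("label maps must have the same row length")
        for pred_label, true_label in zip(pred_row, true_row):
            if true_label == BACKGROUND:
                continue  # only ink pixels count towards the score
            total += 1
            if pred_label == true_label:
                correct += 1

    if total == 0:
        raise ValueError("ground truth contains no foreground pixels")
    return correct / total
```

For example, with ground truth [[1, 1, 0], [2, 2, 0]] and prediction [[1, 2, 0], [2, 2, 1]], three of the four foreground pixels match, so the function returns 0.75.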

Files:

  Indermuehle-Online-Handwritten-Document-Segmentation-SAC10.pdf

BibTeX:

@inproceedings{ INDE2010,
	Title = {Text versus non-Text Distinction in Online Handwritten Documents},
	Author = {Emanuel Indermühle and Horst Bunke and Faisal Shafait and Thomas Breuel},
	BookTitle = {25th ACM Symposium On Applied Computing, Document Engineering Track},
	Month = {3},
	Year = {2010},
	Publisher = {ACM}
}

     
Last modified: 30.08.2016