Performance Comparison of Six Algorithms for Page Segmentation

Faisal Shafait, Daniel Keysers, Thomas Breuel
In: Horst Bunke, A. Lawrence Spitz (eds.) 7th IAPR Workshop on Document Analysis Systems (DAS), LNCS volume 3872, pages 368-379, Springer, Nelson, New Zealand, 2/2006

Abstract:

This paper presents a quantitative comparison of six algorithms for page segmentation: X-Y cut, smearing, whitespace analysis, constrained text-line finding, Docstrum, and Voronoi-diagram-based. The evaluation is performed using a subset of the UW-III collection commonly used for evaluation, with a separate training set for parameter optimization. We compare the results using both default parameters and optimized parameters. In the course of the evaluation, the strengths and weaknesses of each algorithm are analyzed, and it is shown that no single algorithm outperforms all other algorithms. However, we observe that the three best-performing algorithms are those based on constrained text-line finding, Docstrum, and the Voronoi diagram.

Files:

  FsDkTmbPerfComp6AlgDAS2006.pdf

BibTeX:

@inproceedings{ SHAF2006,
	Title = {Performance Comparison of Six Algorithms for Page Segmentation},
	Author = {Faisal Shafait and Daniel Keysers and Thomas Breuel},
	Editor = {Horst Bunke and A. Lawrence Spitz},
	BookTitle = {7th IAPR Workshop on Document Analysis Systems (DAS)},
	Month = {2},
	Year = {2006},
	Series = {LNCS},
	Publisher = {Springer},
	Volume = {3872},
	Pages = {368-379},
	Address = {Nelson, New Zealand}
}

     
Last modified: 30.08.2016