Here we provide demo results of our page dewarping algorithm. It is
based on the assumption that the original page contained only straight
text lines that were approximately equally spaced and sized.
This is often true for book pages.
For details, see the publication:
Adrian Ulges, Christoph H. Lampert, Thomas M. Breuel: Document Image Dewarping
using Robust Estimation of Curled Text Lines, International Conference on
Document Analysis and Recognition (ICDAR), pages 1001-1005, 2005.
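The equal-spacing assumption stated above can be illustrated with a minimal sketch. This is not the method from the publication; the function name `roughly_equally_spaced`, the `tolerance` parameter, and the use of baseline y-coordinates in pixels are all assumptions made here purely for illustration.

```python
from statistics import median


def roughly_equally_spaced(baselines, tolerance=0.25):
    """Hypothetical check: are text-line baselines about equally spaced?

    `baselines` holds one y-coordinate (in pixels) per detected text line.
    `tolerance` is the allowed relative deviation of each gap from the
    median gap; 0.25 is an arbitrary illustrative value.
    """
    ys = sorted(baselines)
    if len(ys) < 3:
        # Fewer than two gaps, so there is nothing to compare.
        return True
    # Vertical distance between each pair of neighbouring lines.
    gaps = [lower - upper for upper, lower in zip(ys, ys[1:])]
    typical = median(gaps)
    if typical <= 0:
        # Coinciding baselines cannot be regular line spacing.
        return False
    return all(abs(gap - typical) <= tolerance * typical for gap in gaps)
```

A page of body text with baselines at 100, 140, 181, 220 and 260 pixels would pass such a check, while a page where a headline or a paragraph break leaves a much larger gap would not.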
Note that the algorithm has several limitations:
- it does not work if the assumption does not hold for (parts of) the
document image, e.g. for headlines or paragraphs separated by extra spacing
- we (try to) use only the largest block of text within the image
- we assume that the image is given in the correct orientation
- the maximum angle by which any part of a text line can deviate from the
horizontal is currently set to 0.5 radians (about 29 degrees)
- if there are large spaces between words, the line tracker sometimes
tries to make two lines out of one; this results in visually strong