Document Image Analysis

The focus of our research in Document Image Analysis is on core document security. We have also acquired considerable expertise in geometric and logical layout analysis. For more detail about our research, please visit our publications page. The demonstrators on this page show some of the techniques we have developed, applied to real-world problems.

Document Security

Document security is becoming increasingly important in everyday life. The widespread availability of scanners and printers allows even untrained people to forge and alter documents easily. This demo applies automatic line orientation measurement to document security.


OCRopus™ is a state-of-the-art document analysis and OCR system, featuring pluggable layout analysis, pluggable character recognition, statistical natural language modeling, and multilingual capabilities. The system is being developed with generous support from Google and other organizations; the primary developers are at the IUPR Research Group.

Layout Analysis

Layout analysis remains a significant performance-limiting step in OCR and document analysis systems. To make progress in this area, we have developed a set of benchmarking tools and tasks and applied them to the evaluation of different document layout analysis methods.

Page Dewarping

Digital cameras offer a fast, flexible, cheap, and widespread alternative for the capture of documents. Unfortunately, besides illumination and resolution problems, the acquired document images suffer from distortion. While conventional dewarping approaches try to estimate the shape of the document surface and thus demand a complicated calibration process, we developed a flexible snapshot-only approach.

Document Rectification

The OSCAR demonstration system was created to demonstrate document rectification. Given a document image, either supplied to the system in electronic form or captured on the fly with a camera, it finds the best planar document and dewarps it, keeping only the relevant document information and removing the surrounding background.

Document Image Reflow for Handheld Devices

When viewing a document image on a handheld device with a small screen, scrolling through the document becomes a major accessibility problem. The document reflowing capability of OCRopus rearranges the words in the document so that it can be viewed without horizontal scrolling.
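
The basic idea of reflowing can be illustrated with a minimal sketch. It assumes the words have already been segmented into boxes of known pixel size and packs them greedily into rows no wider than the screen. This is not the OCRopus implementation; the names WordBox and reflow, and the gap parameters, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class WordBox:
    text: str    # label for the word image (illustration only)
    width: int   # pixel width of the word image
    height: int  # pixel height of the word image


def reflow(words, screen_width, gap=4, line_gap=6):
    """Greedily place word boxes left to right, wrapping at screen_width.

    Returns a list of (text, x, y) positions in reading order.
    Hypothetical helper, not part of OCRopus.
    """
    placed = []
    x, y, row_height = 0, 0, 0
    for w in words:
        # Start a new row if this word would cross the right edge.
        if x > 0 and x + w.width > screen_width:
            x = 0
            y += row_height + line_gap
            row_height = 0
        placed.append((w.text, x, y))
        x += w.width + gap
        row_height = max(row_height, w.height)
    return placed


words = [
    WordBox("Document", 90, 20),
    WordBox("image", 50, 20),
    WordBox("reflow", 60, 22),
    WordBox("demo", 45, 20),
]
layout = reflow(words, screen_width=160)
for text, x, y in layout:
    print(text, x, y)
```

A word wider than the screen is still placed alone on its own row in this sketch; a real system would have to scale or split such words.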

Historical Document Analysis

The Bruno Project is developing tools for the automatic, image-based comparison of historical documents. The goal is to make it easy for philologists to create critical editions of documents whose quality is too low for OCR techniques to be applied.

Last modified: 30.11.2009