Bibliographic Meta-Data Extraction Using Probabilistic Finite State Transducers

Martin Krämer, Hagen Kaprykowsky, Daniel Keysers, Thomas Breuel
Proceedings of the 9th Conference on Document Analysis and Recognition (ICDAR-2007), September 23-26, Curitiba, Brazil, volume 2, pages 609-613, IEEE, 9/2007

Abstract:

We present the application of probabilistic finite state transducers to the task of bibliographic meta-data extraction from scientific references. By using the transducer approach, which is often applied successfully in computational linguistics, we obtain a trainable and modular framework. This results in simplicity, flexibility, and easy adaptability to changing requirements. An evaluation on the Cora dataset, which serves as a common benchmark for accuracy measurements, yields a word accuracy of 88.5%, a field accuracy of 82.6%, and an instance accuracy of 42.7%. Based on a comparison to other published results, we conclude that our system, despite its conceptually simple approach and implementation, performs second best on this dataset.
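
The three accuracy figures above measure correctness at different granularities. The sketch below shows one plausible way to compute them from token-level labels. The definitions are assumptions, not taken from the paper: word accuracy is taken as the fraction of correctly labeled tokens, field accuracy as the fraction of reference fields (contiguous runs of one label) recovered with the exact span and label, and instance accuracy as the fraction of references labeled without any error. The function names `fields` and `accuracies` are hypothetical.

```python
from itertools import groupby


def fields(labels):
    """Return the set of (label, start, end) runs of identical labels."""
    spans, pos = set(), 0
    for label, run in groupby(labels):
        length = len(list(run))
        spans.add((label, pos, pos + length))
        pos += length
    return spans


def accuracies(gold, pred):
    """Compute word, field, and instance accuracy (assumed definitions).

    gold, pred: lists of per-reference label sequences; each pair of
    sequences must have the same length.
    """
    total_words = correct_words = 0
    total_fields = correct_fields = 0
    correct_instances = 0
    for g, p in zip(gold, pred):
        if len(g) != len(p):
            raise ValueError("label sequences must have equal length")
        total_words += len(g)
        correct_words += sum(a == b for a, b in zip(g, p))
        gold_fields = fields(g)
        total_fields += len(gold_fields)
        # A field counts as correct only if label and boundaries both match.
        correct_fields += len(gold_fields & fields(p))
        correct_instances += int(g == p)
    return {
        "word": correct_words / total_words,
        "field": correct_fields / total_fields,
        "instance": correct_instances / len(gold),
    }
```

Under these assumed definitions a single mislabeled token costs one word, can break two adjacent fields, and invalidates the whole instance, which is consistent with instance accuracy being far lower than word accuracy in the figures reported above.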

Files:

  BibMetaDataExtMkHkDkTmb.pdf

BibTex:

@inproceedings{ KRä2007,
	Title = {Bibliographic Meta-Data Extraction Using Probabilistic Finite State Transducers},
	Author = {Martin Krämer and Hagen Kaprykowsky and Daniel Keysers and Thomas Breuel},
	BookTitle = {Proceedings of the 9th Conference on Document Analysis and Recognition (ICDAR-2007), September 23-26, Curitiba, Brazil},
	Month = {9},
	Year = {2007},
	Publisher = {IEEE},
	Volume = {2},
	Pages = {609-613}
}

     
Last modified: 30.08.2016