Historical Document Image Analysis and Recognition

Document Image Analysis and Recognition (DIAR) is an important field within pattern recognition and computer vision. Its aim is the automatic analysis of the contents of document images (printed or handwritten, textual or graphical), with a view to recognizing and understanding them. Thanks to recent advances in the field, DIAR has become a fundamental technology for extracting information from document collections, and so helps in the preservation, access and indexing of cultural heritage. This lecture will give an overview of the typical DIAR processes, such as document enhancement, layout analysis, Optical Character Recognition (OCR), handwriting recognition, keyword spotting and writer identification. It will also show examples of DIAR techniques applied to several Digital Humanities projects.