Contact Author

Current Version

Example images from the NEOCR dataset. Note that the dataset also includes images with text in different languages, text with vertical character arrangement, light text on dark and dark text on light backgrounds, occlusion, and good and bad contrast.

Example of different text characteristics present in images of the NEOCR dataset, along with ground truth bounding boxes and distortion quadrangles.

Keywords

Description

The NEOCR dataset contains 659 real-world images with 5238 annotated bounding boxes (textfields). The images were taken by several people independently of the dataset, so it covers a broad range of the characteristics that distinguish real-world images from scanned documents. All text recognizable by humans has been annotated in every image. Dataset creation was stopped once each metadata dimension was represented by at least 100 textfields.

The ground truth contains not only the visible text, but also distortion quadrangles, which enclose the visible text much more precisely than bounding boxes. The dataset is enriched with metadata consisting of brightness, contrast, inversion, texture, resolution, noise, blur, distortion, rotation, character arrangement, occlusion, typeface and language information. The annotation is provided in XML based on the schema of LabelMe.
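To illustrate why a distortion quadrangle encloses text more tightly than a bounding box, the following minimal sketch compares the two for a single textfield. The corner coordinates are made up for illustration and are not taken from the dataset; the function names are likewise hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(points: List[Point]) -> Tuple[float, float, float, float]:
    """Return the axis-aligned box (x_min, y_min, x_max, y_max) around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


def polygon_area(points: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def fill_ratio(quad: List[Point]) -> float:
    """Fraction of the bounding box that the quadrangle actually covers."""
    x_min, y_min, x_max, y_max = bounding_box(quad)
    box_area = (x_max - x_min) * (y_max - y_min)
    return polygon_area(quad) / box_area if box_area > 0 else 0.0


# Hypothetical quadrangle around a slightly rotated line of text.
quad = [(10.0, 20.0), (110.0, 40.0), (105.0, 70.0), (5.0, 50.0)]
print(bounding_box(quad))
print(round(fill_ratio(quad), 3))
```

A ratio well below 1 means that much of the bounding box is background rather than text, which is the case the quadrangle annotation is meant to handle.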

Metadata and Ground Truth Data

The annotation was created manually with an adaptation of the LabelMe annotation tool. All text visible and recognizable by humans has been annotated in every image. The annotation is provided in XML; the LabelMe schema was extended to meet our needs. The extended XML schema is also provided as part of the dataset. Metadata is provided both globally and locally.
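As a rough sketch of how such an annotation could be read, the snippet below parses a small LabelMe-style XML fragment with Python's standard library. It uses only the basic LabelMe elements (`annotation`, `filename`, `object`, `name`, `polygon`, `pt`, `x`, `y`); the sample content is invented, and the element names added by the NEOCR extension are not shown here and should be looked up in the XML schema shipped with the dataset.

```python
import xml.etree.ElementTree as ET

# Invented sample in the basic LabelMe layout; not an actual NEOCR file.
SAMPLE = """<annotation>
  <filename>example.jpg</filename>
  <object>
    <name>EXIT</name>
    <polygon>
      <pt><x>10</x><y>20</y></pt>
      <pt><x>110</x><y>40</y></pt>
      <pt><x>105</x><y>70</y></pt>
      <pt><x>5</x><y>50</y></pt>
    </polygon>
  </object>
</annotation>"""


def read_textfields(xml_text):
    """Return one dict per <object>, with its label and polygon points."""
    root = ET.fromstring(xml_text)
    fields = []
    for obj in root.iter("object"):
        label = (obj.findtext("name") or "").strip()
        points = [
            (float(pt.findtext("x")), float(pt.findtext("y")))
            for pt in obj.iter("pt")
        ]
        fields.append({"label": label, "points": points})
    return fields


fields = read_textfields(SAMPLE)
for field in fields:
    print(field["label"], field["points"])
```

To read a file from disk instead of a string, `ET.parse(path).getroot()` can replace `ET.fromstring(xml_text)`.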

Submitted Files

Disclaimer

By downloading and using the dataset, you agree to acknowledge its source and cite the above papers in related publications. Please link to the authors' Web page for the dataset at http://www6.cs.fau.de/neocr.