Abstract

Manga, or comics, which are a type of multimodal artwork, have been left behind in the recent trend of deep learning applications because of the lack of a proper dataset. Hence, we built Manga109, a dataset consisting of a variety of 109 Japanese comic books (94 authors and 21,142 pages) and made it publicly available by obtaining author permissions for academic use. We carefully annotated the frames, speech texts, character faces, and character bodies; the total number of annotations exceeds 500k. This dataset provides numerous manga images and annotations, which will be beneficial for use in machine learning algorithms and their evaluation. In addition to academic use, we obtained further permission for a subset of the dataset for industrial use. In this article, we describe the details of the dataset and present a few examples of multimedia processing applications (detection, retrieval, and generation) that apply existing deep learning methods and are made possible by the dataset.