Image Captioning in News Photography

Describing the contents of an image in natural language is a central problem in both Computer Vision and Natural Language Processing (NLP). A solution could serve as the basis for various applications, such as sentence-based image search and making images accessible to the visually impaired. In the past few years, many attempts have been made to advance this field and to solve the problem in multiple ways. Among others, solutions have been proposed using Machine Learning and Deep Learning methods (such as Convolutional Neural Networks) and variations of Canonical Correlation Analysis (CCA). In this project, we analyze an attempt to create an image captioning system for news photographs, using recent improvements in the field of CCA (namely, NCCA) and building on the inherent advantages of the database of news photographs donated by photographer Sebastian Scheiner. We examine the advantages and disadvantages of this method and compare its success to that of other, similar attempts.