Using Neural Networks to Classify Paintings by Genre

Jana Zujovic, Lisa Gandy, Scott Friedman

Northwestern University, EECS 349, Professor Bryan Pardo

Contact: friedman[at]northwestern.edu

Overview

Impressionism example

In this paper, we describe our proposal for a neural network-driven solution to identify and classify digital images of paintings into their most appropriate artistic genre. Investigating digital art classification will provide answers to several questions in both artistic and engineering communities:

Which extractable features of artistic imagery are most indicative of a painting's genre?

Are paintings classifiable by extractable technique-based features alone, or are there content-level (semantic) considerations involved?

Additionally, crafting a solution for efficient painting classification is a first step in enabling higher-level systems to classify large repositories of art, make artistic recommendations, and perhaps even analyze artistic influences.

Painting Classification

Surrealism: semantic analysis necessary

Art professionals classify paintings using variables such as stroke style, color mixing, edge softness, color "reflection," parallel lines, and gradients; however, since painting is often considered a representational visual language, there are a number of content-level criteria that we will not encode or address in our proposed computational solution. For example, it is not within the scope of our project to detect the surrealism of the ontological dilemma posed by Magritte's The Treachery of Images ("Ceci n'est pas une pipe," at right) or the surreal melting clocks of Salvador Dali. In summary, we extract features and classify paintings based on technique alone.

Realism example

The classification variables mentioned above are most commonly represented with real numbers - that is, with histograms, percentages, means, and deviations. For this reason, utilizing a naive Bayes classifier or an ID3 decision tree would require us to discretize the input data into property sets, which might coarsen our feature granularity and improperly bias the classification. Instead, we utilize an Artificial Neural Network (ANN) as our learner, which preserves these classification variables in their natural real-valued form. We discuss ANNs in more detail in the next section, and provide further motivation in our Approach.

Neural Networks

Pop Art example

Neural networks provide a reliable method of learning real- and discrete-valued target functions; as such, they have been utilized accordingly in recent years to recognize handwriting, spoken words, musical genres, human faces, and roadside obstacles. The success of neural networks in image classification encourages us to investigate their use in classifying images according to artistic technique.

Other texts provide more in-depth information on Artificial Neural Networks (ANNs), but for our current purposes, it suffices to say that an ANN is a network of simple interconnected units that loosely model biological neurons. Each of these neuron-units (termed perceptrons by many) takes several real numbers as input and outputs a single real number, which may in turn serve as input to subsequent "layers" of perceptrons.

As mentioned above, a neural network can be trained to classify data: the ANN takes a real-valued feature set as input and computes a real-valued classification as output, based on perceptron weights that are adjusted during training. During the training phase, the computed output is compared to the actual classification, and an algorithm such as backpropagation modifies each perceptron's weights so that the computed output more closely resembles the intended output. We discuss our specific configuration of the ANN, as well as our training and testing specifics, in our Approach section below.
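The weight-update idea above can be sketched for a single sigmoid unit. This is an illustrative Python sketch, not the code we used: it shows the per-unit gradient-descent rule that backpropagation applies layer by layer, with all names (`perceptron_output`, `train_step`) being our own.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def perceptron_output(weights, bias, inputs):
    """Weighted sum of real-valued inputs passed through a sigmoid."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return sigmoid(s)

def train_step(weights, bias, inputs, target, lr=0.5):
    """One gradient-descent update for a single sigmoid unit --
    the per-unit rule that backpropagation applies layer by layer."""
    out = perceptron_output(weights, bias, inputs)
    # Error scaled by the sigmoid's derivative, out * (1 - out)
    delta = (target - out) * out * (1.0 - out)
    new_weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias + lr * delta
    return new_weights, new_bias

# Repeated updates drive the computed output toward the target.
w, b = [0.1, -0.2], 0.0
for _ in range(1000):
    w, b = train_step(w, b, [1.0, 0.5], target=1.0)
```

After training, `perceptron_output(w, b, [1.0, 0.5])` is close to the target of 1.0; a full ANN chains these updates backward through its layers.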

Approach

Here we discuss the painting data, the features we extracted from the data, and our ANN learner. Each of these subsections helps set the stage for our experimentation, in which we run our learner and test the results.

Data

Abstract Expressionism example

We gathered a total of 358 paintings, each of which belonged to one of the following five genres, as indicated by the sources below:

Abstract Expressionism

Cubism

Impressionism

Pop Art

Realism

Of these, 60 paintings belonged to Abstract Expressionism, 62 to Cubism, 97 to Impressionism, 58 to Pop Art, and 81 to Realism.

In post-processing of the downloaded data, we cropped out frames which were part of some of the images, as these are artifacts of the presentation context and are not representative of the painting genre. Moreover, we downscaled high-resolution paintings to normalize the resolution and thereby normalize the texture (brush strokes, paint gloss, and canvas striations that are only noticeable at high resolution).
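The two post-processing steps can be sketched in pure Python for a grayscale image stored as a list of rows. This is an illustrative sketch of the operations (we performed the actual processing separately); `crop` and `downscale` are hypothetical names, and the box-filter averaging is one simple way to normalize resolution.

```python
def crop(image, top, bottom, left, right):
    """Remove a border (e.g. a picture frame) from the image."""
    return [row[left:len(row) - right]
            for row in image[top:len(image) - bottom]]

def downscale(image, factor):
    """Downscale by averaging factor x factor blocks (a box filter),
    assuming the image dimensions are divisible by `factor`."""
    h, w = len(image), len(image[0])
    out = []
    for i in range(0, h, factor):
        row = []
        for j in range(0, w, factor):
            block = [image[i + di][j + dj]
                     for di in range(factor) for dj in range(factor)]
            row.append(sum(block) / (factor * factor))
        out.append(row)
    return out

img = [[0, 0, 2, 2],
       [0, 0, 2, 2],
       [4, 4, 6, 6],
       [4, 4, 6, 6]]
small = downscale(img, 2)        # -> [[0.0, 2.0], [4.0, 6.0]]
inner = crop(img, 1, 1, 1, 1)    # -> [[0, 2], [4, 6]]
```

Averaging blocks before feature extraction is what suppresses high-resolution texture such as brush strokes and canvas striations.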

Feature Analysis

We provide a brief overview of feature extraction on this website - for more extraction specifics, see the final report.

We used MatLab to extract all of our features from the painting images. We began by extracting the RGB (Red, Green, Blue) histograms from each painting, and then transformed each image into HSV (Hue, Saturation, Value) space. The data displayed below are averages on a per-genre basis, to display (average) relationships across genres. These trends help us to identify which feature dimensions are most valuable for differentiation and thereby for classification.
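As a rough Python equivalent of this MatLab step (illustrative only; `channel_histogram` is a hypothetical helper), a normalized channel histogram and an RGB-to-HSV conversion via the standard library's `colorsys` might look like:

```python
import colorsys

def channel_histogram(values, bins=10, max_value=255):
    """Histogram of one 0-255 color channel, normalized to sum to 1."""
    counts = [0] * bins
    for v in values:
        counts[min(v * bins // (max_value + 1), bins - 1)] += 1
    return [c / len(values) for c in counts]

def rgb_to_hsv_pixels(pixels):
    """Convert (R, G, B) pixels in 0-255 to (H, S, V) tuples in 0-1."""
    return [colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
            for (r, g, b) in pixels]

# A painting reduces to per-channel histograms in both color spaces.
pixels = [(255, 0, 0), (0, 128, 0), (0, 0, 255), (255, 255, 255)]
red_hist = channel_histogram([p[0] for p in pixels])
hsv = rgb_to_hsv_pixels(pixels)   # hsv[0] is pure red: (0.0, 1.0, 1.0)
```

Each histogram bin then becomes one real-valued input to the ANN.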

The next consideration was edge density, also extracted with MatLab. We measured the densities at four edge thresholds: 0.2, 0.3, 0.4, and 0.6. As the threshold increases, the algorithm is more conservative about which color boundaries it reports as an edge.
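To illustrate thresholded edge density (a simplified stand-in for MatLab's edge detector, using a plain finite-difference gradient rather than its actual algorithms), consider:

```python
def edge_density(image, threshold):
    """Fraction of pixels whose gradient magnitude exceeds a threshold.
    Higher thresholds report only stronger color boundaries as edges."""
    h, w = len(image), len(image[0])
    edges = 0
    for i in range(h - 1):
        for j in range(w - 1):
            dx = image[i][j + 1] - image[i][j]   # horizontal difference
            dy = image[i + 1][j] - image[i][j]   # vertical difference
            if (dx * dx + dy * dy) ** 0.5 > threshold:
                edges += 1
    return edges / ((h - 1) * (w - 1))

# One sharp vertical boundary: a third of the measured pixels sit on it.
img = [[0, 0, 1, 1] for _ in range(4)]
low = edge_density(img, 0.5)    # -> 1/3
high = edge_density(img, 2.0)   # -> 0.0, the boundary no longer qualifies
```

Computing this density at several thresholds, as we did at 0.2, 0.3, 0.4, and 0.6, yields one feature per threshold.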

We have included here the figures that are most useful in differentiating genres, per the average genre values. We can see from the RGB data that Impressionism tends to employ more mid-tone greens (bins 3-6), Realism utilizes more dark reds (bin 2), and Pop Art employs more white (highest bin) than their counterparts. In the Value (brightness) dimension, we see three distinct functions for Impressionism, Pop Art, and Realism. In terms of edge density, we see some further distinctions across genres.

This analysis gives us reason to believe that we can use an ANN to learn the differences between genres and identify the genre based on these extracted features. Because Cubism and Abstract Expressionism are so closely related on each of these dimensions, we expect these genres to pose the greatest challenge in classification.

Experimentation

Topology of our ANN

We utilized five-fold cross-validation for training and testing purposes, which partitioned the data into five sets of about seventy paintings each.
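The splitting procedure can be sketched as follows (an illustrative Python sketch with a hypothetical `k_fold_splits` helper; we did the actual folding as part of our EasyNN workflow):

```python
import random

def k_fold_splits(items, k=5, seed=0):
    """Shuffle and partition items into k roughly equal folds; each fold
    serves once as the test set while the others form the training set."""
    items = list(items)
    random.Random(seed).shuffle(items)
    folds = [items[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test

# 358 paintings split into five folds of 71-72 paintings each.
splits = list(k_fold_splits(range(358), k=5))
```

Every painting is thus tested exactly once, and the reported accuracy averages over the five held-out folds.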

Our learner was implemented in EasyNN, a third-party ANN package. We structured our network as in the figure at right, with a hidden layer of eight nodes, and a single output for each target genre. We trained each fold for 10,000 cycles with a decaying learning rate of 0.7 and a decaying momentum of 0.8.

We test two corpus configurations: the first is our originally proposed five-genre corpus; the second omits the two most challenging data categories (Abstract Expressionism and Cubism) so that we can measure our accuracy on the remaining three genres.

Results

The most naive classifier would choose Impressionism every time, as Impressionism is our largest target genre, at 97 of 358 data points. This would result in classifying 97/358 ≈ 27% of the examples correctly for scenario (1), and 97/236 ≈ 41% correctly for scenario (2).
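This majority-class baseline is a one-line computation over the genre counts from our Data section (the helper name is our own):

```python
counts = {"Abstract Expressionism": 60, "Cubism": 62,
          "Impressionism": 97, "Pop Art": 58, "Realism": 81}

def majority_baseline(counts):
    """Accuracy of always guessing the most common genre."""
    return max(counts.values()) / sum(counts.values())

five_genre = majority_baseline(counts)            # 97/358, about 27%
three = {g: n for g, n in counts.items()
         if g in ("Impressionism", "Pop Art", "Realism")}
three_genre = majority_baseline(three)            # 97/236, about 41%
```

Any useful learner must beat these figures.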

Accuracy of testing ANN classification

The overall ANN results are at right. We computed these results by training and testing five times, per our K-fold cross-validation strategy, above. We then computed the accuracy overall, and on a per-genre basis.

We see that the ANN's overall accuracy was 55%, roughly double the most naive classification strategy discussed above. We also find that the expectations from our feature analysis held: Abstract Expressionism and Cubism are the most difficult to classify.

In our simplified three-genre scenario, the learner performed at 71% accuracy overall, outperforming the naive classifier's 41%. Moreover, Pop Art accuracy rose to 88%.

Summary

Our ANN classifier performed at 55% accuracy for five artistic genres and 71% accuracy for three, based on brightness, edge density, and Gabor filter encodings. As such, the ANN learner outperforms a naive pick-the-most-probable-class strategy by about two-fold. Though the results demonstrate that a machine can indeed differentiate painting genres to some degree, extracting more features from these paintings (e.g., geometric analysis, line curvature) could prove fruitful and improve upon this classification system. For further suggestions and discussion, see our final report.