The World Through the Eyes of a Computer, or How Facebook Tags Our Photos

This Wednesday, 26.11.2014, Georgi Botev immersed us in the fascinating field of computer vision – an area that combines diverse disciplines like Machine Learning, Artificial Intelligence and Neurobiology. Not surprisingly, Georgi also comes from a diverse background – he earned his Bachelor's degree in Economics and his Master's in Statistics at Sofia University, then delved into the world of credit risk prediction models at Experian before joining StatSoft to further pursue his ambitions in data science.

Georgi guided his audience in a classroom at the Faculty of Mathematics and Informatics through the path leading to a better algorithm for image recognition by computers. As often happens with important discoveries, it is a path of trial and error, of gradual improvement and, as Georgi reminded us at the end of the lecture, of contributions made by outsiders to the field. His first attempt was at facial recognition by detecting the coordinates of facial features such as the eyes, the nose and the mouth. The algorithm uses the landmark coordinates of the faces in the training set to recognize the faces in out-of-sample images. Unfortunately, this approach does not cope well with images whose resolution differs a lot from that of the training set.
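The landmark-coordinate idea can be sketched in a few lines. This is a minimal illustration, not Georgi's actual implementation: the coordinates, the nearest-neighbour matching rule and the names below are all hypothetical, chosen only to show how a face becomes a vector of feature positions.

```python
import numpy as np

# Hypothetical training data: each face is represented by the (x, y)
# coordinates of four landmarks (left eye, right eye, nose, mouth),
# flattened into an 8-element feature vector.
train_faces = np.array([
    [30, 40, 70, 40, 50, 60, 50, 80],   # person A, photo 1
    [28, 42, 72, 42, 50, 62, 50, 84],   # person A, photo 2
    [25, 45, 75, 45, 50, 65, 48, 90],   # person B
])
train_labels = ["A", "A", "B"]

def nearest_neighbour(query, faces, labels):
    """Label an out-of-sample face by its closest training example."""
    distances = np.linalg.norm(faces - query, axis=1)
    return labels[int(np.argmin(distances))]

query = np.array([29, 41, 71, 41, 50, 61, 50, 82])
print(nearest_neighbour(query, train_faces, train_labels))  # → A
```

The sketch also hints at the weakness Georgi mentioned: the coordinates are measured in pixels, so a face photographed at a very different resolution lands far away in this feature space even when it belongs to the same person.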

Then Georgi took a shot at the problem from a different angle, by looking at the intensity of each pixel. The idea is that some parts of the face are darker than others – the pupil and iris, for example, are darker than the rest of the eye. Using the intensity level of each pixel (between 0 and 255 for greyscale images), you can create a template for each facial part from the difference between the sum of pixel intensities within that part and the sum over an adjacent area. Most face finders, including Facebook's, employ similar algorithms because they are fast enough.
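These dark-versus-light rectangle differences resemble the Haar-like features popularised by the Viola–Jones detector. The toy patch and region coordinates below are invented for illustration; a real detector would evaluate thousands of such features over an integral image for speed.

```python
import numpy as np

def haar_feature(img, dark, light):
    """Difference between the summed intensities of two rectangular
    regions, each given as (row, col, height, width)."""
    r, c, h, w = dark
    dark_sum = img[r:r + h, c:c + w].sum()
    r, c, h, w = light
    light_sum = img[r:r + h, c:c + w].sum()
    return int(light_sum - dark_sum)

# Toy greyscale patch (0 = black, 255 = white): a dark band, like an
# eye region, sitting above a lighter band, like the cheek below it.
patch = np.array([
    [ 20,  20,  20,  20],
    [ 30,  30,  30,  30],
    [200, 200, 200, 200],
    [210, 210, 210, 210],
], dtype=np.int64)

# A large positive value signals an eye-like dark-over-light pattern.
print(haar_feature(patch, dark=(0, 0, 2, 4), light=(2, 0, 2, 4)))  # → 1440
```

Because each feature is just a pair of rectangle sums, evaluating it costs a handful of additions – which is why detectors built this way are fast enough to run on every uploaded photo.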

Another advance in the area that Georgi described in detail is the use of neural networks. These machine learning algorithms approximate the way the neurons in the human brain work. Just as human vision is enabled by the photoreceptor cells in the retina that catch the light, so does computer vision rely on sensors. Artificial neural networks are organized in layers, similarly to the neurons in the brain, and each neuron or group of neurons is responsible for recognizing a different feature of the object. The newest generation of neural networks simulates the sparse firing behaviour of brain neurons – only 3–5% of them are active at any given time.

At the moment neural network models are good at recognizing the particular objects they were trained on, but the real challenge lies in designing Artificial Intelligence that emulates the way humans are able to recognize a new object. Such a self-training system could detect the important objects in a film – those that appear frequently, say over 1,000 times. Georgi gave an example with the developments in the area of optical character recognition (OCR), where ANN algorithms beat humans by a small margin in identifying handwritten characters.

The lecture was followed by the traditional networking drinks, this time over a slice of pizza at a nearby pizzeria. Join us for our upcoming event – stay tuned by visiting our website, following our Facebook page or following our Twitter account.