Finding Cysts, Part Five: Final Detection

Cyst Detection Using Convolutional Neural Networks

Graph-search

We have been working on automatically detecting the appearance of Cystoid Macular Edema (CME) in Optical Coherence Tomography (OCT) images. CME is a buildup of fluids in the eye near the macula of the eye that can distort a person’s vision. OCT is one of the main methods used by medical professionals to detect these edemas.

We can rephrase our goal of detecting the cysts as finding a closed line that separates the cysts’ pixels in the image from the rest of the image. If we define the scan image as a graph where every pixel is represented as a node that is connected by an edge to its neighboring pixels we could use the well-known graph-cut method, whose aim is to find on a graph where an image can be segmented into two groups – the cyst(s) and everything else. The graph-cut method evaluates each edge in the graph based on its score and chooses the edges that allow the segmentation with the lowest score possible.

Using the seed super-pixels that define the cyst super-pixels and negative super-pixels that define non-cyst super-pixels, both found in the seed detection process earlier, we will find a cut in the graph that holds the cyst pixels on the one side and all other pixels on the other.

The Algorithm

As we said, we define each pixel as a node in the graph, add a source node that is connected to every cyst point and a sink node that is connected to the outlier points.

Every node is connected to its Manhattan neighbors (4-neighborhoods) and the edges are given a score that is inverse to their intensity difference – i.e.the more similar the nodes, the higher their edges’ score. This ensures that the cut will be between very different pixels (as is the case on the edge that defines the cyst’s walls).

The edges that connect the source and the sink receive a score of infinity to ensure that they won’t be cut from the graph.

Finally a min-cut algorithm is used to find the actual segmentation.

Manual Annotation

Detected Cysts

As we can see in the resulting figures above the detection of the cysts’ pixels is very accurate with small false detections in the boundary areas where the cysts are especially close to one-another. The results themselves are dependent on using good seeds and are increasingly more accurate when more negative seeds are used.

Convolutional Neural Networks

Neural networks are a machine learning concept that aims to recreate the same learning and recognition processes as in the human brain. Convolutional neural networks are an expansion of this concept that has had great success in recent years, especially in tasks related to computer vision. Convolutional neural networks add an increased focus on locational and relational information in the images.

The network itself is composed of layers of mathematical operations that learn increasingly complex concepts from the images in an effort to learn to categorize them.

As an input our neural network accepts patches of pixels from within the retina. To generate those patches we first extract only the retina from the image, using the previously segmented ILM and RPE from the Layer Segmentation process. We further segment the retina using the SLIC algorithm, that was introduced in the Denoising process, into super-pixels. From each super-pixel we randomly choose a seed point which will be the focal point of the patch that represents the super-pixel.

Each super-pixel is labeled using manual annotation of the OCT scan and the patch that is generated from this inherits this label and is fed to the network.

The network that we use is very similar in its structure to the famous and widely used AlexNet network. Our network’s structure is as follows:

Convolutional layer

Pooling layer

Convolutional layer

Pooling layer

Convolutional layer

Relu layer

Convolutional layer

Softmax layer

Manual Annotation

Detected Cysts Using CNN

As we can see in the resulting images above, this deep learning method is very successful at finding the cysts themselves and most of their area. This is a remarkable result considering the small dataset that was used in the training process.