Detection of Breast Lesions in Medical Digital Imaging Using Neural Networks

Gustavo Ferrero, Paola Britos and Ramón García-Martínez

Software & Knowledge Engineering Center. Graduate School. Buenos Aires Institute of Technology
Intelligent Systems Laboratory. School of Engineering. University of Buenos Aires.
rgm@itba.edu.ar

Abstract. The purpose of this article is to present an experimental application for the detection of possible breast lesions by means of neural networks in medical digital imaging. This application broadens the scope of research into the creation of different types of topologies with the aim of improving existing networks and creating new architectures which allow for improved detection.

1. Introduction

Breast cancer has been determined to be the second leading cause of cancer death in women, and the most common type of cancer in women; there are no official statistics in the Argentine Republic, but it is estimated that 22 in 100,000 women are affected by this illness, similar to what is observed in other Western countries [Mols et al, 2005]. Mammography is currently the best imaging method for detecting minimal breast lesions, mainly small carcinomas that appear as microcalcifications or as tumors smaller than 1 cm in diameter that cannot be palpated during medical examination [Antonie et al, 2001]. Currently, joint efforts are being made to detect tissue anomalies in a timely fashion, given that there are no methods for breast cancer prevention. Early detection has proved an essential weapon against cancer, since it helps to prolong patients' lives. Physicians providing test results must have diagnostic training based on mammography, and must issue a certain number of reports annually. Double reading of reports increases sensitivity for detection of minimal lesions by about 7%, though at a high cost.
The physician then interprets these reports and determines, according to his/her best judgment, the steps to be taken for the proper diagnosis and treatment of the patient. For this reason, physicists, engineers, and physicians are in search of new tools to fight cancer, which would also allow physicians to obtain a second opinion [Gokhale et al, 2003, Simoff et al, 2002]. Since the American College of Radiology approved the use of new digital mammographs, digital images have begun to be stored in databases together with the patients' information, for later processing via different methods [Selman, 2000]. Different methods have been used to classify and/or detect anomalies in medical images, such as wavelets, fractal theory and statistical methods, and most of them used features extracted using image-processing techniques. In addition, some other methods were presented in the literature based on fuzzy set theory, Markov models and neural networks. Most of the computer-aided methods proved to be powerful tools that could assist medical staff in hospitals and lead to better results in diagnosing a patient [Antonie et al, 2001].

________________
Please use the following format when citing this chapter:
Ferrero, G., Britos, P., García-Martínez, R., 2006, in IFIP International Federation for Information Processing, Volume 218, Professional Practice in Artificial Intelligence, eds. J. Debenham, (Boston: Springer), pp. 1-10.

Fig. 1. Processing Steps

Different studies on using data mining in the processing of medical images have rendered very good results using neural networks for classification and grouping. In recent years different computerized systems have been developed to support the diagnostic work of radiologists in mammography. The goal of these systems is to focus the radiologist's attention on suspicious areas. They work in three steps: i. analog mammograms are digitized; ii. images are segmented and preprocessed; iii. Regions of Interest (ROI) are found and classified by neural networks [Lauria et al, 2003].

2. Proposed Method

Radiologists do not diagnose cancer versus benign nodules; they detect suspicious regions and send them for additional work-up [Baydush et al, 2001]. Bearing in mind the way medical imaging specialists work, the system works as follows: i. capturing the medical image, ii. storing the image in the database, iii. starting up processing, iv. generating the report, v. validating the report. The first and last steps generate information which provides a work environment where system users are given the ability to create new network topologies in order to validate the results obtained.

2.1. Mammography Processing

Figure 1 shows the stages that have been adopted for the image processing of a mammography.
Stage 1: image pre-processing; this stage begins by acquiring the image and delivers to the following stage an image containing only the region of interest. Stage 2: image classification, to determine whether or not the image contains malignant lesions that require in-depth examination by specialists. Stage 3: if the classifier determines that the image shows malignant lesions, the image is scanned for suspicious areas. Stage 4: the mammography processing report is generated.

2.2. Mammography Pre-processing

The first stage contains a set of steps which together serve the purpose of eliminating all information that is irrelevant for the classification. Step 1. Applying a median filter. Step 2. Cropping margins. Step 3. Eliminating isolated regions. Step 4. Equalizing.

Order filters are based on a specific treatment of image statistics called order statistics. These filters operate in the neighborhood of a certain pixel, known as the window, and they replace the value of the central pixel. Order statistics is a technique that arranges all pixels in a window in sequential order, on the basis of their grey level [Liew et al, 2005]. The median M of a set of values is such that half of the values in the set are smaller than M and half of the values are greater than M. To apply median filtering in the neighborhood of a pixel, we rank the intensities in the neighborhood, determine the median, and assign it as the intensity of the pixel. The main purpose of median filtering is to cause points with very different intensities to become similar to their neighbors, thus eliminating any isolated intensity peaks that appear in the area of the filter mask. The median filter is a nonlinear filter, used to eliminate high-frequency noise without eliminating the significant characteristics of the image. A 3x3 mask is used, which is centered on each image pixel, replacing each central pixel by the median of the nine pixels covered by the mask. The window size allows the characteristics of the image to be preserved while at the same time eliminating high frequencies [Díaz, 2004]. Next, the automatic cropping is performed.
The purpose of this step is to focus the process exclusively on the relevant breast region, which reduces the possibility of erroneous classification caused by areas which are not of interest. Image segmentation is an important step in several image applications. A host of techniques and algorithms fall into this general category as a starting point for edge detection, region labelling, and transformations. These techniques are relatively simple algorithms that have been used for many years to isolate, measure, and identify potential regions [Jankowski and Kuska, 2004]. A stack method is used for region labelling, as it is one of the fastest and simplest to implement. After labelling, those areas which are not of interest to the study are eliminated. It is known that the surface covered by the breast is over 80% of the image; therefore, isolated areas with surfaces smaller than 1% do not belong to the breast and are eliminated through the creation of masks obtained from neighboring pixels. Lastly, a uniform equalization is performed, which essentially helps enhance image contrast:

$$F(g) = [g_{max} - g_{min}] \, P_p(g) + g_{min}$$

where $g_{max}$ and $g_{min}$ correspond to the maximum and minimum intensity values in the range of grey values of the image. Figure 2 shows the results of image pre-processing.
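
The pre-processing steps described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the handling of border pixels, the foreground threshold and the use of 4-connectivity are all assumptions, margin cropping is omitted because the text does not specify it, and P_p(g) is read here as the cumulative probability of grey level g. Images are plain lists of rows of integer grey levels.

```python
from statistics import median


def median_filter_3x3(img):
    """Replace each interior pixel by the median of its 3x3 window.

    Border pixels are copied unchanged (an assumption; the text does not
    say how borders are treated).
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = median(window)
    return out


def remove_small_regions(img, threshold=0, min_fraction=0.01):
    """Label 4-connected foreground regions with an explicit stack and
    set to 0 every region smaller than min_fraction of the image area."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    out = [row[:] for row in img]
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx] or img[sy][sx] <= threshold:
                continue
            stack, region = [(sy, sx)], []
            seen[sy][sx] = True
            while stack:
                y, x = stack.pop()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and img[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if len(region) < min_fraction * h * w:
                for y, x in region:
                    out[y][x] = 0
    return out


def equalize_uniform(img):
    """Apply F(g) = (g_max - g_min) * P(g) + g_min, where P(g) is the
    fraction of pixels with grey level <= g (assumed meaning of P_p)."""
    pixels = [g for row in img for g in row]
    g_min, g_max = min(pixels), max(pixels)
    counts = {}
    for g in pixels:
        counts[g] = counts.get(g, 0) + 1
    cdf, running = {}, 0
    for g in sorted(counts):
        running += counts[g]
        cdf[g] = running / len(pixels)
    return [[round((g_max - g_min) * cdf[g] + g_min) for g in row] for row in img]
```

A pipeline in the order given in the text would call `median_filter_3x3`, then `remove_small_regions`, then `equalize_uniform` on the cropped image.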

Fig. 2. Automatic Pre-processing

2.3. Classification

Neural networks are models which attempt to emulate the behavior of the brain. As such, they perform simplification, identifying the relevant elements in the system. An adequate selection of their features coupled with a convenient structure constitutes the conventional procedure used to build networks which are capable of performing a given task [Hertz et al, 1991]. Artificial neural networks offer an attractive paradigm for the design and analysis of adaptive, intelligent systems for a wide range of applications in artificial intelligence [Fiszelew et al, 2003]. Artificial neural networks are based on a rather simple model of a neuron. Most neurons have three parts: a dendrite, which collects inputs from other neurons (or from an external stimulus); a soma, which performs an important nonlinear processing step; and an axon, a cable-like wire along which the output signal is transmitted to other neurons. The connection through which the signal passes to another neuron is called a synapse [W. Gestner, NA]. Neurons are grouped in layers; these interconnected layers form a neural network, so each neural network is composed of N layers (Figure 3). Depending on how these components (layers) are connected, different architectures may be created (feed-forward NN, recurrent NN, etc.). The topology or architecture of a neural network refers to the type, organization, and arrangement of neurons in the network, forming layers or clusters. The topology of a multilayered neural network depends on the number of variables in the input layer, the number of hidden neuron layers, the number of neurons in each hidden layer, and the number of output variables in the last layer.
All these factors are important when determining network configuration [Zurada, 1995].

Thus, neural network structures can be defined as collections of parallel processors interconnected in the form of an oriented lattice, arranged in such a way that the network structure is appropriate for the problem under consideration. The connections between neurons in a neural network have an associated weight, which is what allows the network to acquire knowledge. The most commonly used learning scheme for the MLP is the back-propagation algorithm. The weight updating for the hidden layers adopts the mechanism of a back-propagated corrective signal from the output layer.

Fig. 3. Neural Network

It has been shown that the MLP, given flexible network/neuron dimensions, offers an asymptotic approximation capability. It was demonstrated that two-layer perceptrons (with only one hidden layer) should be adequate as universal approximators of any nonlinear function [Kung et al, 1998]. Two kinds of signal travel through a multilayer perceptron:

Function signal: the signal that propagates from the input to the output.
Error signal: generated by the output neurons and back-propagated towards the input as an adjustment to the synaptic connections, in order to bring the output obtained as close as possible to the expected output.

Thus all output neurons and those in the hidden layer are enabled to perform two types of calculations according to the signal they receive: if it is a function signal, it will be a forward calculation (forward pass); if it is an error signal, it will be a backward calculation (backward pass). The rule of propagation for neurons in the hidden layer is the weighted sum of the outputs with synaptic weights wji; a sigmoid transfer function is then applied to that weighted sum, which bounds the response. Basically, the backpropagation algorithm is based on error minimization by means of a traditional optimization method called gradient descent. That is, the key point consists in calculating the proper weights of the layers from the errors in the output units; the secret lies in evaluating the consequences of an error and dividing the value among the weights of the contributing network connections.
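
The forward and backward passes just described can be illustrated with a very small network. The following is a minimal sketch under stated assumptions, not the system's actual code: it uses one hidden layer, a squared-error measure, a fixed learning rate of 0.5, and hypothetical function names (`forward`, `backward`); each weight row carries one extra weight for a constant input of 1.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_hidden, w_out):
    """Forward pass: weighted sum, then sigmoid, layer by layer."""
    hidden = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0]))) for row in w_hidden]
    output = [sigmoid(sum(w * v for w, v in zip(row, hidden + [1.0]))) for row in w_out]
    return hidden, output


def backward(x, target, w_hidden, w_out, rate=0.5):
    """Backward pass: one gradient-descent update for a single pattern.

    Returns the squared error measured before the weights were changed.
    """
    hidden, output = forward(x, w_hidden, w_out)
    # Output deltas: error times the sigmoid derivative o * (1 - o).
    d_out = [(t - o) * o * (1.0 - o) for t, o in zip(target, output)]
    # Hidden deltas: each output delta is divided among the hidden neurons
    # in proportion to the weight that carried their signal forward.
    d_hid = [h * (1.0 - h) * sum(d * row[j] for d, row in zip(d_out, w_out))
             for j, h in enumerate(hidden)]
    for row, d in zip(w_out, d_out):
        for j, v in enumerate(hidden + [1.0]):
            row[j] += rate * d * v
    for row, d in zip(w_hidden, d_hid):
        for i, v in enumerate(x + [1.0]):
            row[i] += rate * d * v
    return sum((t - o) ** 2 for t, o in zip(target, output)) / 2.0


# Usage: 2 inputs, 2 hidden neurons, 1 output; repeat the update on one pattern.
random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
w_out = [[random.uniform(-1, 1) for _ in range(3)]]
errors = [backward([1.0, 0.0], [1.0], w_hidden, w_out) for _ in range(200)]
print(f"error before training: {errors[0]:.4f}, after 200 updates: {errors[-1]:.4f}")
```

Repeating the update on the same pattern drives the error down, which is the gradient-descent behaviour the text refers to; a real training run would cycle over all training patterns instead.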
Neural network learning can be specified as a function approximation problem where the goal is to learn an unknown function $f: \mathbb{R}^N \to \mathbb{R}$ (or a good approximation of it) from a set of input-output pairs $S = \{(x, y) \mid x \in \mathbb{R}^N, y \in \mathbb{R}\}$ [Parekh, et al, 2000]. Pattern classification is a special case of function approximation where the function's output y is restricted to one of M (M ≥ 2) discrete values (or classes). A neural network for solving classification problems typically has N input neurons and M output neurons. The kth output neuron (1 ≤ k ≤ M) is trained to output one (while all the other output neurons are trained to output zero) for patterns belonging to the kth class. A single output neuron suffices for problems that involve two-category classification. The multilayer perceptron facilitates the classification of nonlinear problems; the more hidden layers in a neural network, the simpler it will be to isolate the problem (Figure 4).

Fig. 4. Geometric Interpretation of the Role of Hidden Layers (panels: Structure, XOR, Class)

2.4. Image Features

A radiological image is formed by tissue absorption when exposed to Roentgen radiation. Depending on the amount of radiation absorbed, an object (tissue) may be radiopaque (RO), radiolucent (RL), or radiotransparent (RT).

When a low amount of X-ray radiation is absorbed by the object, virtually all the rays reach the film; the color appears dark and it is a radiotransparent body (air cavities).
When a moderate amount of radiation is absorbed by the object, the color appears grey and it is a radiolucent body (noncalcified organic tissue).
When a large amount of X-ray radiation has been absorbed, or it has been completely absorbed, the color appears light or white and it is a radiopaque body (inorganic tissue, calcified tissue).

The range of colors is related to and depends on the extent of X-ray absorption by the tissue. The more radiation that is absorbed by the tissue, the less radiation will reach the film, signifying that the body is radiopaque; the less radiation that is absorbed by the tissue, the more radiation will reach the film, signifying that the body is radiolucent. This feature of X-ray images will be taken into account and will be used in order to obtain information about the sample tissues. The input layer is formed by K input neurons; this value is obtained by considering: the regions defined for the image (image subdivisions), the number of statistical operations applied to those regions, plus one neuron for the position of the breast (left or right) and a set of three neurons for the type of tissue (dense, dense-glandular, and glandular). The characteristics of each of the regions are obtained from the information provided by the tissues represented in the pixels. Data extraction in the regions is carried out via the following statistical procedures:

Mean: it is the mean data value.
Bias: it is the systematic error which frequently occurs.
Kurtosis: it measures whether the distribution values are relatively concentrated around the mean values of the sample.
Variance: it measures the existing distance between the values and the mean.

The mean is defined by the following function: [equation not preserved]
Bias is defined: [equation not preserved]
Kurtosis is defined by the following formula: [equation not preserved]
Lastly, variance is given by: [equation not preserved]

The system offers the possibility of creating different topologies for multilayer networks (N ≥ 3, where N is the number of layers). This feature allows the user to research new architectures. As mentioned before, image information (regions) is entered in the network; for instance, for an image divided into 16 regions, there are 69 (16 * 4 + 1 + 3 + 1) neurons including the independent term; the output layer may have only one or two neurons.

2.5. Training

Supervised learning is characterized by controlled training by an external agent (supervisor or teacher) who determines the response to be generated by the network from a certain input. The supervisor verifies the network output and, in case it does not match the expected output, connection weights are modified in order to obtain an output as close as possible to the expected one [Hertz et al, 1991]. The proposed method for the system is learning by trial and error, which consists in adjusting the weights of the connections according to the distribution of the quadratic error between the expected responses rq and the current responses Oq. Training data are mammographies obtained from the Mammographic Image Analysis Society (MIAS); they consist of 322 images, of which 55 are to be analyzed in order to obtain further information, which shall be used to train the neural network.

The following information is necessary for neural network training:

Stop error: it is the acceptable output error for training; below this error, training is brought to a halt and the weights are said to have converged.
Number of cycles: it sets the maximum number of training cycles the network may run.
Momentum: each connection is given a certain inertia or momentum, in such a way that its tendency to change direction with a steeper descent is averaged with the "tendencies" for change of direction that were previously obtained. The momentum parameter must be between 0 and 1 [Zurada, 1995].
Learning rate: it is a value between 0 and 1 used in error distribution (delta rule).

The training procedure is evaluated every thousand cycles, obtaining information about the last ten cycles and evaluating their mean, so that the speed of network convergence can be determined; this makes it possible to stop training and start again with a new set of weights.

3. Experimenting with the System

The system allows for the generation of different architectures, so different networks are evaluated, bearing in mind certain factors, whenever a new architecture is created.
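
When a new architecture is tried out, the training parameters of Section 2.5 (stop error, number of cycles, momentum and learning rate) interact as in the following sketch. It is a simplified illustration, not the system's code: to stay short it trains a single sigmoid unit with the delta rule instead of a multilayer network, and the function name `train`, the default values and the toy patterns are assumptions.

```python
import math
import random


def train(patterns, stop_error=0.01, max_cycles=20000, rate=0.5, momentum=0.5, seed=0):
    """Train one sigmoid unit with the delta rule plus momentum.

    Returns (weights, cycles_used, error_history). Training halts when the
    mean squared error of a cycle drops below stop_error, or after
    max_cycles cycles.
    """
    rng = random.Random(seed)
    n_in = len(patterns[0][0])
    weights = [rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]  # last one is the bias
    previous = [0.0] * (n_in + 1)  # previous change of each weight
    history = []
    for cycle in range(1, max_cycles + 1):
        total = 0.0
        for x, target in patterns:
            inputs = list(x) + [1.0]
            net = sum(w * v for w, v in zip(weights, inputs))
            out = 1.0 / (1.0 + math.exp(-net))
            delta = (target - out) * out * (1.0 - out)
            for i, v in enumerate(inputs):
                # New change = gradient step + a fraction of the previous change.
                change = rate * delta * v + momentum * previous[i]
                weights[i] += change
                previous[i] = change
            total += (target - out) ** 2
        history.append(total / len(patterns))
        if history[-1] < stop_error:
            break  # stop error reached: the weights are taken as converged
        if cycle % 1000 == 0:
            # Periodic check: mean error over the last ten cycles.
            recent = sum(history[-10:]) / 10
            print(f"cycle {cycle}: mean error of last 10 cycles = {recent:.4f}")
    return weights, cycle, history


# Toy usage: learn the logical OR of two inputs.
patterns = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights, cycles, history = train(patterns)
print(f"stopped after {cycles} cycles with error {history[-1]:.4f}")
```

If the periodic check shows the error barely moving, the run can be abandoned and restarted with a different `seed`, which corresponds to starting with a new set of weights.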
The performance (and cost) of a neural network on any given problem is critically dependent, among other things, on the network's architecture and the particular learning algorithm used [Fiszelew et al, 2003]. Networks that are too small are unable to learn the problem adequately, while overly large networks tend to overfit the training data and consequently result in poor generalization performance [Parekh, et al, 2000]. With varying numbers of hidden layers, the following configuration has yielded the best results:

69 neurons in the input layer.
16 neurons in the first hidden layer, one for each region the image was divided into.
4 neurons in the following hidden layer, one for each operation performed on the regions.
2 output neurons, one for each possible direction (right and left).

In addition, two networks have been set up for use depending on the location of the breast (left or right), since the initial results obtained using one single network for both were very poor. Training times were also reduced using this modality, thereby increasing the convergence speed. The single network trained for both sides was able to classify correctly with a 30% success rate, whereas the networks trained for a specific side have a 60% success rate. In addition, the former took over 4 hours to converge, while the latter did so in less than 60 minutes. Further research will be done to identify the appropriate neural network that will allow classifications to be obtained with 80% certainty.

4. Related Work

In this research the neural net architecture is trained to distinguish malignant nodules from benign ones. It differs from the CALMA project approach [Lauria et al, 2003], which has tools that identify microcalcification clusters and massive lesions. The proposed architecture deals with mammography images in image formats (JPG, BMP, TIFF, GIF), which differs from the standard data mining approach based on image parameter sets provided by the community [UCIMLR. 2006a.; 2006b].

5. Conclusions

This project has been an attempt to provide an environment for the continued investigation of new neural network models that will achieve better results in the classification of medical images. The project is open source, allowing access to the source code so that others may study, improve, expand and distribute it, so that healthcare institutions may have access to tools they can use to improve breast cancer detection.

The goal of the project is to improve the detection of areas suspected of containing some type of lesion. The results obtained are very different from our initial expectations, although these initial results are promising.
The future will bring improvements to this application as well as the possibility of inputting additional statistical data that will enhance the reading of the images.

Next research steps are: (a) to compare the results of the architecture proposed in this paper with others provided by support vector machine based classifiers [Fung and Mangasarian, 1999; Lee et al., 2000; Fung and Mangasarian, 2003]; and (b) to study specific filters that recognize structures in mammography images.

6. Bibliography

Antonie M., Zaïane O., Coman A. (2001). Application of data mining techniques for medical image classification. Proceedings of the Second International Workshop on Multimedia Data Mining. San Francisco.

Baydush A., Catarious D., Lo J., Abbey C., Floyd C. (2001). Computerized classification of suspicious regions in chest radiographs using subregion Hotelling observers. American Association of Physicists in Medicine. Vol 28 (12).