Abstract

Genomic signal processing is an emerging research area that combines genomics with digital signal processing (DSP) methodologies to enhance genetic data analysis. Microarray technology is a well-known means of evaluating thousands of gene expression profiles. By treating these profiles as digital signals, DSP methods can be applied to produce robust, unsupervised clustering of microarray samples. This is achieved by transforming expression profiles into spectral components, which are interpreted as a measure of profile similarity.
This thesis introduces enhanced signal processing algorithms for robust clustering of microarray gene expression samples. The main aim of the research is to design and validate novel genomic signal processing methodologies for microarray data analysis based on different DSP methods. More specifically, clustering algorithms based on linear prediction coding, wavelet decomposition and fractal dimension methods, each combined with a vector quantisation algorithm, are applied and compared on a set of test microarray datasets. These techniques take microarray gene expression samples as input and produce arrays of predictive coefficients associated with the microarray data; the coefficients are quantised into discrete levels and then used for sample clustering.
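The pipeline described above (predictive coefficients, then quantisation, then clustering) can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the prediction order, the use of Levinson-Durbin recursion for the linear prediction step, the use of k-means (Lloyd) iterations as the vector quantiser, and the synthetic toy signals standing in for expression profiles are all assumptions made for illustration.

```python
# Illustrative sketch only: toy signals, hypothetical function names.

def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a sequence."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]

def lpc(x, order):
    """Linear prediction coefficients a[1..order] via Levinson-Durbin,
    so that x[n] is approximated by sum_k a[k] * x[n-k]."""
    r = autocorr(x, order)
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err == 0:
            break  # signal is perfectly predicted (or all zeros)
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1 - k * k)
    return a[1:]

def sq_dist(u, v):
    return sum((p - q) ** 2 for p, q in zip(u, v))

def vq_cluster(vectors, k, iters=20):
    """Vector quantisation by Lloyd (k-means) iterations.
    Returns one codebook index per vector, plus the codebook."""
    # Deterministic initialisation: evenly spaced input vectors.
    codebook = [list(vectors[i * len(vectors) // k]) for i in range(k)]
    labels = [0] * len(vectors)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(v, codebook[c])) for v in vectors]
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if members:
                codebook[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, codebook

# Synthetic "samples": three smooth decaying signals, three alternating ones.
samples = (
    [[s * d ** n for n in range(30)] for s, d in [(1, 0.9), (2, 0.9), (1, 0.8)]]
    + [[s * (-d) ** n for n in range(30)] for s, d in [(1, 0.9), (1, 0.8), (3, 0.9)]]
)
features = [lpc(x, order=2) for x in samples]   # predictive coefficients per sample
labels, codebook = vq_cluster(features, k=2)    # quantise, then read off clusters
```

In this sketch each sample is reduced to a short coefficient vector, and the index of its nearest codebook entry serves as its cluster label. The wavelet and fractal dimension variants named above would replace only the feature extraction step.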
A variety of standard microarray datasets are used in this work to validate the robustness of these methods against conventional methods. Two well-known validation approaches, the Silhouette and Davies-Bouldin indices, are applied to evaluate the genomic signal processing clustering results both internally and externally.
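For readers unfamiliar with the two indices, the following minimal sketch computes both from their standard definitions. It assumes Euclidean distance and at least two clusters; the function names and the four toy 2-D points are hypothetical and are not taken from the datasets used in this work. A higher Silhouette value and a lower Davies-Bouldin value both indicate more compact, better separated clusters.

```python
# Illustrative sketch only: toy points, hypothetical function names.
import math

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def silhouette(points, labels):
    """Mean silhouette width over all points (requires >= 2 clusters)."""
    clusters = sorted(set(labels))
    scores = []
    for i, p in enumerate(points):
        own = [dist(p, q) for j, q in enumerate(points)
               if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist(p, q) for q, l in zip(points, labels) if l == c) / labels.count(c)
            for c in clusters if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def davies_bouldin(points, labels):
    """Davies-Bouldin index; assumes cluster centroids are distinct."""
    clusters = sorted(set(labels))
    cents, scatter = {}, {}
    for c in clusters:
        members = [p for p, l in zip(points, labels) if l == c]
        cents[c] = [sum(col) / len(members) for col in zip(*members)]
        scatter[c] = sum(dist(p, cents[c]) for p in members) / len(members)
    worst = [
        max((scatter[c] + scatter[d]) / dist(cents[c], cents[d])
            for d in clusters if d != c)
        for c in clusters
    ]
    return sum(worst) / len(worst)

points = [(0, 0), (0, 1), (10, 0), (10, 1)]
good = [0, 0, 1, 1]   # groups the nearby points together
bad = [0, 1, 0, 1]    # pairs each point with a distant one
```

Calling `silhouette(points, good)` and `davies_bouldin(points, good)` and comparing against the `bad` labelling shows the two indices moving in opposite directions, as expected.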
In conclusion, the results demonstrate that genomic signal processing based methods outperform traditional methods by achieving higher clustering accuracy. Moreover, the study shows that the local features of the gene expression signals are clustered better using wavelets than using the other DSP methods.