
We are developing a computational pipeline to analyse large surveys of Affymetrix GeneChips. In particular, we are studying how reliably groups of probes provide a coherent measure of transcription. We are interested in understanding the mechanisms underpinning co-expression in transcriptomic data, and in using this knowledge to shed light on the systems biology of post-transcriptional processing of mRNA. We are analysing expression data from tens of thousands of microarray experiments, downloaded from the Gene Expression Omnibus. Our survey samples many organisms, phenotypes, and developmental and anatomical conditions. Because of this experimental diversity, any observed correlations between probes can be associated either with biology that is robust, such as co-expression common to all cells, or with systematic biases associated with the GeneChip technology. We proceed by generating a matrix of correlations of the intensities of all the probes within each of the probesets on the HG-U133A GeneChip. Contrary to the widespread belief of the microarray community, we have discovered that, for human GeneChips, the majority of probesets do not measure one unique block of transcription. Instead, we see numerous examples of outlier probes. Our study has also identified that, in a number of probesets, the mismatch probes are an informative diagnostic of expression, rather than serving their usual role of providing a measure of background contamination. We report evidence for systematic biases in GeneChip technology associated with probe-probe interactions. We also see signatures in our matrices associated with post-transcriptional processing of RNA, such as splicing, alternative polyadenylation, chimeric transcripts and antisense transcription. Our results have widespread implications because of the pervasive use of GeneChip technology in modern biological research.
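
The per-probeset correlation matrix described above can be illustrated with a minimal sketch. This is not the pipeline itself: the function names, the log2 transform, the use of Pearson correlation, the median-based outlier rule and the 0.5 threshold are all assumptions chosen for illustration, and the data are synthetic rather than drawn from the Gene Expression Omnibus.

```python
import numpy as np


def probeset_correlation(intensities):
    """Correlation matrix of the probes in one probeset.

    intensities: array of shape (n_experiments, n_probes) holding raw,
    strictly positive probe intensities (hypothetical input layout).
    """
    log_intensities = np.log2(intensities)  # assumption: correlate on a log2 scale
    # rowvar=False treats each column (probe) as a variable.
    return np.corrcoef(log_intensities, rowvar=False)


def flag_outlier_probes(corr, threshold=0.5):
    """Indices of probes whose median correlation with the other probes
    in the probeset falls below `threshold` (an illustrative rule only)."""
    n = corr.shape[0]
    # Drop the diagonal so each row holds correlations with the other probes.
    off_diagonal = corr[~np.eye(n, dtype=bool)].reshape(n, n - 1)
    return np.where(np.median(off_diagonal, axis=1) < threshold)[0]


# Synthetic example: 200 experiments, 11 probes following a shared signal,
# with probe 3 replaced by an independent signal to mimic an outlier probe.
rng = np.random.default_rng(0)
shared_signal = rng.normal(8.0, 1.0, size=(200, 1))
log_probes = shared_signal + rng.normal(0.0, 0.2, size=(200, 11))
log_probes[:, 3] = rng.normal(8.0, 1.0, size=200)
intensities = 2.0 ** log_probes

corr = probeset_correlation(intensities)
outliers = flag_outlier_probes(corr)
print("Correlation matrix shape:", corr.shape)
print("Outlier probe indices:", outliers.tolist())
```

In this toy example the probes that track the shared signal correlate strongly with one another, while the independent probe shows near-zero correlation with the rest and is flagged. Real survey data would need a more careful treatment of background, saturation and the mismatch probes.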