Refine

Language

Keywords

1 search hit

Today the process of improving technology and software allows to create, save and explore massive data sets in little time. "Big Data" are everywhere such as in social networks, meteorology, customersâ€™ behaviour â€“ and in biology. The Omics research field, standing for the organism-wide data exploration and analysis, is an example of biological research that has to deal with "Big Data" challenges. Possible challenges are for instance effcient storage and cataloguing of the data sets and finally the qualitative analysis and exploration of the information. In the last decade largescale genome-wide association studies and high-throughput techniques became more effcient, more profitable and less expensive. As a consequence of this rapid development, it is easier to gather massive amounts of genomic and proteomic data. However, these data need to get evaluated, analysed and explored. Typical questions that arise in this context include: which genes are active under sever al physical states, which proteins and metabolites are available, which organisms or cell types are similar or different in their enzymesâ€™or genesâ€™ behaviour. For this reason and because a scientist of any "Big Data" research field wants to see the data, there is an increasing need of clear, intuitively understandable and recognizable visualization to explore the data and confirm thesis. One way to get an overview of the data sets is to cluster it. Taxonomic trees and functional classification schemes are hierarchical structures used by biologists to organize the available biological knowledge in a systematic and computer readable way (such as KEGG, GO and FUNCAT). For example, proteins and genes could be clustered according to their function in an organism. These hierarchies tend to be rather complex, and many comprise thousands of biological entities. One approach for a space-filling visualization of these hierarchical structured data sets is a treemap. Existing algorithms for producing treemaps struggle with large data sets and have several other problems. This thesis addresses some of these problems and is structured as follows. After a short review of the basic concepts from graph theory some commonly used types of treemaps and a classification of treemaps according to information visualization aspects is presented in the first chapter of this thesis. The second chapter of this thesis provides several methods to improve treemap constructions. In certain applications the researcher wants to know, how the entities in a hierarchical structure are related to each other (such as enzymes in a metabolic pathway). Therefore in the 3 third chapter of this thesis, the focus is on the construction of a suitable layout overlaying an existing treemap. This gives rise to optimization problems on geometric graphs. In addition, from a practical point of view, options for enhancing the display of the computed layout are explored to help the user perform typical tasks in this context more effciently. One important aspect of the problems on geometric graphs considered in the third chapter of the thesis is that crossings of edges in a network structure are to be minimized while certain other properties such as connectedness are maintained. Motivated by this, in the fourth chapter of this thesis, related combinatorial and computational problems are explored from a more theoretical point of view. In particular some light is shed on properties of crossing-free spanning trees in geometric graphs.