The Graduate Seminar Series is a student-organized series that was started by
Talib Hussein in the fall of 1998. Since then, the series has provided a
friendly, informal setting for graduate students to give presentations to their
peers and learn new ideas and techniques that may assist in their own research.
Food has always been an important component of the series, and again this year
free food and drink are provided at each seminar for all attendees.

The seminars have the following purposes:

To encourage graduate students to interact on a research level.

To foster a cooperative social and research spirit among the students.

To allow students to practice their presentation skills and gain useful
feedback from their peers.

In previous years, presentations have covered research (e.g. thesis work,
conference talks), graduate course work, degree requirements (e.g. depth
paper, thesis proposal, thesis defense) and research positions ("job talks").
Presentations on both work-in-progress and completed work are encouraged.
Please note that priority is given to students needing to practice their
defense talk, and existing talks may be rescheduled as a result.

Contacts for the Series:

If you are interested in giving a presentation this term,
please contact Mohamed Hefny or Hung Tam (see
contact information below).

Identifying single nucleotide polymorphisms (SNPs)
that are responsible for common and complex diseases, such as
cancer, is of major interest in current molecular epidemiology.
However, because of the tremendous number of SNPs in the human genome,
there is a clear need to prioritize SNPs according to their potentially
deleterious effects on human health in order to expedite genotyping
and analysis. To date, there have been few efforts to quantitatively
assess the possible deleterious effects of SNPs for effective
association studies. Here we propose a new integrative scoring
system for prioritizing SNPs based on their possible deleterious
effects in a probabilistic framework. We also report an evaluation
of our system on the OMIM (Online Mendelian Inheritance
in Man) database, which is one of the most widely used databases
of human genes and genetic disorders.

Nov 11, 2008 2:30pm - 3:30pm

Goodwin 524

Acoustic Emissions of Handwriting

Andrew Seniuk

Handwriting and speech recognition are problems
with a long history. However, no studies have considered the
sounds produced by handwriting, an information source which
has connections to both of the aforementioned. This presentation
will summarise my work in pen acoustic emissions, including
a few demonstrations, early results on recognition of cursive
handwritten characters, other possible applications, and some
hypotheses for discussion.

Genetic variation analysis holds much promise
as a basis for understanding disease-gene association. In particular,
single nucleotide polymorphisms (SNPs) are at the forefront
of such studies, as they are the most common form of DNA variation
on the genome. However, due to the tremendous number of candidate
SNPs, there is a clear need to expedite genotyping and analysis
by selecting and considering only a subset of all SNPs. In this
talk, I will present several successful applications of machine
learning to address the problem of SNP selection and to improve
current state-of-the-art SNP selection methods. Our first method
is based on the tag SNP selection approach, which aims to select
a subset of SNPs whose allele information can best represent
the allele information of unselected SNPs. Using the formalism
of Bayesian networks, the proposed method is able to select
a subset of independent and highly predictive SNPs, without
limiting the number or the location of predictive tag SNPs.
Our second method is based on the functional SNP selection approach,
which aims to directly select a subset of SNPs that are likely
to be disease-causing. Within a probabilistic framework, our integrative
scoring system combines the functional assessments from a variety
of bioinformatics tools, and prioritizes SNPs according to their
potential deleterious effects on human health. Finally, I describe
our new multi-objective optimization framework for identifying
SNPs that are both informative as tag SNPs and functionally significant.

Mar 17, 2009 2:30pm - 3:30pm

Goodwin 524

Identifying Common Substructural Patterns of Protein Contact Maps

Hazem Ahmed

1D protein sequences, 2D contact maps and 3D structures
are three different representational levels of detail for proteins.
Predicting protein 3D structures from their 1D sequences remains
one of the complex challenges of bioinformatics. The “Divide
and Conquer” principle is applied in our research to handle
this challenge, by dividing it into two separate yet dependent
subproblems, using a Case-Based Reasoning (CBR) approach. Firstly,
2D contact maps are predicted from their 1D protein sequences;
secondly, 3D protein structures are then predicted from their
predicted 2D contact maps. We focus on the problem of identifying
common substructural patterns of protein contact maps, which
could potentially be used as building blocks for a bottom-up
approach for protein structure prediction. We further demonstrate
how the identification of these patterns can be improved by combining
protein sequence and structural information. We assess the consistency
and the efficiency of identifying common substructural patterns
by conducting statistical analyses on several subsets of the
experimental results with different sequence and structural
information.
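
For readers unfamiliar with the term, the relationship between a 3D structure and its 2D contact map can be illustrated with a minimal sketch. This is not the speaker's method, only the generic definition: two residues are marked as in contact when their representative atoms lie within a distance cutoff. The 8 angstrom cutoff on C-alpha atoms is a commonly used convention assumed here, and the function name `contact_map` and the toy coordinates are hypothetical.

```python
import math


def contact_map(coords, threshold=8.0):
    """Return a symmetric 0/1 matrix from per-residue 3D coordinates.

    Entry [i][j] is 1 when residues i and j lie within `threshold`
    angstroms of each other (assumed cutoff; self-contacts are left at 0).
    """
    n = len(coords)
    cmap = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= threshold:
                cmap[i][j] = cmap[j][i] = 1
    return cmap


# Hypothetical C-alpha coordinates (angstroms) for a five-residue toy chain.
toy_coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (7.6, 0.0, 0.0),
    (11.4, 0.0, 0.0),
    (11.4, 3.8, 0.0),
]

for row in contact_map(toy_coords):
    print(" ".join(str(v) for v in row))
```

Going from coordinates to a contact map is the easy direction shown here; the abstract concerns the harder reverse steps, predicting the map from sequence and the structure from the map.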