The mandate of the Human Metabolome Project is
to identify, quantify, catalogue and store all
metabolites that can potentially be found in human tissues and biofluids at
concentrations greater than one micromolar.

We have some preliminary results on chemo-informatics:
learning to predict properties of chemicals
(here, solubility of metabolites).

Oncologists often use MRI scans to identify the tumour region within a brain.
Unfortunately, these scans typically reveal only a portion of the tumour.
This means the obvious treatment --- irradiating all and only the visible tumour
region --- is not effective, as the "occult" tumour will continue to thrive,
to the patient's detriment. Many physicians will therefore irradiate a 2 cm
margin around the visible region; unfortunately, it is not clear that this
enlarged region contains only tumour, nor that it contains all of the tumour.

We are therefore seeking a way to predict the location of the occult tumour
cells. Assuming that the regions that are occult today will become visible
later, we are attempting to learn how the tumour will grow, as a function of
the tumour's location and other properties (size, type).
As training data, we have access to the data (MRI scans, etc.) from 650
previous patients, where each patient visited between 1 and 11 times, often
over several years.
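
As a minimal sketch of how the assumption above could turn a patient's
sequence of scans into training labels: each scan is represented here as a
set of voxel coordinates marked as visible tumour, and any voxel that is
tumour at the next visit but not at the current one is treated as "occult"
at the current visit. The function name and the set-of-voxels representation
are illustrative assumptions, not the project's actual pipeline, and the
sketch ignores scan registration and treatment effects.

```python
def occult_labels(scans):
    """Derive (visible, occult) training pairs from one patient's scans.

    `scans` is a list of sets of voxel coordinates marked as visible tumour,
    ordered by visit date (a hypothetical representation for illustration).
    Under the stated assumption, voxels that become visible at the next
    visit are labelled as occult at the current visit.
    """
    pairs = []
    for current, later in zip(scans, scans[1:]):
        occult = later - current  # visible later, but not visible now
        pairs.append((current, occult))
    return pairs


# A patient with three visits yields two training pairs;
# a patient with a single visit yields none.
example_scans = [
    {(0, 0)},
    {(0, 0), (0, 1)},
    {(0, 0), (0, 1), (1, 1)},
]
example_pairs = occult_labels(example_scans)
```

Patients who visited only once contribute no pairs under this scheme, so the
number of usable examples depends on how many of the 650 patients have
repeat visits.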

The PolyomX Project represents a systematic approach to link the knowledge
gained from the human genome project to healthcare. It was established to
take advantage of Alberta's unique, province-wide cancer registry. This
registry, established in 1975, allows institutes such as the
Cross Cancer Institute
(CCI) to centralize the diagnosis, treatment and follow-up of all of
the province's cancer patients. This centralized cancer registry also allows
the CCI researchers access to thousands of anonymous tumor and blood samples
as well as anonymized patient profiles. The result is a one-of-a-kind tumor
bank that allows large-scale population studies to be performed at a
genome-wide level.

These samples are then analyzed using the most recent advances in genomics,
proteomics, metabolomics and bioinformatics, towards developing a
broad-spectrum molecular analysis of human cancers and their correlations to
certain clinical outcomes. We anticipate that this wealth of information will
lead to major advances in the fight against cancer, eg, helping us to
understand why some patients benefit from a given drug while others suffer
adverse reactions.

To date, we have results on...

predicting which women are likely to develop breast cancer

predicting which men will have long-term bleeding problems after
radiation treatment for prostate cancer

We plan to pursue similar studies with the other types of cancers already
being tumor-banked here: (breast), lung, ovarian,
gastrointestinal/colorectal, lymphoma, prostate and certain leukemias. We will
also extend these analyses to include other patient information, including
gene-expression data (from microarrays) and metabonomic information (eg, based
on NMR-based urinalysis).

Many web recommendation systems direct users to webpages, from a single
website, that other similar users have visited. By contrast, our WebIC web
recommendation system is designed to locate "information content (IC) pages"
--- pages the current user needs to see to complete her task --- from
essentially anywhere on the web. WebIC first extracts the "browsing
properties" of each word encountered in the user's current click-stream ---
eg, how often each word appears in the title of a page in this sequence, or in
the "anchor" of a link that was followed, etc. We have used the data
collected from a set of annotated web logs acquired in a user study, to
produce user- and site-independent rules that identify which words are likely
to appear in an IC-page for a specific session --- eg, of the form:

"any word with the properties
(1) whenever it appears in an anchor, the user tends to follow that anchor,
and
(2) whenever it appears in the title of a page, the user seldom "backs out"
from that page,
tends to be a word that appears in the IC-page".

Notice this rule deals only with how the word appears in the current browsing
session, which means a word can be an IC-word for one session but not for another,
even for the same user. Our empirical results show that the learned
classifier works effectively --- that is, it can accurately identify which
words will appear in the IC-pages.
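
To make the example rule concrete, here is a minimal sketch that computes two
browsing properties per word for a single session and then applies the rule.
The `PageVisit` structure, the function names and the thresholds (0.8 and
0.2) are assumptions made for illustration only; WebIC's actual features and
learned rules are not specified here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PageVisit:
    """One step of a click-stream (hypothetical structure)."""
    title_words: set            # words in the visited page's title
    anchor_words_seen: set      # words in anchors shown on the page
    anchor_words_followed: set  # words in the anchor the user clicked
    backed_out: bool            # True if the user backed out of this page


def browsing_properties(session):
    """Compute per-word follow and back-out rates for one session."""
    seen = defaultdict(int)
    followed = defaultdict(int)
    in_title = defaultdict(int)
    title_backout = defaultdict(int)
    for visit in session:
        for w in visit.anchor_words_seen:
            seen[w] += 1
        for w in visit.anchor_words_followed:
            followed[w] += 1
        for w in visit.title_words:
            in_title[w] += 1
            if visit.backed_out:
                title_backout[w] += 1
    props = {}
    for w in set(seen) | set(in_title):
        props[w] = {
            # None means the property is undefined for this word
            "follow_rate": followed[w] / seen[w] if seen[w] else None,
            "backout_rate": (title_backout[w] / in_title[w]
                             if in_title[w] else None),
        }
    return props


def likely_ic_words(session, min_follow=0.8, max_backout=0.2):
    """Apply the example rule with assumed, hand-picked thresholds."""
    result = set()
    for w, p in browsing_properties(session).items():
        if p["follow_rate"] is None or p["backout_rate"] is None:
            continue  # the rule needs both properties to be defined
        if p["follow_rate"] >= min_follow and p["backout_rate"] <= max_backout:
            result.add(w)
    return result


example_session = [
    PageVisit({"python", "tutorial"}, {"sorting", "ads"}, {"sorting"}, False),
    PageVisit({"sorting", "guide"}, {"ads"}, set(), False),
]
example_ic_words = likely_ic_words(example_session)
```

Because every count is taken within one session, the same word can pass the
rule in one session and fail it in another, which matches the
session-specific behaviour described above.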

We are beginning two new studies --- one to try out our system in the context
of a particular set of e-commerce tasks, and the other, both to collect a
larger set of data of how a wider selection of users browse the web for a
larger set of contexts, and also to explore various ways to use these IC-words
to determine which pages best satisfy the user's information need (based on
the context of her browsing behavior). We will also attempt to find
user-specific rules, to determine if different users have significantly
different mappings from browsing properties to "IC-ness".