Overview

A major component of machine learning and data mining is dealing with high-dimensional data. Typical examples include the pixels of an image (millions of dimensions), medical databases (perhaps hundreds of dimensions, often with missing values), video clips and speech signals (time series data of very high dimension), and gene expression data (expression values of many thousands of genes). Dealing with high-dimensional data is a key challenge for modern computer science.

In this course unit we will consider how to develop appropriate algorithms for modelling and visualizing high-dimensional data sets, and we will gain insight into these algorithms from both theoretical and empirical perspectives. We will show step by step how essential algorithms are derived, and we will use examples and real-world problems to show how they can be applied. This advanced machine learning and data mining course unit covers approaches from machine learning, statistics and neural computation.

Aims

This course unit aims to introduce students to state-of-the-art approaches to dealing with high-dimensional data based on dimensionality reduction, and to provide experience of research activities, such as literature review and the appraisal of research papers, in the modelling and visualization of high-dimensional data. In particular, the course unit highlights transferable knowledge and skills that are essential to original research.

Syllabus

Introduction/Background

Mathematics Basics

Principal component analysis (PCA)

Linear discriminant analysis (LDA)

Self-organising map (SOM)

Multi-dimensional scaling (MDS)

Isometric feature mapping (ISOMAP)

Locally linear embedding (LLE)

Teaching methods

Lectures

three hours per week (5 weeks)

Laboratories

three hours per week (5 weeks)

Feedback methods

In general, feedback is available for the assessed work.

For coursework, individual feedback will be given during on-site marking in the lab.

For the exam, general feedback to the whole class will be given in writing.