Dr. Naomi Harte

ARAS AN PHIARSAIGH

Dr. Harte is an Assistant Professor in Digital Media Systems in the School of Engineering. She was appointed as an SFI Engineering Initiative Lecturer in Digital Media in 2008. Prior to returning to academia, Dr. Harte worked in high-tech start-ups in the field of DSP Systems Development, including her own company founded in 2002. She also previously worked at McMaster University in Canada. Dr. Harte's specialist area is Human Speech Communication. Her industrial background brings a real-world approach to her research. Her work involves the design and application of mathematical algorithms to enhance or augment speech communication between humans and technology. Since her appointment, she has established a strong international reputation in the speech processing community. Dr. Harte's research combines academic excellence with industrial relevance. She has published over 60 peer-reviewed papers in her specialist areas. For the past two years, Dr. Harte has been involved in a major collaboration with Google Chrome and YouTube, leading to multiple patent applications and publications.

Human speech is bimodal in nature. Incorporating visual features in Automatic Speech Recognition systems can improve performance in real environments. This work addresses core challenges in audio-visual speech recognition. It will develop new dynamic visual features that better capture the correlations in key mouth movements used by humans in lipreading. This is crucial in improving Hidden Markov Model performance. It will also explore a new audio-visual fusion strategy, motivated by the differing visibility of visemes, that allows the influence of the audio and video streams to change over time.

Funding Agency

SFI

Programme

RFP

Project Title

Robust Speaker Verification

From

2009

To

2012

Summary

Biometrics involves the use of intrinsic physical or behavioural traits of humans to verify their identity. Traits used in biometrics typically include face, fingerprints, hand geometry, handwriting, iris, retina, vein, and voice. Many are concerned that these technologies are potentially invasive and open to fraud. Speaker verification, using voice or voice and video, has been recognised as an important alternative in the world of biometrics. It is less invasive and requires less expensive installations than iris and fingerprint authentication systems. The changes that occur in the human voice due to ageing have been well documented. The impact of these changes on speaker verification is less clear. In this work, we examine the effect of long-term vocal ageing on speaker verification systems.

Funding Agency

IRCSET

Person Months

36

Project Title

Audio-Visual Fusion for Human Computer Interaction

From

2011

To

2014

Summary

This project will focus on key challenges in Audio-Visual Speech Recognition:

- Given state-of-the-art audio and visual features, do early or late integration strategies work better?
- How well does such an integration scheme translate to less controlled situations, where the speech is less constrained, intonation or prosody is more natural, or the speech is emotionally influenced?
- Can these algorithms work on a real handheld device?
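For readers unfamiliar with the distinction between early and late integration raised in the summary above, the following is a minimal illustrative sketch in Python. It is not the project's method. The function names, the toy scores, and the per-frame audio weight are all assumptions introduced here for illustration. Early integration is shown as simple concatenation of per-frame audio and visual feature vectors. Late integration is shown as a weighted sum of per-class log-likelihoods from two separately scored streams, with a weight that may change from frame to frame.

```python
def early_integration(audio_feats, video_feats):
    """Early (feature-level) integration: join the per-frame feature
    vectors of both streams into one longer vector per frame."""
    if len(audio_feats) != len(video_feats):
        raise ValueError("both streams need the same number of frames")
    return [a + v for a, v in zip(audio_feats, video_feats)]


def late_integration(audio_loglik, video_loglik, audio_weights):
    """Late (decision-level) integration: combine per-class
    log-likelihoods from each stream, frame by frame.

    audio_weights[t] is a hypothetical weight in [0, 1] for the audio
    stream at frame t; the video stream receives 1 - audio_weights[t].
    """
    fused = []
    for a_scores, v_scores, w in zip(audio_loglik, video_loglik, audio_weights):
        if not 0.0 <= w <= 1.0:
            raise ValueError("audio weight must lie in [0, 1]")
        fused.append([w * a + (1.0 - w) * v for a, v in zip(a_scores, v_scores)])
    return fused


def decide(fused_scores):
    """Pick the index of the highest fused score in each frame."""
    return [max(range(len(f)), key=f.__getitem__) for f in fused_scores]


if __name__ == "__main__":
    # Toy data (assumed): two frames, two candidate classes.
    audio_ll = [[-1.0, -3.0], [-1.0, -3.0]]  # audio favours class 0
    video_ll = [[-4.0, -0.5], [-4.0, -0.5]]  # video favours class 1
    weights = [0.9, 0.1]                     # trust audio first, then video
    print(decide(late_integration(audio_ll, video_ll, weights)))
```

In this toy case the fused decision follows the audio stream in the first frame and the video stream in the second, which is the kind of time-varying influence a late integration scheme permits and plain feature concatenation does not.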