
Professor Chin-Hui Lee of the Georgia Institute of Technology Visits the Lab

On October 15, 2016, Professor Chin-Hui Lee of the Georgia Institute of Technology, a former Bell Labs research scientist, internationally renowned senior scholar in speech processing, and IEEE Fellow, visited the lab.
That morning, accompanied by Prof. Lei Xie and Prof. Zhonghua Fu, Professor Lee toured the Shaanxi Provincial Key Laboratory of Speech and Image Information Processing. Prof. Lei Xie highlighted the lab's recent achievements in audio, speech, and language processing and its collaborations with industry, and the two sides held in-depth discussions on topics of mutual interest.
That afternoon, in Lecture Hall 105 of the school building, Professor Lee delivered a talk entitled "A Reverberation-Time-Aware DNN Approach to Speech Dereverberation", presenting his latest research on DNNs for speech enhancement, speech separation, and dereverberation. After the talk, students raised questions on topics of interest, and Professor Lee answered each in detail. The talk was highly rewarding: it gave students a deeper understanding of speech dereverberation methods and of the important role dereverberation plays in speech recognition, and it offered valuable inspiration for their future study and research in related directions.
Speaker bio: Professor Chin-Hui Lee, Georgia Institute of Technology, USA (IEEE Fellow, ISCA Fellow, former senior speech researcher at Bell Labs, recipient of the Bell Labs President's Gold Award, more than 30,000 citations, h-index 65).
Chin-Hui Lee is a professor in the School of Electrical and Computer Engineering, Georgia Institute of Technology. Before joining academia in 2001, he had 20 years of industrial experience, ending at Bell Laboratories, Murray Hill, as a Distinguished Member of Technical Staff and Director of the Dialogue Systems Research Department. Dr. Lee is a Fellow of the IEEE and a Fellow of ISCA. He has published over 450 papers and holds 30 patents, with close to 30,000 citations and an h-index of 65 on Google Scholar. He has received numerous awards, including the Bell Labs President's Gold Award in 1998. He won the IEEE Signal Processing Society's 2006 Technical Achievement Award for "Exceptional Contributions to the Field of Automatic Speech Recognition". In 2012 he gave an ICASSP plenary talk on the future of speech recognition. In the same year he was awarded the ISCA Medal for Scientific Achievement for "pioneering and seminal contributions to the principles and practice of automatic speech and speaker recognition". See http://chl.ece.gatech.edu/ for details.
Abstract:
We cast the classical speech dereverberation problem into a regression setting by mapping log power spectral features of reverberant speech to time-delayed features of anechoic speech. Depending on the reverberation time of the acoustic environment, we found that different signal processing parameters are needed to deliver good quality for the dereverberated speech. Furthermore, reverberation-time-aware DNN training and decoding procedures can be designed to optimize the dereverberation performance across a wide range of reverberation times. In addition, a single DNN can also be trained to perform simultaneous beamforming and dereverberation for microphone array speech. Finally, as a side benefit, using DNN-based speech dereverberation as a pre-processor in the REVERB Challenge automatic speech recognition (ASR) task, we obtained the lowest word error rate without retraining either the dereverberation front-end or the ASR back-end. It is expected that ASR accuracy and robustness could be further improved with joint training of an integrated dereverberation-ASR system.
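The regression setting described above can be illustrated with a minimal sketch: a small feed-forward network is trained by gradient descent on a mean-squared-error loss to map a context window of log-power-spectral (LPS) features of reverberant speech to the corresponding clean-speech LPS frame. This is not the authors' implementation; the feature dimensions, network size, and the synthetic random data standing in for real reverberant/anechoic feature pairs are all illustrative assumptions (real systems typically use hundreds of frequency bins, wider context, and much larger networks).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) sizes; real LPS features are much larger.
DIM = 8            # LPS bins per frame (toy value)
CONTEXT = 2        # context frames on each side of the current frame
IN_DIM = DIM * (2 * CONTEXT + 1)
HIDDEN = 16        # toy hidden-layer width
N = 128            # number of training frames

# Synthetic stand-ins for (reverberant context window, clean frame) pairs.
# The clean frame is made a noisy linear function of the window so the
# regression target is actually learnable.
X = rng.standard_normal((N, IN_DIM)).astype(np.float32)
W_true = (0.1 * rng.standard_normal((IN_DIM, DIM))).astype(np.float32)
Y = X @ W_true + 0.01 * rng.standard_normal((N, DIM)).astype(np.float32)

# One ReLU hidden layer, linear output (the estimated clean LPS frame).
W1 = (rng.standard_normal((IN_DIM, HIDDEN)) * np.sqrt(2.0 / IN_DIM)).astype(np.float32)
b1 = np.zeros(HIDDEN, dtype=np.float32)
W2 = (rng.standard_normal((HIDDEN, DIM)) * np.sqrt(2.0 / HIDDEN)).astype(np.float32)
b2 = np.zeros(DIM, dtype=np.float32)

def forward(x):
    h = np.maximum(x @ W1 + b1, 0.0)   # ReLU hidden activations
    return h, h @ W2 + b2              # linear output layer

lr = 0.5
losses = []
for step in range(300):
    h, pred = forward(X)
    err = pred - Y
    losses.append(float(np.mean(err ** 2)))
    # Backpropagation for the MSE loss (mean over all elements).
    g_out = 2.0 * err / err.size
    gW2 = h.T @ g_out
    gb2 = g_out.sum(axis=0)
    g_h = (g_out @ W2.T) * (h > 0)     # ReLU gradient mask
    gW1 = X.T @ g_h
    gb1 = g_h.sum(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

print(f"MSE: {losses[0]:.4f} -> {losses[-1]:.4f}")
```

At inference time, such a front-end would slide the context window over the reverberant utterance, predict each clean LPS frame, and resynthesize a waveform using the reverberant phase; the reverberation-time-aware variant in the talk additionally conditions the training and decoding choices on the estimated reverberation time.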