Author: khiet

Had a good time at the Lorentz workshop on Collecting, Annotating & Analyzing Video Data, held in Leiden from 30 Oct 2017 through 3 Nov 2017. There were interesting discussions about how to improve existing annotation tools for the analysis of (social) interactions.

Interspeech 2017 was held this year in Stockholm. It was great seeing everybody again. Jaebok Kim presented our paper on speech emotion recognition “in the wild” using multi-task learning. In our research field, social signal processing, rich big data is key. In general, we see that it is becoming more difficult for academia to compete with multinationals such as Google, Amazon, and Apple in terms of technology development and data acquisition. More funding should become available for academia to collect data that is high in both quality and quantity. I have been discussing with colleagues the impact that the rise of deep learning has had on our language and speech technology research in the past few years: recognition performance has improved remarkably. But this also gives us some food for thought. For example, what are the scientific questions in language and speech (technology) research that we are answering with these new methods, and how can we explain the results that we have obtained with them?