Joseph (Yossi) Keshet

I am a professor in the Department of Computer Science at Bar-Ilan University. My research interests span machine learning and the computational study of human speech and language.
In machine learning my research is focused on deep learning and structured prediction, while my research on speech and language is focused on speech processing, speech recognition, acoustic phonetics, and pathological speech.

My technological goal is to improve the state of the art in applications such as automatic speech recognition, speech indexing and retrieval, acoustic scene analysis, and language understanding. My scientific goal is to contribute to research in human speech communication, phonetics, and medical speech pathology using data-driven methods. I believe that exploiting the structure of language and designing theoretically well-founded statistical machine learning algorithms, tailored to particular tasks and able to make use of large datasets, can solve the complex problems involved in speech and language research. To a great extent, my research interests lie in interdisciplinary areas combining speech science, machine learning, and linguistics, and I therefore collaborate regularly with colleagues from those fields.

News

Together with Benny Pinkas I organized a second focus day on Deep Learning and Security. See details here.

Together with Benny Pinkas I organized a focus day on Deep Learning and Security. See details here.

Our joint work with Moustapha Cissé and Natalia Neverova from Facebook on fooling structured deep learning models attracted attention and was covered in New Scientist and MIT Technology Review.

Speech, Language and Deep Learning Lab

The research in the lab focuses on statistical and machine learning techniques applied to the modeling and processing of speech and language. A typical problem in speech and language processing involves a very large number of training examples, is sequential and highly structured, and has its own unique measure of performance. The lab's goal is to develop rigorous statistical and machine learning algorithms that maximize performance by matching the internal structure of the problem and by directly optimizing its unique measure of performance.
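As a rough illustration of this idea (not code from the lab), directly optimizing a task-specific measure is often approached via a cost-augmented structured hinge loss, where the required margin between the correct structure and a competing one grows with a task cost such as the Hamming distance between label sequences. All names and values below are illustrative:

```python
import numpy as np

def hamming_cost(y_true, y_pred):
    """Task-specific cost: fraction of sequence positions that disagree."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    return float(np.mean(y_true != y_pred))

def structured_hinge_loss(score_true, score_pred, y_true, y_pred):
    """Cost-augmented structured hinge loss:
        max(0, cost(y_true, y_pred) + score(y_pred) - score(y_true)).
    Minimizing it pushes the model to score the correct structure above
    any competitor by a margin at least as large as the task cost."""
    return max(0.0, hamming_cost(y_true, y_pred) + score_pred - score_true)

# Hypothetical scores for a correct and a competing label sequence:
loss = structured_hinge_loss(1.0, 1.2, [1, 2, 3, 3], [1, 2, 3, 1])
```

Here the competing sequence disagrees in one of four positions (cost 0.25) and is scored higher than the correct one, so the loss is positive and the learner is penalized in proportion to the task cost.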

Resources and Code

The lab is committed to reproducible results. The GitHub repository gives you access to our code and tools, along with information on how to set them up and use them. Deep Phonetic Tools is a project done in collaboration with Matt Goldrick and Emily Cibelli, in which we proposed a set of phonetic tools for measuring VOT, vowel duration, word duration, and formants, all based on deep learning.

2012

Rohit Prabhavalkar, Joseph Keshet, Karen Livescu and Eric Fosler-Lussier, Discriminative Spoken Term Detection with Limited Data, 2nd Symposium on Machine Learning in Speech and Language Processing, Portland, Oregon, 2012. A description of the utterances used in our experiments can be downloaded from here.

Joseph Keshet, A Proposal for a Kernel-based Algorithm for Large Vocabulary Continuous Speech Recognition in Automatic Speech and Speaker Recognition: Large Margin and Kernel Methods, Joseph Keshet and Samy Bengio, Eds., John Wiley & Sons, August 2008.

Shai Shalev-Shwartz, Joseph Keshet and Yoram Singer, Learning to Align Polyphonic Music, The 5th International Conference on Music Information Retrieval (ISMIR), Barcelona, Spain, 2004. The long version includes the proof of the theorem. Here you can find the dataset.

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.