Wednesday, December 16, 2015

OpenAI, a new non-profit AI research initiative, was announced this week. OpenAI's research is headed by Ilya Sutskever, one of the best-known deep learning researchers. Based on Ilya's background, it seems they will focus in the first stage on deep learning technologies.

Monday, December 14, 2015

I asked my colleague and friend Yishay Carmiel, head of Spoken Innovation Labs, to give me a quick briefing about the state of the art in deep learning for speech recognition and NLP.

Speech Recognition

Speech recognition was the first application where DL made a serious impact. The current state-of-the-art approach in speech recognition is called CTC ("Connectionist Temporal Classification"). It is an end-to-end neural network approach that handles both the state classification and the temporal alignment, tasks where the HMM was previously used. In addition, bidirectional LSTMs are a very hot topic.
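To make the CTC idea a bit more concrete, its output rule maps many frame-level label paths to one transcription by merging repeated labels and dropping a special blank symbol. Here is a toy sketch of that collapse step (the blank symbol and example labels are my own illustration, not taken from any specific toolkit):

```python
# Toy illustration of CTC's many-to-one label collapse:
# merge consecutive repeats, then drop the blank symbol.
BLANK = "-"

def ctc_collapse(frame_labels):
    """Map a per-frame label sequence to its CTC output string."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            out.append(label)
        prev = label
    return "".join(out)

# Several different frame-level paths map to the same word "cat":
print(ctc_collapse(list("cc-aa-t")))  # cat
print(ctc_collapse(list("c-att--")))  # cat
```

Training with CTC sums the probabilities of all such paths for the correct transcription, which is what lets the network learn alignment and classification jointly.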

Speech recognition is a field that has been researched for more than 40 years. Building a speech recognition system is a huge algorithmic and engineering task, so it is very hard to point to 3 specific research papers that cover the whole topic. A nice paper I can refer to is "Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups", a 2012 paper published jointly by four research groups on the huge impact DL has made on speech recognition. By now this is "old stuff"; the technology is moving forward at a blazing pace. I also think the book by top Microsoft researchers, "Deep Learning: Methods and Applications", is a good place to understand what's going on.

I do not know of any good video lectures on deep learning for speech; there might be some, but to be honest I have not looked for a long time.

There are 2 well-known open-source toolkits in speech recognition:

(i) Sphinx – an open-source toolkit by CMU, quite easy to work with for getting started with speech recognition. However, as far as I know, the downside is that it does not have deep learning support – only GMM-based models.

(ii) Kaldi – by far the most advanced open-source toolkit in this field. It has all the latest technologies, including state-of-the-art deep learning models, WFST-based search, advanced language model building techniques, and the latest speaker adaptation techniques. Kaldi is not a plug-and-play program; it takes a lot of time to get a good understanding of how to use it and adapt it to your needs.
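For context on the GMM-based models mentioned for Sphinx: a GMM acoustic model scores each feature frame with a weighted mixture of Gaussians per HMM state. A minimal sketch of that per-frame likelihood, with made-up scalar parameters (real systems use multi-dimensional diagonal-covariance Gaussians):

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of a 1-D feature value under a Gaussian mixture.
    Toy version: scalar features; real acoustic models score whole
    feature vectors per HMM state."""
    total = 0.0
    for w, mu, var in zip(weights, means, variances):
        total += w * math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    return math.log(total)

# Toy 2-component mixture; frames near a component mean score higher.
ll_near = gmm_log_likelihood(0.9, [0.6, 0.4], [1.0, -1.0], [0.5, 0.5])
ll_far = gmm_log_likelihood(5.0, [0.6, 0.4], [1.0, -1.0], [0.5, 0.5])
```

In a DL-based system, a neural network replaces exactly this per-frame scoring step.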

Natural Language Processing:

Natural Language Processing is a broad field with a lot of applications, so it is hard to point to a single DL approach. Right now, word representations and document/sentence representations using RNNs are the secret sauce for building better models. In addition, many NLP tasks are based on some kind of sequence-to-sequence mapping, so LSTM techniques give a nice boost there. I also think memory networks will have an interesting impact in the future.
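To make the word-representation idea concrete: words become dense vectors, and semantic similarity falls out as the cosine similarity between them. A toy sketch (the 3-d vectors below are made up for illustration; a real model like word2vec or GloVe learns hundreds of dimensions from large corpora):

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Made-up toy embeddings, chosen so related words point the same way.
vectors = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.8, 0.9, 0.1],
    "apple": [0.1, 0.2, 0.9],
}
print(cosine(vectors["king"], vectors["queen"]))  # high (near 1)
print(cosine(vectors["king"], vectors["apple"]))  # low
```

Downstream models then consume these vectors instead of one-hot words, which is where much of the quality boost comes from.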

DL is a tool for building better NLP technologies, so I assume the big companies are applying these techniques to improve their product quality. Good examples are Google's semantic search and IBM Watson.

Since NLP is a broad field with a variety of applications, it's very hard to point to a single source. I think Google's TensorFlow offers a variety of interesting material, although it is not easy to work with. For word/document representation there are Google's word2vec code, gensim, and Stanford's GloVe.

Yishay Carmiel Short Bio

Yishay is the head of Spoken Labs, a big data analytics unit that implements bleeding-edge deep learning and machine learning technologies for speech recognition, computer vision and data analysis. He has 15 years' experience as an algorithm scientist and technology leader, has worked on building large-scale machine learning algorithms, and has served as a deep learning expert. Yishay and his team are working on bleeding-edge technologies in artificial intelligence, deep learning and large-scale data analysis.

Saturday, December 12, 2015

As you may know, I really like probabilistic graphical models, and my PhD thesis focused on Gaussian Belief Propagation. My colleague Alon Palombo has recently implemented a graphical model inference toolkit on top of GraphLab Create. We are looking for academic researchers or companies who would like to try it out.

About Me

6 years ago, along with my collaborators at Carnegie Mellon University, I started the GraphLab large-scale open source project, a framework for implementing machine learning algorithms in parallel and distributed settings. When the project became popular, we decided to raise money to expand the project and provide an industry-grade solution.
Specifically, I wrote the award-winning collaborative filtering toolkit for GraphLab, which is widely deployed today and helped us win top places at the ACM KDD Cup 2011 and ACM KDD Cup 2012, among other competitions.
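The collaborative filtering approach behind such toolkits is often matrix factorization trained by SGD: learn user and item latent factors whose dot product approximates the observed ratings. A generic sketch of that idea (this is my own illustration, not GraphLab's actual code; all parameter values are arbitrary):

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.02, reg=0.02, epochs=500):
    """Fit user/item factors so that dot(P[u], Q[i]) approximates rating r
    for each observed (u, i, r) triple, via stochastic gradient descent."""
    random.seed(0)
    P = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[random.uniform(-0.1, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(P[u][f] * Q[i][f] for f in range(k))
            err = r - pred
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)  # gradient step with L2 penalty
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

# Toy data: user 0 likes item 0, user 1 likes item 1.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 5.0)]
P, Q = factorize(ratings, n_users=2, n_items=2)
pred_00 = sum(P[0][f] * Q[0][f] for f in range(2))  # recovers user 0's preference for item 0
```

Predictions for unobserved (user, item) pairs come from the same dot product, which is what makes the factorization useful for recommendation.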
Check out our website: http://dato.com