Deep Multimodal Semantic Embeddings for Speech and Images, MIT Computer Science and Artificial Intelligence Laboratory. The authors develop a model that learns spoken words associated with images using a pair of CNNs, along with an alignment and embedding model. This helps image search and annotation tasks.

Playfair Capital is actively seeking out entrepreneurs building companies that change the way we live, work and play using deep learning, machine learning, artificial intelligence, computer vision, speech and NLP.