AI and deep learning are electrifying the computing industry and will soon transform corporate America. This is an excellent non-technical overview of what Deep Learning is and how it came to be.

Amazon, DeepMind, Google, Facebook, IBM, and Microsoft just established the Partnership on AI. Its goal is to study and formulate best practices on AI technologies, to advance the public’s understanding of AI, and to serve as an open platform for discussion and engagement about AI.

Now you can finally train your Deep Learning models in the cloud. Amazon’s new EC2 P2 instance type incorporates up to 8 NVIDIA Tesla K80 Accelerators, each running a pair of NVIDIA GK210 GPUs. Pricing ranges from $0.90/h (1 GPU) to $14.40/h (16 GPUs). Cloud is nice, but at this price point it seems like most people would be better off investing $1-2k into buying their own GPUs.
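A quick back-of-the-envelope sketch of the cloud-vs-buy arithmetic (the $1,500 hardware cost is a hypothetical figure, not from the announcement):

```python
# Break-even estimate: renting a single cloud GPU vs. buying your own.
# The $1,500 hardware cost is a hypothetical figure for illustration.
cloud_rate = 0.90        # USD per hour for the 1-GPU instance
hardware_cost = 1500.00  # USD, hypothetical one-time purchase

break_even_hours = hardware_cost / cloud_rate
print(f"Break-even after {break_even_hours:.0f} hours "
      f"(~{break_even_hours / 24:.0f} days of continuous training)")
# -> Break-even after 1667 hours (~69 days of continuous training)
```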

Artificial intelligence is all the rage in tech circles. As always, the reality of building a startup is different, especially when one aims to build a self-standing company for the long term. Great advice on why building an AI startup isn’t as easy as the press makes it out to be.

The Google Neural Machine Translation system (GNMT) achieves the largest improvements to date in machine translation quality. It’s now used by Google Translate for Chinese-to-English translation. Read the full paper here.

YouTube-8M is a large-scale labeled video dataset that consists of 8 million YouTube video IDs and associated labels from a diverse vocabulary of 4800 visual entities. It also comes with precomputed state-of-the-art vision features from billions of frames.
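The features ship as TensorFlow records. A minimal sketch of reading one video-level record follows, using the TF 1.x-style API; the shard file name and the feature keys ("video_id", "labels", "mean_rgb") are assumptions to be checked against the dataset documentation:

```python
import tensorflow as tf

# Sketch: parse one video-level YouTube-8M record. The shard name and
# feature keys below are assumptions; check the dataset docs.
record_path = "train-00000-of-04096.tfrecord"  # hypothetical shard name

for serialized in tf.python_io.tf_record_iterator(record_path):
    example = tf.train.Example.FromString(serialized)
    feats = example.features.feature
    video_id = feats["video_id"].bytes_list.value[0]      # assumed key
    labels = list(feats["labels"].int64_list.value)       # assumed key
    mean_rgb = list(feats["mean_rgb"].float_list.value)   # assumed key
    print(video_id, labels[:5], len(mean_rgb))
    break
```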

DeepBench is an open source benchmarking tool that measures the performance of basic operations involved in training deep neural networks, as executed on different hardware platforms using neural network libraries. DeepBench is available as a repository on GitHub.
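In the same spirit, here is a toy illustration of benchmarking one such basic operation (a single-precision matrix multiply) in NumPy; DeepBench’s actual harness targets vendor libraries like cuDNN and MKL rather than this setup:

```python
import time
import numpy as np

# Toy benchmark of a single GEMM, the kind of basic operation DeepBench
# measures (DeepBench itself runs against vendor libraries, not NumPy).
m, n, k = 1024, 1024, 1024
a = np.random.rand(m, k).astype(np.float32)
b = np.random.rand(k, n).astype(np.float32)

a @ b  # warm-up run

runs = 10
start = time.perf_counter()
for _ in range(runs):
    a @ b
elapsed = (time.perf_counter() - start) / runs
gflops = 2 * m * n * k / elapsed / 1e9
print(f"{elapsed * 1e3:.2f} ms per GEMM, ~{gflops:.1f} GFLOP/s")
```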

Google’s Neural Machine Translation system consists of a deep LSTM network with 8 encoder and 8 decoder layers using attention and residual connections. It reduces translation errors by an average of 60% compared to Google’s phrase-based production system. TL;DR: This paper is a bag of tricks for implementing NMT in practice.
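A schematic of the residual stacking idea in plain NumPy (this is not the authors’ code; the lstm_layer below is a simplified stand-in for a real LSTM, and all sizes are arbitrary):

```python
import numpy as np

def lstm_layer(h, w):
    """Stand-in for one recurrent layer; a real implementation would be
    a full LSTM. Here it is just a nonlinear transform of the states."""
    return np.tanh(h @ w)

def residual_stack(x, layer_weights):
    """GNMT-style residual stacking: each layer's output is added to its
    input, which makes very deep (e.g. 8-layer) stacks easier to train."""
    h = x
    for w in layer_weights:
        h = lstm_layer(h, w) + h  # residual (skip) connection
    return h

# Toy usage: a sequence of 10 steps with 512-dim states, 8 stacked layers.
seq = np.random.randn(10, 512).astype(np.float32)
weights = [np.random.randn(512, 512).astype(np.float32) * 0.01
           for _ in range(8)]
print(residual_stack(seq, weights).shape)  # (10, 512)
```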

The pointer sentinel mixture architecture for neural sequence models can either reproduce a word from the recent context or produce a word from a standard softmax classifier. It achieves state-of-the-art language modeling performance on the Penn Treebank (70.9 perplexity) while using fewer parameters than a standard softmax LSTM. By Stephen Merity, Caiming Xiong, James Bradbury, Richard Socher (MetaMind)
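The core mechanism is a learned sentinel gate g that mixes a pointer distribution over the recent context with the usual softmax over the vocabulary: p(w) = g * p_vocab(w) + (1 - g) * p_ptr(w). A minimal sketch with toy inputs (in the model, both distributions and the gate come from the LSTM, not hand-set values):

```python
import numpy as np

def pointer_sentinel_mixture(p_vocab, p_ptr, g):
    """Mix the softmax vocabulary distribution with a pointer distribution
    over the recent context: p(w) = g * p_vocab(w) + (1 - g) * p_ptr(w)."""
    return g * p_vocab + (1 - g) * p_ptr

# Toy example: 5-word vocabulary; the pointer puts most mass on a rare
# word (index 3) that appeared in the recent context.
p_vocab = np.array([0.40, 0.30, 0.15, 0.05, 0.10])  # softmax output
p_ptr   = np.array([0.00, 0.05, 0.00, 0.95, 0.00])  # pointer over context
g = 0.6  # sentinel gate: how much to trust the vocabulary softmax

p = pointer_sentinel_mixture(p_vocab, p_ptr, g)
print(p, p.sum())  # still a valid distribution (sums to 1.0)
```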

This work explores hypernetworks: an approach of using a small network, also known as a hypernetwork, to generate the weights for a larger network. Hypernetworks provide an abstraction that is similar to what is found in nature: the relationship between a genotype (the hypernetwork) and a phenotype (the main network). By David Ha, Andrew Dai, Quoc V. Le (Google Brain)
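A minimal sketch of the idea (the layer sizes, embedding dimension, and linear hypernetwork are arbitrary illustrative choices, not the paper’s configuration):

```python
import numpy as np

# Sketch of a hypernetwork: a small network maps a per-layer embedding
# (the "genotype") to the weight matrix of a main-network layer (the
# "phenotype"). All sizes are arbitrary, not the paper's configuration.
embed_dim, in_dim, out_dim = 4, 16, 16

# The hypernetwork's own (trainable) parameters.
W_hyper = np.random.randn(embed_dim, in_dim * out_dim) * 0.01

def generate_weights(z):
    """Map a small layer embedding z to a full in_dim x out_dim matrix."""
    return (z @ W_hyper).reshape(in_dim, out_dim)

def main_network(x, layer_embeddings):
    """Main network whose per-layer weights come from the hypernetwork."""
    h = x
    for z in layer_embeddings:
        h = np.tanh(h @ generate_weights(z))
    return h

# Toy usage: 3 layers, each described by its own 4-dim embedding.
embeddings = [np.random.randn(embed_dim) for _ in range(3)]
x = np.random.randn(1, in_dim)
print(main_network(x, embeddings).shape)  # (1, 16)
```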

Emoji2vec: Pre-trained embeddings for all Unicode emoji, learned from their descriptions in the Unicode emoji standard. The resulting emoji embeddings can be readily used in downstream social natural language processing applications alongside word2vec. By Ben Eisner, Tim Rocktäschel, Isabelle Augenstein, Matko Bošnjak, Sebastian Riedel.
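Since the emoji vectors are trained to live in the same 300-dimensional space as the Google News word2vec vectors, using them can be as simple as loading both sets with gensim; the file names below are placeholders to be checked against the respective releases:

```python
from gensim.models import KeyedVectors

# Load pre-trained word2vec and emoji2vec vectors. Both file names are
# hypothetical placeholders for whatever the releases actually ship.
words = KeyedVectors.load_word2vec_format(
    "GoogleNews-vectors-negative300.bin", binary=True)
emoji = KeyedVectors.load_word2vec_format("emoji2vec.bin", binary=True)

# The spaces are shared, so cross-vocabulary similarity queries work.
print(emoji.similar_by_vector(words["party"], topn=5))
```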