We explore the properties of byte-level recurrent language models. When given
sufficient amounts of capacity, training data, and compute time, the
representations learned by these models include disentangled features
corresponding to high-level concepts. Specifically, we find a single unit that
performs sentiment analysis. These representations, learned in an unsupervised
manner, achieve state-of-the-art accuracy on the binary subset of the Stanford
Sentiment Treebank. They are also highly data-efficient: when given only a
handful of labeled examples, our approach matches the performance of strong
baselines trained on full datasets. We also demonstrate that the sentiment unit
has a direct influence on the generative process of the model. Simply fixing its
value to be positive or negative generates samples with the corresponding
positive or negative sentiment.
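
The abstract compresses two concrete techniques, so a hedged sketch may help: a linear probe fit on the model's unsupervised features, and sampling with the sentiment unit's value clamped. Everything below is an illustrative stand-in, not the released code. The paper trains a 4096-unit multiplicative LSTM, approximated here with a plain `nn.LSTMCell`; `SENTIMENT_UNIT = 2388` is the unit index the paper reports for its trained model and would differ for any other model.

```python
# Hypothetical sketch: byte-level LSTM stand-in, a logistic-regression probe
# on its cell states, and sampling with the sentiment unit clamped.
import torch
import torch.nn as nn
from sklearn.linear_model import LogisticRegression

HIDDEN = 4096           # the paper's hidden size (its model is an mLSTM)
SENTIMENT_UNIT = 2388   # unit index reported in the paper; model-specific

class ByteLSTM(nn.Module):
    def __init__(self, hidden=HIDDEN, emb=64):
        super().__init__()
        self.embed = nn.Embedding(256, emb)   # one embedding row per byte value
        self.cell = nn.LSTMCell(emb, hidden)  # stand-in for the paper's mLSTM
        self.head = nn.Linear(hidden, 256)    # next-byte logits

    def step(self, byte, state):
        h, c = self.cell(self.embed(byte), state)
        return self.head(h), (h, c)

@torch.no_grad()
def encode(model, text):
    """Run the model over a string; return the final cell state as features."""
    h, c = torch.zeros(1, HIDDEN), torch.zeros(1, HIDDEN)
    for b in text.encode("utf-8"):
        _, (h, c) = model.step(torch.tensor([b]), (h, c))
    return c.squeeze(0).numpy()

def fit_probe(model, texts, labels):
    """L1-regularized logistic regression on the unsupervised features,
    mirroring the paper's linear-probe evaluation."""
    feats = [encode(model, t) for t in texts]
    return LogisticRegression(penalty="l1", solver="liblinear").fit(feats, labels)

@torch.no_grad()
def sample(model, n_bytes=200, sentiment=None):
    """Generate bytes; if `sentiment` is set, clamp the sentiment unit's
    cell state to that value at every step to steer the sample's tone."""
    h, c = torch.zeros(1, HIDDEN), torch.zeros(1, HIDDEN)
    byte = torch.tensor([ord("\n")])
    out = []
    for _ in range(n_bytes):
        logits, (h, c) = model.step(byte, (h, c))
        if sentiment is not None:
            c[0, SENTIMENT_UNIT] = sentiment
        byte = torch.multinomial(torch.softmax(logits, dim=-1), 1).squeeze(1)
        out.append(byte.item())
    return bytes(out).decode("utf-8", errors="replace")
```

Calling `sample(model, sentiment=+1.0)` versus `sentiment=-1.0` corresponds to the experiment of fixing the unit's value to steer generated text positive or negative; on an untrained stand-in like this one, the samples will of course be noise.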
