We propose a simple duality between this dense associative
memory and neural networks commonly used in deep learning. The
proposed duality makes it possible to apply energy-based intuition from associative memory to analyze
computational properties of neural networks with unusual activation functions – the higher
rectified polynomials, which until now have not been used for training neural networks.
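
As a rough illustration of the energy-based view, here is a minimal NumPy sketch of a dense associative memory with a rectified polynomial interaction function F(x) = max(x, 0)^n. It assumes binary (+1/-1) states and the energy E = -sum_mu F(xi_mu . sigma). The function names, the degree n = 3, and the toy sizes are illustrative choices, not taken from the abstract.

```python
import numpy as np

def rectified_poly(x, n=3):
    """F(x) = max(x, 0)**n, a rectified polynomial of degree n."""
    return np.maximum(x, 0.0) ** n

def energy(state, patterns, n=3):
    """E = -sum over stored patterns of F(pattern . state)."""
    return -np.sum(rectified_poly(patterns @ state, n))

def recall(state, patterns, n=3, sweeps=5):
    """Asynchronous updates: set each bit to whichever sign gives lower energy."""
    state = state.copy()
    for _ in range(sweeps):
        for i in range(state.size):
            # Overlap of every pattern with the state, excluding bit i.
            rest = patterns @ state - patterns[:, i] * state[i]
            # Energy difference between bit i = -1 and bit i = +1.
            drive = np.sum(rectified_poly(rest + patterns[:, i], n)
                           - rectified_poly(rest - patterns[:, i], n))
            state[i] = 1 if drive >= 0 else -1
    return state

# Toy usage (hypothetical sizes): store 20 random patterns of 64 bits,
# corrupt 8 bits of the first one, then run the update rule.
rng = np.random.default_rng(0)
patterns = rng.choice([-1, 1], size=(20, 64))
noisy = patterns[0].copy()
noisy[rng.choice(64, size=8, replace=False)] *= -1
restored = recall(noisy, patterns)
```

Each single-bit update picks the lower-energy sign, so the energy never increases along the trajectory. Whether the exact stored pattern is recovered depends on the load and the noise level.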

The
proposed method is related to holographic models of associative
memory in that it employs circular correlation to create
compositional representations. By using correlation as the
compositional operator, HOLE can capture rich interactions
but simultaneously remains efficient to compute, easy to train,
and scalable to very large datasets.
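
For reference, circular correlation of two d-dimensional vectors can be computed in O(d log d) with the FFT. The sketch below checks the FFT form against the direct definition and shows one way a triple score could be formed from it. The names `e_s`, `r_p`, `e_o` and the sigmoid scoring function are assumptions for illustration and are not stated in the text above.

```python
import numpy as np

def circular_correlation(a, b):
    """[a corr b]_k = sum_i a_i * b_{(k + i) mod d}, via FFT (real inputs)."""
    return np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)).real

def circular_correlation_naive(a, b):
    """Direct O(d^2) evaluation of the same definition, for checking."""
    d = len(a)
    return np.array([sum(a[i] * b[(k + i) % d] for i in range(d))
                     for k in range(d)])

def triple_score(e_s, r_p, e_o):
    """Assumed scoring: sigmoid of the relation vector dotted with the
    circular correlation of subject and object embeddings."""
    return 1.0 / (1.0 + np.exp(-(r_p @ circular_correlation(e_s, e_o))))

rng = np.random.default_rng(0)
e_s, r_p, e_o = rng.normal(size=(3, 8))
score = triple_score(e_s, r_p, e_o)
```

Unlike circular convolution, correlation is not commutative, so swapping subject and object generally changes the score. That matters for asymmetric relations.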

We propose an extended model of active memory that matches existing attention models on neural machine translation and generalizes better to longer sentences. We investigate this model and explain why previous active memory models did not succeed. Finally, we discuss when active memory brings most benefits and where attention can be a better choice.

https://arxiv.org/pdf/1804.01756.pdf The Kanerva Machine: A Generative Distributed Memory
We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them. Inspired by Kanerva's sparse distributed memory, it has a robust distributed reading and writing mechanism. The memory is analytically tractable, which enables optimal on-line compression via a Bayesian update-rule. We formulate it as a hierarchical conditional generative model, where memory provides a rich data-dependent prior distribution. Consequently, the top-down memory and bottom-up perception are combined to produce the code representing an observation. Empirically, we demonstrate that the adaptive memory significantly improves generative models trained on both the Omniglot and CIFAR datasets. Compared with the Differentiable Neural Computer (DNC) and its variants, our memory model has greater capacity and is significantly easier to train.
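
To make the idea of an analytically tractable Bayesian memory update concrete, here is a generic linear-Gaussian (Kalman-style) sketch. It assumes the memory is a K x C matrix with a Gaussian belief (mean `R`, row covariance `U`), and that an observed code `z` is read out as `w^T M` plus isotropic noise. All names and the noise model are assumptions made for illustration; this is not presented as the paper's implementation.

```python
import numpy as np

def bayes_memory_update(R, U, w, z, noise_var=1.0):
    """One online conjugate update of a Gaussian belief over memory.

    R: (K, C) posterior mean, U: (K, K) row covariance,
    w: (K,) addressing weights, z: (C,) observed code.
    Assumed observation model: z = w^T M + Gaussian noise.
    """
    delta = z - w @ R                   # prediction error, shape (C,)
    sigma_c = U @ w                     # cross-covariance, shape (K,)
    sigma_z = w @ U @ w + noise_var     # predictive variance (scalar)
    gain = sigma_c / sigma_z            # Kalman-style gain, shape (K,)
    R_new = R + np.outer(gain, delta)   # move the mean toward the observation
    U_new = U - np.outer(gain, sigma_c) # shrink uncertainty along w
    return R_new, U_new

# Toy usage with hypothetical sizes: K = 4 memory rows, C = 3 code dimensions.
rng = np.random.default_rng(0)
K, C = 4, 3
R0, U0 = np.zeros((K, C)), np.eye(K)
w, z = rng.normal(size=K), rng.normal(size=C)
R1, U1 = bayes_memory_update(R0, U0, w, z)
```

After one update, the read-out `w @ R1` is closer to `z` than `w @ R0` was, and the trace of the covariance shrinks. This is the sense in which each write compresses new data into the memory.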

https://github.com/jgpavez/Working-Memory-Networks The Working Memory Network is a Memory Network architecture with a novel working memory storage and relational reasoning module. The model retains the relational reasoning abilities of the Relation Network while considerably reducing its computational complexity. The model achieves state-of-the-art performance on the jointly trained bAbI-10k dataset, with an average error of less than 0.5%.