This paper proposes a way to combine historical prices and text data for financial time series prediction. The authors call the model StockNet. There appear to be two major contributions: (a) jointly encoding market data and text data, and (b) a generative model inspired by the variational autoencoder (VAE).

TLDR

An RNN-based variational autoencoder, combined with attention, is used to predict whether a stock's price will go up or down.

Market Information Encoder (MIE)

This component is relatively straightforward. Tweets for the given day are combined into the vector $c_t$. Historical prices are normalized and stored in the vector $p_t$. The output of this component (MIE) is the vector $x_t = [c_t, p_t]$, the concatenation of the two.
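The encoding step can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the tweet vectors are taken as given and simply averaged (a stand-in for whatever learned text encoder the model uses), prices are normalized as relative change against the previous close, and the names `average_embeddings`, `normalize_prices`, and `encode_market_information` are hypothetical.

```python
from typing import List, Sequence


def average_embeddings(tweet_vectors: Sequence[Sequence[float]]) -> List[float]:
    """Stand-in corpus embedding: element-wise mean of the day's tweet vectors.

    Assumption: a plain average replaces the learned text encoder.
    """
    if not tweet_vectors:
        raise ValueError("need at least one tweet vector")
    dim = len(tweet_vectors[0])
    count = len(tweet_vectors)
    return [sum(vec[i] for vec in tweet_vectors) / count for i in range(dim)]


def normalize_prices(prices_today: Sequence[float], prev_close: float) -> List[float]:
    """Express today's prices as relative change against the previous close.

    Assumption: this particular normalization is chosen for illustration.
    """
    return [price / prev_close - 1.0 for price in prices_today]


def encode_market_information(
    tweet_vectors: Sequence[Sequence[float]],
    prices_today: Sequence[float],
    prev_close: float,
) -> List[float]:
    """Concatenate the text vector and the price vector into one input vector."""
    text_vector = average_embeddings(tweet_vectors)
    price_vector = normalize_prices(prices_today, prev_close)
    return text_vector + price_vector  # list concatenation


# Two 2-dimensional tweet vectors and two price features (e.g. high and low).
x_t = encode_market_information(
    tweet_vectors=[[0.2, 0.4], [0.6, 0.0]],
    prices_today=[110.0, 99.0],
    prev_close=100.0,
)
print(x_t)
```

The resulting vector has one slot per text dimension followed by one slot per price feature, so downstream components see both sources of information at once.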

Variational Movement Decoder (VMD)

VMD uses the market information received from the previous component and infers a latent factor $z_t$. This latent vector is then decoded into the movement prediction vector $y_t$ using an RNN decoder with GRU cells.
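To make the infer-then-decode idea concrete, here is a minimal single-step sketch. It assumes a diagonal Gaussian latent sampled with the standard VAE reparameterization trick and a plain linear layer with softmax in place of the recurrent GRU decoder, so it omits the recurrence entirely. The function names, weights, and dimensions are all hypothetical.

```python
import math
import random
from typing import List, Sequence


def sample_latent(
    mu: Sequence[float], log_var: Sequence[float], rng: random.Random
) -> List[float]:
    """Reparameterization trick: z = mu + sigma * eps with eps ~ N(0, 1)."""
    return [
        m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
        for m, lv in zip(mu, log_var)
    ]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_movement(
    z: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Map the latent vector to probabilities over (down, up).

    Assumption: a single linear layer stands in for the GRU decoder.
    """
    logits = [
        sum(w * zi for w, zi in zip(row, z)) + b
        for row, b in zip(weights, bias)
    ]
    return softmax(logits)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
z_t = sample_latent(mu=[0.0, 0.5], log_var=[-1.0, -2.0], rng=rng)
probs = decode_movement(z_t, weights=[[1.0, -1.0], [-1.0, 1.0]], bias=[0.0, 0.0])
print(probs)
```

Sampling through `mu + sigma * eps` keeps the randomness outside the parameters, which is what lets gradients flow through the latent variable during training.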

Attentive Temporal Auxiliary (ATA)

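Attention is applied to the outputs from the previous component. The VAE and attention components are combined to construct the final objective $F$, averaged over the $N$ training examples:

$$F = \frac{1}{N} \sum_{n=1}^{N} v^{(n)} f^{(n)}$$

Here, $v$ is the attention weight vector and $f = [f_1, \dots, f_T]$ is the objective from the variational autoencoder component, with one entry per time step:

$$f_t = \log p_\theta(y_t \mid x_{\le t}, z_{\le t}) - \lambda \, D_{\mathrm{KL}}\left[ q_\phi(z_t \mid z_{<t}, x_{\le t}, y_t) \,\|\, p_\theta(z_t \mid z_{<t}, x_{\le t}) \right]$$

where $x$, $y$ and $z$ denote the encoded market information, the movement label and the latent factor. The first term is the log-likelihood term, $D_{\mathrm{KL}}$ is the KL divergence loss and $\lambda$ is the KL loss weight. $\lambda$ is increased over time during training, which is known as the KL annealing trick (Bowman et al., 2016).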
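The interplay between the log-likelihood term, the KL term, the KL loss weight, and the attention weights can be sketched as follows. The linear warm-up schedule is an assumption made for illustration (the summary only says the weight increases over time), the KL term is computed for diagonal Gaussians, and all function names and numbers are hypothetical.

```python
import math
from typing import Sequence


def kl_weight(step: int, anneal_steps: int, max_weight: float = 1.0) -> float:
    """KL annealing: ramp the weight linearly from 0 to max_weight, then hold.

    Assumption: a linear schedule; other monotone schedules work similarly.
    """
    if anneal_steps <= 0:
        return max_weight
    return max_weight * min(1.0, step / anneal_steps)


def gaussian_kl(
    mu_q: Sequence[float],
    log_var_q: Sequence[float],
    mu_p: Sequence[float],
    log_var_p: Sequence[float],
) -> float:
    """KL(q || p) between two diagonal Gaussians, summed over dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, log_var_q, mu_p, log_var_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def step_objective(log_likelihood: float, kl: float, weight: float) -> float:
    """Per-step objective: log-likelihood minus the weighted KL penalty."""
    return log_likelihood - weight * kl


def weighted_objective(
    step_values: Sequence[float], attention_weights: Sequence[float]
) -> float:
    """Combine per-step objectives using the attention weights."""
    return sum(v * f for v, f in zip(attention_weights, step_values))


# Halfway through a 100-step warm-up the KL term counts at half strength.
weight = kl_weight(step=50, anneal_steps=100)
kl = gaussian_kl(mu_q=[1.0], log_var_q=[0.0], mu_p=[0.0], log_var_p=[0.0])
f_aux = step_objective(log_likelihood=-0.7, kl=kl, weight=weight)
total = weighted_objective(step_values=[f_aux, -0.5], attention_weights=[0.3, 1.0])
print(weight, kl, f_aux, total)
```

Starting with a small KL weight lets the model first learn to use the latent variable for prediction before the penalty pulls the posterior toward the prior.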

TECHNICALANALYST, FUNDAMENTALANALYST, INDEPENDENTANALYST, DISCRIMINATIVEANALYST and HEDGEFUNDANALYST are simply different variants of the StockNet model, with HEDGEFUNDANALYST being the full model described above.