Techniques for Predicting Bitcoin Price Action

Feature Extraction

Take a look at the original kernel for a more detailed explanation of the philosophy behind the feature extraction.

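    import numpy as np
    import pandas as pd
    import datetime, pytz

    # define a conversion function for the native timestamps in the csv file
    def dateparse(time_in_secs):
        # pass tz explicitly so the epoch seconds are read as UTC,
        # regardless of the machine's local timezone
        return datetime.datetime.fromtimestamp(float(time_in_secs), tz=pytz.utc)

    # read data (note: date_parser is deprecated in pandas 2.0 and later)
    df = pd.read_csv('coinbaseUSD_1-min_data_2014-12-01_to_2017-10-20.csv.csv',
                     parse_dates=[0], date_parser=dateparse)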

The variables used as input are as follows:

O: Price at the start of the snapshot, which is fixed

C: Price at the close of the snapshot, which may equal O

H: Highest price recorded during the snapshot, which may equal C and/or O

L: Lowest price recorded during the snapshot, which may equal C and/or O

WgtPx, W: A derived price based on the ratio of value traded to volume traded (further reading)

Hence:

Change in WgtPx, or W[t] - W[t-1] = f(HO[t], LO[t], CO[t], WO[t]),

where HO, LO, CO and WO are the relative distances of H, L, C and W from the fixed datum O[t].

On a hunch, I also included the day of the week and time of day as inputs to the network.
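The inputs described above can be sketched in pandas roughly as follows. This is a minimal illustration, not the original kernel's code: the column names (`Open`, `High`, `Low`, `Close`, `Weighted_Price`), the helper name `extract_features`, the use of `(X - O) / O` as the "relative distance", and the encoding of time of day as minutes since midnight are all assumptions.

```python
import pandas as pd

def extract_features(df):
    """Build six network inputs from OHLC + weighted-price snapshots.

    Assumes `df` has a DatetimeIndex and columns Open, High, Low, Close,
    Weighted_Price (hypothetical names; adjust them to your CSV).
    """
    o = df['Open']
    feats = pd.DataFrame(index=df.index)
    # Relative distance of each price from the snapshot's opening price O[t]
    feats['HO'] = (df['High'] - o) / o
    feats['LO'] = (df['Low'] - o) / o
    feats['CO'] = (df['Close'] - o) / o
    feats['WO'] = (df['Weighted_Price'] - o) / o
    # Calendar inputs: Monday=0 ... Sunday=6, and minutes since midnight
    feats['day_of_week'] = df.index.dayofweek
    feats['minute_of_day'] = df.index.hour * 60 + df.index.minute
    return feats

# Tiny made-up example (values are illustrative, not real market data)
idx = pd.to_datetime(['2017-01-02 00:00', '2017-01-02 00:01'], utc=True)
sample = pd.DataFrame(
    {'Open': [100.0, 102.0], 'High': [103.0, 102.0],
     'Low': [99.0, 100.0], 'Close': [102.0, 101.0],
     'Weighted_Price': [101.0, 101.5]},
    index=idx,
)
features = extract_features(sample)
print(features)
```

Under these assumptions the result has six columns (four price distances plus two calendar inputs), which lines up with the `input_dim=6` used by the networks.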

Basic ANN Setup

    from keras.models import Sequential
    from keras.layers import Dense, Dropout, Flatten
    from sklearn.metrics import classification_report

    ann_model = Sequential()
    ann_model.add(Dense(32, activation='tanh', input_dim=6))
    ann_model.add(Dropout(0.2))
    ann_model.add(Dense(32, activation='tanh'))
    ann_model.add(Dropout(0.1))
    ann_model.add(Dense(32, activation='tanh'))
    ann_model.add(Dropout(0.1))
    ann_model.add(Dense(3, activation='softmax'))  # out shaped on df_Yt.shape[1]
    ann_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

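The three-unit softmax output combined with `categorical_crossentropy` means the targets need to be one-hot rows with three columns. The post does not say what the three classes are, so the sketch below assumes they are down / flat / up moves of the weighted price between consecutive snapshots. The helper name `make_direction_labels` and the `threshold` parameter are hypothetical.

```python
import numpy as np

def make_direction_labels(weighted_price, threshold=0.0):
    """One-hot encode consecutive weighted-price changes as (down, flat, up).

    The three-class meaning and `threshold` are assumptions for illustration.
    The first price has no predecessor, so the result has one row fewer
    than the input has prices.
    """
    change = np.diff(np.asarray(weighted_price, dtype=float))
    # Class index: 0 = down, 1 = flat, 2 = up
    classes = np.where(change > threshold, 2,
                       np.where(change < -threshold, 0, 1))
    # Row i of the 3x3 identity matrix is the one-hot vector for class i
    return np.eye(3)[classes]

labels = make_direction_labels([100.0, 101.0, 101.0, 99.5])
print(labels)
```

Each row sums to 1, and the number of columns is what the final `Dense(3, activation='softmax')` layer has to match.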

Training

Both models are trained on the data between Jan 1, 2015 and Dec 31, 2016.

The models are then validated using the data from Jan 1, 2017 to Oct 19, 2017.
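A date-based split like this can be sketched with label slicing on a `DatetimeIndex`. The frame below is a stand-in with one made-up row per day. The variable names `data`, `train` and `valid` are assumptions, and the real feature table would take the place of `data`.

```python
import numpy as np
import pandas as pd

# Stand-in frame with one row per day; real data would be the feature table
idx = pd.date_range('2014-12-01', '2017-10-20', freq='D', tz='UTC')
data = pd.DataFrame({'x': np.arange(len(idx))}, index=idx)

# Label-based slicing on a DatetimeIndex includes both end dates; with
# intraday rows, a date-only end bound covers that whole day
train = data.loc['2015-01-01':'2016-12-31']
valid = data.loc['2017-01-01':'2017-10-19']

print(train.index.min(), train.index.max(), len(train))
print(valid.index.min(), valid.index.max(), len(valid))
```

Splitting by date rather than at random keeps every validation snapshot strictly later than every training snapshot.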

Basic LSTM Setup

    from keras.layers import LSTM

    lstm_model = Sequential()
    lstm_model.add(LSTM(units=32, activation='tanh', input_shape=(None, 6)))
    lstm_model.add(Dropout(0.2))
    lstm_model.add(Dense(units=3, activation='softmax'))  # out shaped on df_Yt.shape[1]
    lstm_model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

    batch_size = 60 * 24  # Total 'blocks/snapshot' in a day
    epochs = 100

    # reshape data for LSTM network
    train_Xt_array = np.reshape(train_Xt_array, (train_Xt_array.shape[0], 1, train_Xt_array.shape[1]))
    test_Xt_array = np.reshape(test_Xt_array, (test_Xt_array.shape[0], 1, test_Xt_array.shape[1]))

    lstm_history = lstm_model.fit(train_Xt_array, train_y_array,
                                  epochs=epochs, batch_size=batch_size, verbose=1,
                                  validation_data=(test_Xt_array, test_y_array))
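The reshape step is easier to follow on a small array. Keras LSTM layers take input shaped `(samples, timesteps, features)`, so the 2-D feature table gains a middle axis of length 1. The array `X` below is made-up stand-in data, not the real `train_Xt_array`.

```python
import numpy as np

# Stand-in for train_Xt_array: 5 snapshots, 6 features each (made-up values)
X = np.arange(30, dtype=float).reshape(5, 6)

# Insert a timesteps axis of length 1: every sample is a one-step sequence
X_seq = np.reshape(X, (X.shape[0], 1, X.shape[1]))

print(X.shape, '->', X_seq.shape)          # (5, 6) -> (5, 1, 6)
# The feature values of each snapshot are unchanged by the reshape
print(np.array_equal(X_seq[:, 0, :], X))   # True
```

With one timestep per sample, each prediction is made from a single snapshot, so the recurrent layer has no earlier snapshots to draw on within a sample.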

100 epochs is admittedly a rather short training time, but the results are rather promising.

From the loss and accuracy curves, it seems that the ANN reaches a minimum quickly, and after 50 epochs there aren’t significant gains. In contrast, the LSTM seems to still have a bit to go until it reaches its minimum.
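One way to make "no significant gains after N epochs" concrete is to scan the recorded validation loss for the last meaningful improvement. The helper `first_plateau_epoch` and its `min_delta` and `patience` parameters are hypothetical and chosen for illustration. The loss values are made up, not taken from the runs above.

```python
def first_plateau_epoch(losses, min_delta=1e-3, patience=5):
    """Return the 1-based epoch of the last meaningful improvement, i.e. the
    epoch after which `patience` consecutive epochs failed to lower the best
    loss by more than `min_delta`. Returns None if no such plateau occurs."""
    best = float('inf')
    best_epoch = 0
    for epoch, loss in enumerate(losses, start=1):
        if best - loss > min_delta:
            # Meaningful improvement: remember it and keep going
            best, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return best_epoch
    return None

# Made-up loss curve: improves for four epochs, then flattens out
val_loss = [1.00, 0.80, 0.70, 0.65, 0.6499, 0.6498, 0.6497, 0.6497, 0.6496]
print(first_plateau_epoch(val_loss))  # 4
```

On a real run, the list could come from the history object returned by `fit`, for example `first_plateau_epoch(lstm_history.history['val_loss'])`.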

Again, as stated by the original kernel, the structure of the networks is rather arbitrary and more for demonstration.

I’m pretty new to price action prediction, and to machine learning in general, so there could certainly be errors. Let me know if you have any corrections, suggestions or comments!