I'm a student who is new to this field. I've played with the GUI version of Weka and built neural nets in it with premade datasets, but this is the first time I've implemented one myself, using Keras (Theano backend) in Python. I'm trying to create a model that finds a correlation between tweet sentiment and the stock price of a single company. For this case study, I decided to use Tesla.

Here is a sample of the data I'm collecting in my DB.

For now I'm only using the tweet sentiment and the stock price for my NN. I'm using the NN for multi-class classification, to tell me whether the stock is going up, going down, or staying the same. Here is my code.

When I get the data using "SELECT * FROM tweets", most of the rows show no stock movement because the market is only open for part of the day, so I get a 0.0 percent change. For this post I have therefore added "WHERE stock_price != 301.44" to the query.

I then have my predictions array (pred) and my results array. From results I delete everything apart from the sentiment, as I have already got what I needed from stock_price: the one-hot encoding that tells whether the price is going up or down.
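To make the labelling step concrete, this is roughly how a percent change gets turned into a one-hot class. It is a simplified sketch, not my exact code: the function name `direction_one_hot`, the `[down, same, up]` column order, the `tolerance` parameter, and the sample values are all made up for illustration.

```python
def direction_one_hot(pct_change, tolerance=0.0):
    """Map a percent change to a one-hot list in the order [down, same, up].

    `tolerance` is a hypothetical dead zone: changes whose absolute value
    is within it count as "same".
    """
    if pct_change > tolerance:
        return [0, 0, 1]   # price went up
    if pct_change < -tolerance:
        return [1, 0, 0]   # price went down
    return [0, 1, 0]       # price stayed the same


# Made-up percent changes, one per tweet row.
changes = [0.0, 1.2, -0.4]
labels = [direction_one_hot(c) for c in changes]
print(labels)
```

With most rows at 0.0 percent change, almost every label ends up in the "same" column, which is why I filtered the query for this post.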

I then pass pred and results to the model, converting each with the pandas as_matrix() method.

I have added a few Dropout layers because my model is heavily overfitted: I kept getting 95% accuracy, which is very unlikely. I told my lecturer and she said this:

"That is ok, but again, you need to validate your machine learning
models using k-fold cross-validation and testing it using a separate
validation data set."

I have used the validation_split parameter in the model.fit() method. Is this what she meant? Also, what is k-fold CV? I know what cross-validation is, but not the k-fold part.
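From what I have read so far, my rough understanding is that validation_split holds out one fixed slice of the data, whereas k-fold cuts the data into k parts and uses each part once as the validation set. Below is a sketch of just the splitting step in plain Python, so you can correct me if I have it wrong. The name `k_fold_indices` is something I made up, not a Keras function, and I am assuming a new model would be trained from scratch on each fold.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k consecutive folds.

    Every sample index appears in exactly one validation fold.
    """
    indices = list(range(n_samples))
    # Spread any remainder over the first few folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        yield train_idx, val_idx
        start += size


folds = list(k_fold_indices(10, 5))
for train_idx, val_idx in folds:
    # Here I would build a fresh model, fit on train_idx, evaluate on val_idx,
    # and average the k validation scores at the end.
    print(len(train_idx), val_idx)
```

In practice I assume the rows should be shuffled before splitting, since mine are ordered by time, but I have left that out of the sketch.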