In November and December 2018, our team participated in the Kaggle University Club Winter Hackathon. We finished among the winners of the competition, along with teams from Penn State University and Hanyang University. Here is what we’ve done so far.

Problem Set

Our approach

The dataset provides several numerical and categorical values for each review. We chose the rating as the target value to predict. We also decided to use only the text data (the patient review itself). The main reason was to prevent a major data leak and to build a model that is actually useful.
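As a rough illustration of the text-only setup, the sketch below reads a CSV file and keeps just the review text and the rating, ignoring every other column. The column names `review`, `rating` and `usefulCount`, the helper `load_text_and_rating`, and the two sample rows are all hypothetical and exist only for this example; the real dataset's headers may differ.

```python
import csv
import io

# Hypothetical column names; the real dataset's headers may differ.
TEXT_COLUMN = "review"
TARGET_COLUMN = "rating"


def load_text_and_rating(csv_file):
    """Return (texts, ratings), ignoring every other column in the file."""
    texts, ratings = [], []
    for row in csv.DictReader(csv_file):
        texts.append(row[TEXT_COLUMN])
        ratings.append(float(row[TARGET_COLUMN]))
    return texts, ratings


# Made-up sample rows, only to show the shape of the input and output.
sample = io.StringIO(
    "review,rating,usefulCount\n"
    '"Worked well, no side effects",9,12\n'
    '"Did not help at all",2,3\n'
)
texts, ratings = load_text_and_rating(sample)
```

Dropping the extra columns at load time means no later step can accidentally feed them to the model.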

We combine those models into an ensemble of 10 networks of each type. Training uses MAE as the criterion, the Adam optimizer, and early stopping for regularization. To comply with the competition rules, all computation was done on Kaggle Kernels (with GPU support).
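The pieces named above can be sketched in plain Python. This is a minimal illustration, not our competition code: the networks and the Adam optimizer are omitted, combining members by simple averaging is an assumption, and the names `mae`, `ensemble_average`, `EarlyStopping` and the `patience` parameter are hypothetical.

```python
from statistics import mean


def mae(predictions, targets):
    """Mean absolute error between two equal-length sequences."""
    return mean(abs(p - t) for p, t in zip(predictions, targets))


def ensemble_average(member_predictions):
    """Average per-sample predictions across ensemble members.

    member_predictions holds one list of predictions per network.
    Simple averaging is an assumption made for this sketch.
    """
    return [mean(sample) for sample in zip(*member_predictions)]


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=3):
        self.patience = patience
        self.best_loss = float("inf")
        self.epochs_without_improvement = 0

    def should_stop(self, validation_loss):
        if validation_loss < self.best_loss:
            self.best_loss = validation_loss
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience


# Made-up numbers: two ensemble members predicting three ratings.
targets = [9.0, 2.0, 7.0]
member_predictions = [[8.0, 3.0, 7.0], [10.0, 1.0, 6.0]]
combined = ensemble_average(member_predictions)
combined_error = mae(combined, targets)

# Made-up validation losses: training stops after two epochs without improvement.
stopper = EarlyStopping(patience=2)
stop_flags = [stopper.should_stop(loss) for loss in [1.0, 0.8, 0.9, 0.85]]
```

In this toy case the averaged predictions have a lower MAE than either member alone, which is the usual motivation for ensembling.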