After discussing the features above, we had the idea of using multiple algorithms to predict our output variable (the probability that the student gets this student-step right on the first try): one algorithm predicting student success, one predicting step difficulty, perhaps additional ones. An aggregate function would then learn the overall success probability from their outputs. Since we have about 3000 steps per student, we should have enough data to train a model for each student.

We agreed that the accepted features above generally make sense to us; defining the feature set for each of the individual sub-problems of the multi-algorithm approach is still TODO.
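
To make the multi-algorithm idea concrete, here is a minimal sketch in Python. It is only an illustration under assumptions, not something we decided on: the two base "algorithms" are stand-ins (smoothed historical success rates per student and per step), the aggregate function is assumed to be a small logistic regression, and all names (`smoothed_rates`, `fit_aggregator`, `predict`) and the toy rows are hypothetical. The real feature sets are still TODO.

```python
import math
from collections import defaultdict

# Toy rows: (student, step, correct_on_first_try). Made-up data for illustration.
rows = [
    ("s1", "easy", 1), ("s1", "hard", 1), ("s1", "easy", 1),
    ("s2", "easy", 1), ("s2", "hard", 0), ("s2", "hard", 0),
    ("s3", "easy", 0), ("s3", "hard", 0), ("s3", "easy", 1),
]

def smoothed_rates(rows, key_index, prior=0.5, strength=2.0):
    """Stand-in base model: per-key success rate, shrunk toward `prior`."""
    wins, counts = defaultdict(float), defaultdict(float)
    for row in rows:
        wins[row[key_index]] += row[2]
        counts[row[key_index]] += 1
    return {k: (wins[k] + prior * strength) / (counts[k] + strength) for k in counts}

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_aggregator(features, labels, lr=0.1, epochs=500):
    """Assumed aggregate function: logistic regression via batch gradient descent."""
    w, b, n = [0.0] * len(features[0]), 0.0, len(labels)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for x, y in zip(features, labels):
            err = sigmoid(b + sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(student, step, student_rate, step_rate, w, b, default=0.5):
    """Combine both base-model outputs; unseen students/steps fall back to `default`."""
    x = [logit(student_rate.get(student, default)), logit(step_rate.get(step, default))]
    return sigmoid(b + sum(wi * xi for wi, xi in zip(w, x)))

# Base model 1: student success. Base model 2: step difficulty (as step success rate).
student_rate = smoothed_rates(rows, key_index=0)
step_rate = smoothed_rates(rows, key_index=1)

# NOTE: for brevity the aggregator is fit on the same rows the base models saw.
# A real setup would feed it held-out base-model predictions to avoid leakage.
features = [[logit(student_rate[s]), logit(step_rate[t])] for s, t, _ in rows]
labels = [y for _, _, y in rows]
w, b = fit_aggregator(features, labels)

p_strong_easy = predict("s1", "easy", student_rate, step_rate, w, b)
p_weak_hard = predict("s2", "hard", student_rate, step_rate, w, b)
print(p_strong_easy, p_weak_hard)
```

Additional base models would simply contribute further columns to `features`; the aggregator itself would not need to change.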