This paper explores the socio-demographic variables (age, gender, ethnicity,
education, work status, and disability) and study environment variables (course programme and
course block) that may influence persistence or dropout of distance education students at the
Open Polytechnic. It examines to what extent these factors, i.e. enrolment data, help us to
pre-identify successful and unsuccessful students.
The data stored in the Open Polytechnic student management system from 2006 to
2009, covering over 450 students who enrolled in the Information Systems course, were used to
perform a quantitative analysis of study outcome. Based on data mining techniques (such as
feature selection and classification trees) and logistic regression, the most important factors
for student success and a profile of the typical successful and unsuccessful students are
identified.
The empirical results show the following: (i) the most important factors separating
successful from unsuccessful students are ethnicity, course programme and course block; (ii)
among classification tree growing methods, Classification and Regression Tree (CART) was
the most successful in growing the tree, with an overall percentage of correct classification of
60.5%; (iii) both the risk estimated by cross-validation and the gain diagram suggest that
none of the trees, based only on enrolment data, is very good at separating successful from
unsuccessful students; and (iv) the same conclusion was reached using logistic regression.
The implications of these results for academic and administrative staff are discussed.
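
As a reading aid for result (ii), the overall percentage of correct classification can be understood as the share of cases on the diagonal of a confusion matrix, and the misclassification risk as its complement. The following is a minimal sketch under stated assumptions: the function name and the counts are hypothetical illustrations and are not taken from the Open Polytechnic data or from the software used in the study.

```python
def overall_accuracy(confusion):
    """Return the share of correctly classified cases.

    confusion[i][j] is the number of cases whose actual class is i
    and whose predicted class is j (a square matrix).
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Correct classifications sit on the main diagonal.
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts, NOT the study's data.
# Rows = actual class, columns = predicted class,
# class order = (successful, unsuccessful).
example = [
    [30, 20],
    [18, 32],
]

accuracy = overall_accuracy(example)
risk = 1.0 - accuracy  # misclassification risk is the complement

print(f"overall correct classification: {accuracy:.1%}")
print(f"misclassification risk: {risk:.1%}")
```

A cross-validated risk estimate would apply the same calculation to cases held out from tree growing, which is why it tends to be a more cautious figure than the one computed on the training cases.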