Is Big Data Delivering on Its Promise for Education?

Big data is the term used to describe the collection and analysis of data sets so large and complex that conventional database systems cannot handle them, with the aim of discovering patterns that can inform better decisions. It has been one of the hottest buzzwords of recent years and has become pervasive in our everyday lives. In fact, to the surprise of many, Gartner dropped big data completely from its Hype Cycle for Emerging Technologies in 2015, specifically because of how prevalent it had become over the preceding year.

As more of our everyday activities move online, we generate increasing amounts of data and leave behind a digital footprint, or “data exhaust”, of our movements, habits, and preferences. Every single day we generate 2.5 quintillion (that’s 18 zeros!) bytes of data. To put that in physical terms, 2.5 quintillion bytes would fill 10 million Blu-ray discs, which, if stacked one on top of another, would measure three times the height of the Empire State Building. The rate of growth is equally phenomenal, with an estimated 95% of the world’s data created in the last three years.

Big data has promised big changes in the education world.

Justification of Investment

More and more individuals, institutions, and corporations are turning to e-learning, both for its increased efficacy and for its cost effectiveness. The global e-learning market is forecast to reach US$240 billion by 2023. However, as with any expense, the efficacy of the resource must be proven in order to justify continued expenditure. Big data can help us identify more accurately which educational activities are the most successful, how effective a single piece of content is across a group of students, what proficiency gains a student fitting a particular profile makes on a certain learning path, how well a question assesses a particular competency, and what impact training initiatives have on an organization’s bottom line. Ultimately, big data promotes data-driven decisions.

Identifying Those at Risk

One of the big potential benefits of analytics in education is the promise that we can catch students at risk – students who might otherwise fall through the gaps. By tracking individual and group performance by subject, and against specific curriculum standards, student weaknesses can be identified early and corrective action taken as a result.

An initiative by Mobile County School, Alabama, used big data to predict which students were at risk of dropping out. Drawing on data from across the entire school system, including attendance records, test scores, and disciplinary histories, gave the school system a unique insight into its students. Analysis of the data sets revealed distinct trends, with suspensions and serial absences often preceding a student’s eventual dropout. This knowledge enabled the school to launch a targeted outreach program for these students, which decreased the dropout rate by 33%.

In a similar manner, data analysis at Washburn University in Topeka, Kansas, revealed that students living in campus accommodation were less likely to drop out than those living off campus. As a result, the university is looking to add more on-campus accommodation to increase retention rates.
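The kind of early-warning check described in the Mobile County example can be sketched in a few lines of Python. This is a minimal illustration only, not the method the school system used: the `StudentRecord` fields, the thresholds, and the rule that two warning signs mark a student as at risk are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class StudentRecord:
    student_id: str
    absences: int          # days absent this term
    suspensions: int       # suspensions this term
    avg_test_score: float  # 0-100 scale


# Hypothetical thresholds, chosen for illustration only.
MAX_ABSENCES = 10
MAX_SUSPENSIONS = 1
MIN_AVG_SCORE = 60.0


def risk_flags(record: StudentRecord) -> list[str]:
    """Return the names of the warning signs this record triggers."""
    flags = []
    if record.absences > MAX_ABSENCES:
        flags.append("serial absences")
    if record.suspensions > MAX_SUSPENSIONS:
        flags.append("suspensions")
    if record.avg_test_score < MIN_AVG_SCORE:
        flags.append("low test scores")
    return flags


def students_at_risk(records: list[StudentRecord], min_flags: int = 2) -> list[str]:
    """List the IDs of students who trigger at least `min_flags` warning signs."""
    return [r.student_id for r in records if len(risk_flags(r)) >= min_flags]


# Made-up sample data.
records = [
    StudentRecord("s1", absences=14, suspensions=2, avg_test_score=71.0),
    StudentRecord("s2", absences=2, suspensions=0, avg_test_score=88.0),
    StudentRecord("s3", absences=12, suspensions=0, avg_test_score=52.0),
]
print(students_at_risk(records))  # ['s1', 's3']
```

In practice the thresholds would be learned from historical outcomes rather than set by hand, but even a simple rule like this makes the warning signs visible to the people who can act on them.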

Identifying Best Practices

Tracking the minutiae of a learner’s activity can help build up a detailed model of that individual student’s particular strengths and weaknesses, as well as a predicted outcome for the student. These details include how long they spent on each page of content, whether they went back to review certain sections at any stage, how long they took to answer a question, whether or not they changed their original answer, and even how long they moused over the incorrect answer before finally clicking on the correct one. Educators can determine how the most successful students are working and identify some of the many factors that contribute to learner success. These patterns can be used to inform instruction and to encourage other students to adopt similar practices. In a similar vein, we can also identify which course materials and activities are particularly effective (and indeed ineffective) and encourage curriculum specialists to use the effective ones as best practice examples.
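To make this concrete, here is a rough sketch of how raw interaction events might be rolled up into per-student signals. The event layout (student, question, seconds taken, whether the answer was changed, whether it was correct) and the `summarize` function are hypothetical; a real learning platform would log far more detail.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical event log:
# (student_id, question_id, seconds_to_answer, changed_answer, correct)
events = [
    ("s1", "q1", 12.0, False, True),
    ("s1", "q2", 45.0, True, False),
    ("s2", "q1", 30.0, True, True),
    ("s2", "q2", 20.0, False, True),
]


def summarize(events):
    """Aggregate per-student signals from raw interaction events."""
    by_student = defaultdict(list)
    for student_id, _question, seconds, changed, correct in events:
        by_student[student_id].append((seconds, changed, correct))

    summary = {}
    for student_id, rows in by_student.items():
        summary[student_id] = {
            "avg_seconds": mean(r[0] for r in rows),
            # Booleans count as 0 or 1, so these are simple proportions.
            "changed_answer_rate": sum(r[1] for r in rows) / len(rows),
            "accuracy": sum(r[2] for r in rows) / len(rows),
        }
    return summary


for student_id, stats in summarize(events).items():
    print(student_id, stats)
```

Summaries like these are the raw material for comparing how the most successful students work with how everyone else does.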

Personalized Learning

We’re all familiar with the Amazon purchasing experience: buy any product and get recommendations for other products bought by users who purchased the same item. Amazon collates information from wish lists, purchasing history, and browsing history to create these personalized product suggestions, using evidence-based consumer behavior rather than gut instinct to inform decisions.
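The “customers who bought this also bought” idea can be illustrated with a simple co-occurrence count. This is a toy sketch and certainly not how Amazon’s recommendation system actually works; the `baskets` data and the `also_bought` function are invented for the example.

```python
from collections import Counter

# Hypothetical purchase histories, one set of items per customer.
baskets = [
    {"tent", "sleeping bag", "stove"},
    {"tent", "sleeping bag"},
    {"tent", "lantern"},
    {"stove", "lantern"},
]


def also_bought(item: str, baskets: list[set[str]], top_n: int = 2) -> list[str]:
    """Rank other items by how often they appear alongside `item`."""
    counts = Counter()
    for basket in baskets:
        if item in basket:
            counts.update(basket - {item})
    # Sort by count (descending), then by name, so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _count in ranked[:top_n]]


print(also_bought("tent", baskets))  # ['sleeping bag', 'lantern']
```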

In the same way that big data is used to personalize offerings from retailers and services such as Amazon or Netflix, it can also be applied to the educational experience. Based on a student’s progress and performance at any given stage, a personalized curriculum can recommend a series of subsequent actions for the student, be it additional information, further corrective study, or progression to the next stage of the curriculum.
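At its simplest, that recommendation step might look something like the sketch below. The score scale and both thresholds are assumptions made purely for illustration; a real adaptive-learning system would draw on far richer signals than a single score.

```python
def next_step(score: float) -> str:
    """Pick a follow-up action from a student's latest score (0.0 to 1.0).

    The thresholds are hypothetical and would need tuning against real data.
    """
    if score >= 0.8:
        return "progress to the next stage"
    if score >= 0.5:
        return "additional information"
    return "further corrective study"


# Made-up scores for three students.
for student_id, score in [("s1", 0.92), ("s2", 0.65), ("s3", 0.30)]:
    print(student_id, "->", next_step(score))
```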

The Challenges

Despite the many promises of big data in education, one overarching problem has kept it from delivering on its full potential. It’s not a lack of data, nor that we don’t know how to use it, but that the data is not always accessible to those on the front line of education.

The real challenge with big data in the education world is curating and presenting the data so that it gives meaningful, actionable insights to those on the ground. Sure, processing and interpreting big data is hard – that’s why it’s called big data!

However, for big data to really deliver on its full potential, the results and interpretations need to be accessible to more than just psychometricians and statisticians. We need both to increase the level of reporting and analytics available to instructors in their LMSs or grade books and to ensure that the data is presented in an accessible format so that they can make evidence-based decisions.