“The idea behind this work was to define a new algorithm that would allow us to cluster similar curves in order to make decisions based on these clusters,” said Nora M. Villanueva, a researcher in the Services and Applications department at Gradiant. She illustrated the point with an example: “Every organization works with a huge number of curves describing different elements, such as the operation of a machine, customers who leave a particular service, or the working life of parts produced in a factory. Our algorithm allows us to group these curves by resemblance, showing which elements perform in a similar way.”

An innovative project in clustering techniques

Existing clustering techniques group curves into a number of groups that has to be defined in advance. “The innovation of our algorithm is that, in addition to performing this clustering, we can know, with statistical significance, how many different groups there are,” said Marta Sestelo, a researcher in the INetS department at Gradiant. This is the most important and distinctive feature of the tool: until now, the number of groups was chosen according to the subjective, non-automatic criteria of each researcher. In addition, the methodology is implemented as a library for R, an open-source programming language, so it is available to anyone who needs it, including the scientific community and other organizations.

A cross-sector management tool

The results of the project apply directly to sectors where it is necessary to estimate the probability of an event occurring within a specific period of time. Banking, insurance, and any company operating within Industry 4.0 can benefit from this project, as it can group time-to-event curves, where the event may be, for example, the failure of a part, a customer delay, or mortality in a fish farm.
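To make the idea of a time-to-event curve more concrete, here is a minimal Python sketch of the standard Kaplan-Meier estimator, which turns a list of observation times into a curve giving the probability that the event has not yet occurred. It is only an illustration under stated assumptions: the function name `kaplan_meier` and the sample data are hypothetical, and the code is not the clustering algorithm or the R library described in this article.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Estimate a time-to-event (survival) curve.

    durations: time at which each unit failed or stopped being observed.
    observed:  True if the event (e.g. a failure) was seen at that time,
               False if the unit was still fine when observation ended.
    Returns a list of (time, probability_event_has_not_occurred) pairs,
    one per distinct time at which at least one event was observed.
    """
    events = Counter(t for t, seen in zip(durations, observed) if seen)
    exits = Counter(durations)  # every unit leaves the risk set at its own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Multiply by the fraction of at-risk units that survive time t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical data: months until a part fails (True) or until
# observation ended with the part still working (False).
months = [2, 3, 3, 5, 8]
failed = [True, True, False, True, False]

for time, prob in kaplan_meier(months, failed):
    print(f"month {time}: estimated probability of no failure = {prob:.2f}")
```

Each machine, customer group, or production batch would yield one such curve; grouping curves like these by resemblance is the kind of task the project addresses.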

This project also has a significant place in other areas, such as medicine and education. “We can apply it in our daily work with the different technologies in which we are experts at Gradiant, such as eLearning projects applied to classrooms, where we want to investigate student dropout in a particular course,” said Nora M. Villanueva.

As a result of the project’s versatility, other international institutions have also become interested, such as the prestigious journal ‘Statistics in Medicine’, which specializes in statistics and probability. The work has also already been recognized by the entire eRum 2018 team. This international event was attended last May by more than 500 professionals from 19 countries, who followed the conferences and presentations of more than thirty speakers from universities and internationally renowned companies such as RStudio, Microsoft, H2O.ai, and Mango Solutions.