A Novel Multidimensional Approach to Integrate Big Data in Business Intelligence

Alejandro Maté (Department of Software and Computing Systems, University of Alicante, Alicante, Spain), Hector Llorens (Department of Software and Computing Systems, University of Alicante, Alicante, Spain), Elisa de Gregorio (Department of Software and Computing Systems, University of Alicante, Alicante, Spain), Roberto Tardío (Department of Software and Computing Systems, University of Alicante, Alicante, Spain), David Gil (Department of Computing Technology and Data Processing, University of Alicante, Alicante, Spain), Rafa Muñoz-Terol (Department of Software and Computing Systems, University of Alicante, Alicante, Spain) and Juan Trujillo (Department of Software and Computing Systems, University of Alicante, Alicante, Spain)

Abstract

The huge amount of information available and its heterogeneity have surpassed the capacity of current data management technologies. Dealing with huge amounts of structured and unstructured data, often referred to as Big Data, is a hot research topic and a technological challenge. In this paper, the authors present an approach aimed at enabling OLAP queries over different, heterogeneous data sources. Their approach is based on the MapReduce paradigm and integrates different formats into the recent RDF Data Cube format. The benefit of their approach is that it can query different sources of information while maintaining an integrated, comprehensive view of the available data. The paper discusses the advantages and disadvantages of such an approach, as well as the implementation challenges it presents. Furthermore, the approach is evaluated in detail by means of a case study.
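
As a rough illustration of the idea, and not the authors' implementation, the following Python sketch maps records from two differently formatted sources onto a common (dimensions, measure) shape, reduces them by summing the measure, and prints each aggregate as an observation using the RDF Data Cube vocabulary (`qb:Observation`, `qb:dataSet`). The sample data, the `ex:` properties, the `ex:salesCube` dataset, and all function names are assumptions made for the example. A real deployment would run the map and reduce steps on a distributed framework instead of in a single process.

```python
import csv
import io
import json
from collections import defaultdict

# Hypothetical sample sources in two different formats (assumed data).
CSV_SOURCE = "city,year,sales\nAlicante,2013,10\nMadrid,2013,5\n"
JSON_SOURCE = '[{"city": "Alicante", "year": 2013, "sales": 7}]'


def map_csv(text):
    """Map step for CSV: emit ((city, year), sales) pairs."""
    for row in csv.DictReader(io.StringIO(text)):
        yield (row["city"], int(row["year"])), float(row["sales"])


def map_json(text):
    """Map step for JSON: emit pairs in the same shape as map_csv."""
    for rec in json.loads(text):
        yield (rec["city"], int(rec["year"])), float(rec["sales"])


def reduce_sum(pairs):
    """Reduce step: sum the measure for each dimension key."""
    totals = defaultdict(float)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def to_observation(index, key, total):
    """Render one aggregate as a Turtle-style qb:Observation.

    The ex: terms are placeholders; prefix declarations are omitted.
    """
    city, year = key
    return (
        f"ex:obs{index} a qb:Observation ;\n"
        f"    qb:dataSet ex:salesCube ;\n"
        f"    ex:city \"{city}\" ;\n"
        f"    ex:year {year} ;\n"
        f"    ex:sales {total} .\n"
    )


def integrate():
    """Run both mappers, reduce, and return the observations as strings."""
    pairs = list(map_csv(CSV_SOURCE)) + list(map_json(JSON_SOURCE))
    totals = reduce_sum(pairs)
    return [
        to_observation(i, key, total)
        for i, (key, total) in enumerate(sorted(totals.items()))
    ]


if __name__ == "__main__":
    for observation in integrate():
        print(observation)
```

Because both mappers emit the same key shape, the reducer does not need to know which source a record came from, which is what gives the integrated view across heterogeneous inputs.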

Article Preview

Related Work

In this section, we present the different technologies related to our proposal. First, we review Big Data and distributed architectures. Then, we analyze the Linked Data, RDF and SPARQL standards. Afterwards, we discuss current data warehouse (DW) and multidimensional (MD) modeling proposals. Finally, we summarize the contribution of the paper with respect to the current state of the art.