Friday, January 1, 2010

We, team Mooshabaya (De Alwis.K.D.B.C, Malinga.A.S, Pradeeban.K, Weerasiri.W.A.D.D.), chose "Association Rule Mining with Extended Vertical Format Data Mining" as the research project for our Advanced Database module (CS4420). The project proposal can be found here. The abstract of the research paper we submitted is given below.

Abstract

Analyzing data warehouses to foresee patterns in business and scientific transactions often requires high computational power and a large amount of memory, because of the huge volume of historical transaction data. With data increasingly fragmented and the current trend toward distributed systems, most of the fundamental algorithms originally proposed for finding associations among itemsets in data warehouses are inefficient in either throughput or resource utilization.

The Apriori algorithm is one such algorithm, proposed for mining data warehouses to find associations. Although Apriori is the most widely studied and implemented algorithm for data mining, it is generally not an optimized one. Many variations, improvements, and alternatives have been suggested to overcome its inefficiency, either in general or for a particular kind of data. In either case, even a small improvement in the algorithm often improves the mining considerably. Vertical format data mining is one of the efficient alternatives to Apriori. In this paper we propose an alternative to Apriori that uses bitmap indices in conjunction with vertical format data mining. An implementation of the proposed algorithm, which is expected to be more efficient than its predecessors, is benchmarked against an implementation of Apriori on a chosen set of benchmarks.
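
To give a feel for the idea, here is a minimal sketch of vertical format mining with bitmaps. It is not the algorithm from our paper, only an illustration of the general technique under some assumptions: each item is mapped to a Python integer used as a bitmap (bit i is set when transaction i contains the item), support is the number of set bits, and the support of a larger itemset comes from a bitwise AND of its parents' bitmaps. The function names (`build_bitmaps`, `frequent_itemsets`, `popcount`) and the sample transactions are made up for this example.

```python
from itertools import combinations


def popcount(bitmap):
    """Number of set bits, i.e. the number of transactions in the bitmap."""
    return bin(bitmap).count("1")


def build_bitmaps(transactions):
    """Vertical format: map each item to an int whose bit i is set
    when transaction i contains that item."""
    bitmaps = {}
    for tid, items in enumerate(transactions):
        for item in items:
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def frequent_itemsets(transactions, min_support):
    """Return {itemset as sorted tuple: support count} for every itemset
    that appears in at least min_support transactions."""
    bitmaps = build_bitmaps(transactions)
    # Level 1: frequent single items.
    level = {(item,): bm for item, bm in bitmaps.items()
             if popcount(bm) >= min_support}
    result = {}
    while level:
        result.update({itemset: popcount(bm) for itemset, bm in level.items()})
        next_level = {}
        for a, b in combinations(sorted(level), 2):
            # Join two k-itemsets only when they share the same (k-1)-prefix.
            if a[:-1] == b[:-1]:
                bm = level[a] & level[b]  # intersect the transaction sets
                if popcount(bm) >= min_support:
                    next_level[a + (b[-1],)] = bm
        level = next_level
    return result


# Hypothetical sample data, assumed only for illustration.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
result = frequent_itemsets(transactions, min_support=2)
```

Unlike Apriori, this sketch reads the transactions only once, while building the bitmaps. After that, every support count is an AND followed by a bit count, with no further scans of the data.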

Pradeeban Kathiravelu is a distributed systems researcher. He holds a Ph.D. double degree, Erasmus Mundus Joint Doctorate in Distributed Computing (EMJD-DC), from INESC-ID Lisboa / Instituto Superior Técnico, Universidade de Lisboa, Portugal and Université catholique de Louvain, Belgium. He also holds a Master of Science degree, Erasmus Mundus European Master in Distributed Computing (EMDC), from Instituto Superior Técnico, Portugal, and KTH Royal Institute of Technology, Sweden, and a BSc (Eng) Computer Science & Engineering from University of Moratuwa, Sri Lanka.
His research interests include Distributed Systems, Network Softwarization, Software-Defined Systems, Cloud-Assisted Networks, Big Data Integration, Internet Measurements, and Service-Oriented Architecture. He is highly interested in free and open source software development, and has been an active participant in the Google Summer of Code (GSoC) program since 2009, as a student and as a mentor.