6 Association Analysis: Basic Concepts and Algorithms

Many business enterprises accumulate large quantities of data from their day-to-day operations. For example, huge amounts of customer purchase data are collected daily at the checkout counters of grocery stores. Table 6.1 illustrates an example of such data, commonly known as market basket transactions. Each row in this table corresponds to a transaction, which contains a unique identifier labeled TID and a set of items bought by a given customer. Retailers are interested in analyzing the data to learn about the purchasing behavior of their customers. Such valuable information can be used to support a variety of business-related applications such as marketing promotions, inventory management, and customer relationship management.

Table 6.1. An example of market basket transactions.

TID  Items
1    {Bread, Milk}
2    {Bread, Diapers, Beer, Eggs}
3    {Milk, Diapers, Beer, Cola}
4    {Bread, Milk, Diapers, Beer}
5    {Bread, Milk, Diapers, Cola}

This chapter presents a methodology known as association analysis, which is useful for discovering interesting relationships hidden in large data sets. The uncovered relationships can be represented in the form of association rules or sets of frequent items. For example, the following rule can be extracted from the data set shown in Table 6.1:

{Diapers} → {Beer}.

The rule suggests that a strong relationship exists between the sale of diapers and beer because many customers who buy diapers also buy beer. Retailers can use this type of rules to help them identify new opportunities for cross-selling their products to the customers.

Besides market basket data, association analysis is also applicable to other application domains such as bioinformatics, medical diagnosis, Web mining, and scientific data analysis. In the analysis of Earth science data, for example, the association patterns may reveal interesting connections among the ocean, land, and atmospheric processes. Such information may help Earth scientists develop a better understanding of how the different elements of the Earth system interact with each other. Even though the techniques presented here are generally applicable to a wider variety of data sets, for illustrative purposes, our discussion will focus mainly on market basket data.

There are two key issues that need to be addressed when applying association analysis to market basket data. First, discovering patterns from a large transaction data set can be computationally expensive. Second, some of the discovered patterns are potentially spurious because they may happen simply by chance. The remainder of this chapter is organized around these two issues. The first part of the chapter is devoted to explaining the basic concepts of association analysis and the algorithms used to efficiently mine such patterns. The second part of the chapter deals with the issue of evaluating the discovered patterns in order to prevent the generation of spurious results.
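
As a small illustration of the claim behind the rule {Diapers} → {Beer}, that many customers who buy diapers also buy beer, the following sketch counts co-occurrences in the five transactions of Table 6.1. The helper name `count_containing` and the dictionary layout are assumptions made for this example, not notation from the chapter, and simple counting is only a stand-in for the formal measures the chapter develops later.

```python
# Transactions transcribed from Table 6.1 (TID -> set of items bought).
transactions = {
    1: {"Bread", "Milk"},
    2: {"Bread", "Diapers", "Beer", "Eggs"},
    3: {"Milk", "Diapers", "Beer", "Cola"},
    4: {"Bread", "Milk", "Diapers", "Beer"},
    5: {"Bread", "Milk", "Diapers", "Cola"},
}


def count_containing(items, transactions):
    """Count the transactions that contain every item in `items`.

    Hypothetical helper for this sketch; not part of any library.
    """
    required = set(items)
    # `required <= basket` is a subset test: all required items are present.
    return sum(1 for basket in transactions.values() if required <= basket)


with_diapers = count_containing({"Diapers"}, transactions)
with_both = count_containing({"Diapers", "Beer"}, transactions)

print(f"{with_both} of {with_diapers} transactions containing Diapers also contain Beer")
# prints: 3 of 4 transactions containing Diapers also contain Beer
```

With only five transactions, a count like this could easily arise by chance, which is the second of the two issues raised above.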
