6 Data Mining Models in Oracle Retail Data Model

This chapter provides reference information about the data mining models that are provided with Oracle Retail Data Model if you do choose to install the Data Mining Option. "About Data Mining in Oracle Retail Data Model" provides overview information. The rest of this chapter describes the following data mining models in more detail:

Each model topic provides the following types of information: a description of the model, examples of desired rules, a discussion of what the discovered rules explain, a discussion of what the model mines, a list of the target variables, a list of columns included in the target views, and sample reports.

About Data Mining in Oracle Retail Data Model

Oracle Retail Data Model includes data mining packages. The data mining portion of Oracle Retail Data Model consists of source tables that are populated by detail data for use by the data mining packages. This data is organized in a specific way to be compatible with the data mining modules so they can properly analyze and mine the data. Data mining packages pull in the source data and feed it into the data mining packages, and populate the target tables with the results. The data in the target tables are presented by the OBIEE reports.

Tip:

Changed or new data models are not supported by Oracle Retail Data Model. Consequently, do not change the data models that are defined and delivered with Oracle Retail Data Model, but, instead, copy a delivered data model to create a new one.

Associate Basket Analysis Model

This model addresses the business problem of building a profile of associates to explain their basket KPIs, such as Total Baskets, Average Basket Value, and other statistics.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
SALARY ELIGIBLITY is (N)
AND
EMPLOYEE_TYPE is (TEMPORARY)
THEN
NUMBER OF BASKETS IS THE HIGHEST

Desired Rules Example 2

IF
SALARY ELIGIBLITY is (N)
AND
EMPLOYEE_TYPE is (TEMPORARY)
THEN
NUMBER OF BASKETS IS THE LOWEST

What the Discovered Rules for the Associate Basket Analysis Model Report Explain

The discovered rules provide correlations between the basket KPIs and associate (employee) attributes.

What the Associate Basket Analysis Model Mines

This analysis identifies which key attributes of an associate influence his or her number of baskets sold, average basket value, and size. This model mines the various attributes of associates. It takes the binned variables one at a time for the Total Basket Count, Average Basket Value, and Average Basket Size as the target variable of an Adaptive Bayes Network (ABN) and Decision Tree (DT) model with a single feature and discovers rules described in terms of associate attributes.

Target Variables for the Associate Basket Analysis Model

The rules are designed to be generated monthly. Therefore, nine ABN and nine DT models are created every month across all the associates using the following variables as targets in this order:

Total Basket Count Quartile (TBCQR)

Total Basket Count Quintile (TBCQN)

Total Basket Count Decile (TBCDE)

Average Basket Value Quartile (ABVQR)

Average Basket Value Quintile (ABVQN)

Average Basket Value Decile (ABVDE)

Average Basket Size Quartile (ABSQR)

Average Basket Size Quintile (ABSQN)

Average Basket Size Decile (ABSDE)

Note:

Associates are grouped into N-Tiles according to their sales performance figures.

Source Variables for the Associate Basket Analysis Model

The following attributes of associates are identified from the Oracle Retail Data Model data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Case Id Alt (PK)

Year

Month

Employee Id

Designation Name

Designation Title

Designation Level

Nationality

Gender

Marital Status

Age

Net Income

Demographics Code

Title

Total Months of Job

Employee Type

Correspondence Language

Disability Indicator

Rehire Recommendation Indicator

HR Based Salary Eligibility Indicator

Overtime Hours Salary Eligibility Indicator

Commission Eligibility Indicator

SPIFF Allowed Flag

Total Hours Worked

Total Overtime Hours

Total Basket Count Quartile (Target)

Total Basket Count Quintile (Target)

Total Basket Count Decile (Target)

Average Basket Value Quartile (Target)

Average Basket Value Quintile (Target)

Average Basket Value Decile (Target)

Average Basket Size Quartile (Target)

Average Basket Size Quintile (Target)

Average Basket Size Decile (Target)

Columns Included in the Target Views of the Associate Basket Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and are also displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

Associate Loss Analysis Model

This model addresses the business problem of correlating associate characteristics to shrink and theft.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
DESIGNATION is (TEMPORARY CASHIER)
AND
COMMISSION_ELIGIBILITY is (Y)
THEN
SHRINK TO SALES RATIO IS THE HIGHEST

Desired Rules Example 2

IF
DESIGNATION is (TEMPORARY CASHIER)
AND
COMMISSION_ELIGIBILITY is (N)
THEN
SHRINK TO SALES RATIO IS THE LOWEST

What the Discovered Rules of the Associate Loss Analysis Model Explain

The discovered rules explain the reasons for the shrink and theft associated with an Associate.

What the Associate Loss Analysis Model Mines

This model mines the Total Shrink Count, Total Shrink Amount, Shrink as a percentage of Sales, Total Theft Count, Total Theft Amount and Theft as a percentage of Sales of individual associates to identify which of their key attributes influence their shrinkage and theft.

This model takes the binned variables one at a time for Total Shrinkage and Theft Count or Value or percentage of Sales as the target variable of an Adaptive Bayes Network (ABN) and Decision Tree (DT) model and discovers rules described in terms of associate attributes.

Target Variables for the Associate Loss Analysis Model

The rule are designed to be generated monthly. Therefore, eighteen ABN models and eighteen DT models are created every month across all the associates using the following variables as targets in this order:

Total Shrink Count Quartile (TSCQR)

Total Shrink Count Quintile (TSCQN)

Total Shrink Count Decile (TSCDE)

Total Shrink Amount Quartile (TSAQR)

Total Shrink Amount Quintile (TSAQN)

Total Shrink Amount Decile (TSADE)

Shrink as a percentage of Sales Quartile (STSQR)

Shrink as a percentage of Sales Quintile (STSQN)

Shrink as a percentage of Sales Decile (STSDE)

Total Theft Count Quartile (TTCQR)

Total Theft Count Quintile (TTCQN)

Total Theft Count Decile (TTCDE)

Total Theft Amount Quartile (TTAQR)

Total Theft Amount Quintile (TTAQN)

Total Theft Amount Decile (TTADE)

Theft as a percentage of Sales Quartile (TTSQR)

Theft as a percentage of Sales Quintile (TTSQN)

Theft as a percentage of Sales Decile (TTSDE)

Source Variables for the Associate Loss Analysis Model

The following attributes of associates are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Case Id Alt (PK)

Year

Month

Employee Id

Designation Name

Designation Title

Designation Level

Nationality

Gender

Marital Status

Age

Net Income

Demographics Code

Title

Total Months of Job

Employee Type

Correspondence Language

Disability Indicator

Rehire Recommendation Indicator

HR Based Salary Eligibility Indicator

Overtime Hours Salary Eligibility Indicator

Commission Eligibility Indicator

SPIFF Allowed Flag

Total Hours Worked

Total Overtime Hours

Total Shrink Count Quartile (Target)

Total Shrink Count Quintile (Target)

Total Shrink Count Decile (Target)

Total Shrink Amount Quartile (Target)

Total Shrink Amount Quintile (Target)

Total Shrink Amount Decile (Target)

Shrink as a percentage of Sales Quartile (Target)

Shrink as a percentage of Sales Quintile (Target)

Shrink as a percentage of Sales Decile (Target)

Total Theft Count Quartile (Target)

Total Theft Count Quintile (Target)

Total Theft Count Decile (Target)

Total Theft Amount Quartile (Target)

Total Theft Amount Quintile (Target)

Total Theft Amount Decile (Target)

Theft as a percentage of Sales Quartile (Target)

Theft as a percentage of Sales Quintile (Target)

Theft as a percentage of Sales Decile (Target)

Columns Included in the Target Views of the Associate Loss Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

The Model Signature Target View outlines the attribute structure of the model (built using an ABN or DT algorithm). The Model Signature Target View contains the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

Associate Sales Analysis Model

This model addresses the business problem of profiling associate characteristics to sales, cost, and profit patterns.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Examples of Desired Rules for the Associate Sales Analysis Model

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
ASSOCIATE IS NOT ELIGIBLE FOR SPIFF
AND
ASSOCIATE IS ELIGIBLE FOR SALARY
AND
ASSOCIATE IS NOT ELIGIBLE FOR COMMISSION
THEN
ASSOCIATE PROFIT IS THE LOWEST

Desired Rules Example 2

IF
ASSOCIATE IS ELIGIBLE FOR SPIFF
AND
ASSOCIATE IS NOT ELIGIBLE FOR SALARY
AND
ASSOCIATE IS NOT ELIGIBLE FOR COMMISSION
THEN
ASSOCIATE PROFIT IS THE HIGHEST

What the Discovered Rules of the Associate Sales Analysis Model Explain

The discovered rules provide correlations between associate (employee) characteristics and their sales, cost, and profit profiles.

What the Associate Sales Analysis Model Mines

This model mines the various attributes of associates. It takes the binned variables one at a time for Sales, Costs, and Profits as the target variable of an Adaptive Bayes Network (ABN) and Decision Tree (DT) model with a single feature and discovers rules described in terms of associate attributes.

Target Variables for the Associate Sales Analysis Model

The rules are designed to be generated monthly. Therefore, nine ABN models and nine DT models are created every month across all the associates using the following variables as targets in this order:

Sales Amount Quartile (SAQR)

Sales Amount Quintile (SAQN)

Sales Amount Decile (SADE)

Cost Amount Quartile (CAQR)

Cost Amount Quintile (CAQN)

Cost Amount Decile (CADE)

Profit Amount Quartile (PADR)

Profit Amount Quintile (PAQN)

Profit Amount Decile (PADE)

Source Variables for the Associate Sales Analysis Model

The following attributes of associates are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Case Id Alt (PK)

Year

Month

Employee Id

Designation Name

Designation Title

Designation Level

Nationality

Gender

Marital Status

Age

Net Income

Demographics Code

Title

Total Months of Job

Employee Type

Correspondence Language

Disability Indicator

Rehire Recommendation Indicator

HR Based Salary Eligibility Indicator

Overtime Hours Salary Eligibility Indicator

Commission Eligibility Indicator

SPIFF Allowed Flag

Total Hours Worked

Total Overtime Hours

Columns Included in the Target Views of the Associate Sales Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Year (PK)

Analysis Desc

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

Customer Category Mix Analysis Model

This model addresses the business problem of discovering product categories that are frequently bought by customers. The model is used to understand the Categories purchased by a Customer in a typical transaction in terms of the components like the Categories in the Basket, Target Category in a Basket and additional information like Basket Significance (Sales Value), Target Category Significance (Sales Value) which are generated from regular Customer Transactional data.

Using Oracle Data Mining, the KPIs are modeled with the APRIORI algorithm utilised by the Association Rules model. The model type used for Association Rules with Apriori Algorithm is APASS. This model type is an example of Unclassified Learning since the Categories (or Target Category) which constitute the Category Basket are not inferred or guided (as part of data preparation) but are generated by the model itself.

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
CUSTOMER HAS BOUGHT 'BABY', 'GRAB AND GO'
THEN
CUSTOMER IS LIKELY TO BUY 'PACKAGED BEVERAGES' (11 Support: 36%, Probability:56%)

Category Basket Significance of ('BABY', 'GRAB AND GO', 'PACKAGED BEVERAGES') is 45% of Sales Value => The Sales from the 3 categories in Category Basket ('BABY', 'GRAB AND GO', 'PACKAGED BEVERAGES') account for 45% of the Total Sales across all categories in that particular store.

The Category Basket Significance (Sales Value) KPI allows us to filter out Rules which may be insignificant from a Basket Sales Value perspective.

Target Category Significance of ('PACKAGED BEVERAGES') is 60% of the Basket Sales Value => The Sales from the Target Category ('PACKAGED BEVERAGES') account for 60% of the Total Sales from the Category Basket ('BABY', 'GRAB AND GO', 'PACKAGED BEVERAGES') in that particular store.

The Target Category Significance (Sales Value) KPI allows us to filter out Rules determining insignificant Customer Purchases (insignificant Target Category). In other words, it helps us to extract Rules which relate to singificant Customer Purchases, where the Target Category is significant within the Basket of Categories (from a Sales Value perspective). Identifying this information is useful from a campaign/promotion/upsell perspective.

What the Discovered Rules for the Customer Category Mix Analysis Model Explain

The discovered rules explain customer behavior and buying patterns regarding various product categories. They help indicate the groups of product categories that sell well together, for example:

The chances of a customer buying a BABY product increases from 11% to 62% if that customer purchases FLORAL and HOT FOODS products. This Rule is 28% significant in terms of Basket Value (Sales Value - all 3 categories in Basket) and The Target Category (BEAUTY) significance is about 75% of the Basket.

The chances of a customer buying a BEAUTY product is 34% if that customer has purchased a product from categories 'HEALTH', 'MAGAZINES' and 'PHARMACY'. This Rule is about 66% significant in terms of Basket Value (Sales Value of all 4 categories in Basket) and The Target Category (BEAUTY) significance is about 45% of the Basket.

BEAUTY products are very rarely sold together (about 3% of the time) with LIQUOR and AUTOMOTIVE PRODUCTS.

The significance of a rule can be measured in terms of support and confidence and a host of additional supporting measures; for example, Basket Significance (Value), Target Category Signifcance (Value), Basket Significance (Customers), Target Category Signifcance (Customers), Basket Significance (Transactions), Target Category Signifcance (Transactions), and so on..

What the Customer Category Mix Analysis Model Mines

This model mines the monthly purchases of individual customers and discovers rules about the categories that are frequently bought in groups by customers.

Target Variables for the Customer Category Mix Analysis Model

The purchase patterns are designed to be generated monthly for each individual store. Therefore, the APASS models are created every month for each store.

Source Variables for the Customer Category Mix Analysis Model

The following item attributes are the source variables:

Case Id Alt (PK)

Store ID

Year

Month

ID

Name

Value

Note:

If the mining must be performed at multiple levels, such as category, subcategory, item, there may be multiple source tables.

Columns Included in the Target Views for the Customer Category Mix Analysis Model Report

The mined patterns and rules are visible through the target view CUST_CATEGORY_MIX_APASS_RULES with the following columns and can be displayed in an OBIEE report:

STORE_ID - Store ID

YEAR - Year

MONTH - Month

MODEL_NAME - Model Name (CCM_MDL_APASS_<Store_ID>)

MODEL_TYPE - Model Type (APASS)

MODEL_TYPE_DESC - Model Description (Apriori Association)

ANALYSIS_NAME - Analysis Name (CUST_CATEGORY_MIX)

ANALYSIS_DESC - Analysis Description (Customer Category Mix Analysis)

RULE_ID - Rule Id .. IF (antecedent) THEN (consequent) END

RULE_ANTECEDENT_ITEMS - List of Categories making up the IF part of the Category Basket (1 upto max as specified in settings while building the model)

RULE_CONSEQUENT_ITEMS - List of Categories making up the THEN part of the Category Basket (usually 1 category)

RULE_SUPPORT - Support (number of cases in the input dataset which pertain to the current basket)

RULE_CONFIDENCE - Confidence (Probability of the THEN part of the Rule coming true based on input dataset)

RULE_DISPLAY_ORDER - Default or recommended display order of the rules. Critical for DT models. Not critical for APASS models.

RULE_LENGTH - Number of Categories in the antecedent (IF) part of the Rule

BSKT_CTGRY_COUNT - Number of categories in the Category Basket (includes antecedent and consequent)

BSKT_ALL_SLS_VAL - Sales Value (total) for all categories in the Category Basket

BSKT_ALL_SLS_UNITS - Sales Units (total) for all categories in the Category Basket

BSKT_ALL_TRX_COUNT - Transaction Count for all categories in the Category Basket

BSKT_ALL_CUST_COUNT - Customer Count for all categories in the Category Basket

Customer Loyalty Analysis Model

This model addresses the business problem of discovering the impact of customer characteristics on customers' loyalty to a store.

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Target Variable for the Customer Loyalty Analysis Model

The rules are designed to be generated monthly. Therefore, one ABN model and one DT model is created every month across all stores combined using the Customer Loyalty variable as the target.

Source Variables for the Customer Loyalty Analysis Model

The following attributes of customers are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Case Id Alt (PK)

Year

Month

Customer Number

Registered as Gift Receiver

Registered as Gift Giver

Customer Occasion Type This Month

Campaign This Month

Membership Account Type Code (None if the customer does not have any account; the last used account if the customer has multiple accounts)

Life-To-Date Points

Available Points

Customer Account Type (None if the customer does not have any account; the last used account if the customer has multiple accounts)

Customer Identity Required Indicator

Customer Identity Type Name

Customer Group Code (None if the customer does not belong to any group)

Age

Marital Status

Gender

Income

Race

Education

Profession

Household Size

Years of Residence

Demography Group Name

Customer County or District

Customer City

Customer State

Customer Country

Customer World Region

Loyalty Categories for the Customer Loyalty Analysis Model

The RFMP algorithms provide functionality to group customers into quartiles, deciles, and quintiles. Each customer falls into one of the following five loyalty categories based on the RFMP quintile he or she belongs to in a particular month:

Group A (RFMP Quintile 5)

Group B (RFMP Quintile 4)

Group C (RFMP Quintile 3)

Group D (RFMP Quintile 2)

Group E (RFMP Quintile 1)

Note:

The definition of each of the loyalty types as well as the number of loyalty types may vary with each implementation.

Columns Included in the Target Views for the Customer Loyalty Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

Frequent Shopper Category Mix Analysis Model

This model addresses the business problem of finding product categories that are frequently bought by frequent shoppers. Finding these product categories can help in optimizing Merchandising and Category Mix options that relate to Store Layout, Display and Frontage, Promotional Campaigns, Co-branding, and others.

The model is used to understand the Categories purchased by a Frequent Shopper in a typical transaction in terms of the components like the Categories in the Basket, Target Category in a Basket and additional information like Basket Significance (Sales Value), Target Category Significance (Sales Value) which are generated from regular Customer Transactional data.

Using Oracle Data Mining, the KPIs are modeled with the APRIORI algorithm utilised by the Association Rules model. The model type used for Association Rules with Apriori Algorithm is APASS. This model type is an example of Unclassified Learning since the Categories (or Target Category) which constitute the Category Basket are not inferred or guided (as part of data preparation) but are generated by the model itself.

IF
CUSTOMER HAS BOUGHT 'FILM ACCESSORIES'
AND
CUSTOMER HAS BOUGHT 'MAGAZINE'
AND
CUSTOMER HAS BOUGHT 'PET'
THEN
CUSTOMER IS LIKELY TO BUY 'BEAUTY' [Support – 48%, Confidence - 82%]

Category Basket Significance of ('FILM ACCESSORIES', 'MAGAZINES', PET', 'BEAUTY') is 62% of Sales Value => The Sales from the 4 categories in Category Basket ('FILM ACCESSORIES', 'MAGAZINES', PET', 'BEAUTY') account for 62% of the Total Sales across all categories in that particular store.

The Category Basket Significance (Sales Value) KPI allows us to filter out Rules which may be insignificant from a Basket Sales Value perspective.

Target Category Significance of (BEAUTY') is 70% of the Basket Sales Value => The Sales from the Target Category ('BEAUTY') account for 70% of the Total Sales from the Category Basket ('FILM ACCESSORIES', 'MAGAZINES', PET', 'BEAUTY') in that particular store.

The Target Category Significance (Sales Value) KPI allows us to filter out Rules determining insignificant Customer Purchases (insignificant Target Category). In other words, it helps us to extract Rules which relate to singificant Customer Purchases, where the Target Category is significant within the Basket of Categories (from a Sales Value perspective). Identifying this information can be useful from a campaign/promotion/upsell perspective.

What the Discovered Rules for the Frequent Shopper Category Mix Analysis Model Explain

The discovered rules help explain purchase patterns of frequent shoppers, for example:

The chances of a frequent shopper buying a BEAUTY product increases from 30% to 70% if he or she purchases a FILM ACCESSORIES product. It further increases to 74% if the frequent shopper buys a MAGAZINES product and to 82% if he or she also buys a PET product.

The chances of a customer buying a BEAUTY product is 82% if he or she has purchased a product from categories 'FILM ACCESSORIES', 'MAGAZINES' and 'PET'. This Rule is about 62% significant in terms of Basket Value (Sales Value of all 4 categories in Basket) and The Target Category (BEAUTY) significance is about 70% of the Basket.

BEAUTY products are very rarely (about 3% of the time) sold with BAKERY products.

The significance of a rule can be measured in terms of support and confidence and a host of additional supporting measures like Basket Significance (Value), Target Category Signifcance (Value), Basket Significance (Customers), Target Category Signifcance (Customers), Basket Significance (Transactions), Target Category Signifcance (Transactions), and so on.

What the Frequent Shopper Category Mix Analysis Model Mines

This model mines the monthly purchases of individual frequent shoppers and discovers rules about the categories that are frequently bought in groups by frequent shoppers.

Target Variable for the Frequent Shopper Category Mix Analysis Model

The purchase patterns are designed to be generated monthly for each individual store. Therefore, APASS models are created every month for each store.

Source Variables for the Frequent Shopper Category Mix Analysis Model

The following item attributes are the source variables:

Case Id Alt (PK)

Store ID

Year

Month

ID

Name

Value

Note:

If the mining must be performed at multiple levels, such as category, subcategory, item, and others, there may be multiple source tables.

Columns Included in the Target Views of the Frequent Shopper Category Mix Analysis Model Report

The mined patterns and rules are visible through the target view FS_CATEGORY_MIX_APASS_RULES with the following columns and can be displayed in an OBIEE report:

STORE_ID - Store ID

YEAR - Year

MONTH - Month

MODEL_NAME - Model Name (CCM_MDL_APASS_<Store_ID>)

MODEL_TYPE - Model Type (APASS)

MODEL_TYPE_DESC - Model Description (Apriori Association)

ANALYSIS_NAME - Analysis Name (CUST_CATEGORY_MIX)

ANALYSIS_DESC - Analysis Description (Customer Category Mix Analysis)

RULE_ID - Rule Id .. IF (antecedent) THEN (consequent) END

RULE_ANTECEDENT_ITEMS - List of Categories making up the IF part of the Category Basket (1 upto max as specified in settings while building the model)

RULE_CONSEQUENT_ITEMS - List of Categories making up the THEN part of the Category Basket (usually 1 category)

RULE_SUPPORT - Support (number of cases in the input dataset which pertain to the current basket)

RULE_CONFIDENCE - Confidence (Probability of the THEN part of the Rule coming true based on input dataset)

RULE_DISPLAY_ORDER - Default or recommended display order of the rules. Critical for DT models. Not critical for APASS models.

RULE_LENGTH - Number of Categories in the antecedent (IF) part of the Rule

BSKT_CTGRY_COUNT - Number of categories in the Category Basket (includes antecedent and consequent)

BSKT_ALL_SLS_VAL - Sales Value (total) for all categories in the Category Basket

BSKT_ALL_SLS_UNITS - Sales Units (total) for all categories in the Category Basket

BSKT_ALL_TRX_COUNT - Transaction Count for all categories in the Category Basket

BSKT_ALL_CUST_COUNT - Customer Count for all categories in the Category Basket

Item Basket Analysis Model

This model addresses the business problem of identifying the extent to which item (product) characteristics influence the items' sales KPIs.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

IF
FOOD STAMP is NOT allowed
AND
ENVIRONMENT TYPE CODE is (REFRIGERATED)
AND
FREQUENT_SHOPPER_POINTS (800 - 999)
THEN
AVERAGE BASKET VALUE is (LOWEST)

What the Discovered Rules for the Item Basket Analysis Model Explain

The discovered rules draw the profile of items that have the extreme values of the target KPI. For example, the Examples of Desired Rules for this model discover the profiles of items showing extreme average basket values.

What the Item Basket Analysis Model Mines

This model identifies which key attributes of an item influence the number of baskets sold, average basket value, and size in a particular store. This model mines the various attributes of items. It takes the binned variables one at a time for Total Basket Count, Average Basket Value, and Average Basket Size as the target variable of an ABN model and DT model with a single feature and discovers rules described in terms of item characteristics.

Target Variable for the Item Basket Analysis Model

The rules are designed to be generated monthly for each individual store. Therefore, nine ABN and nine DT models are created every month for each store using the following variables as targets in this order:

Total Basket Count Quartile (TBCQR)

Total Basket Count Quintile (TBCQN)

Total Basket Count Decile (TBCDE)

Average Basket Value Quartile (ABVQR)

Average Basket Value Quintile (ABVQN)

Average Basket Value Decile (ABVDE)

Average Basket Size Quartile (ABSQR)

Average Basket Size Quintile (ABSQN)

Average Basket Size Decile (ABSDE)

Source Variables for the Item Basket Analysis Model

The following item attributes are identified from the data warehouse tables as source variables for the ABN and DT models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Case Id Alt (PK)

Store ID

Year

Month

Item ID

Brand Name

Category Name

Department Name

Customer Pickup Type Code

Discount Indicator

Environment Type Code

Hazardous Material Type Code

Perishable Indicator

Kit Set Code

Order Collection Code

Price Audit Flag

Sale Weight or Unit Count Code

Security Required Type Code

Sell Unit Landed Cost Amount

Sell Unit Last Received Base Cost Amount

Sell Unit Last Received Net Cost Amount

Item Sale Unit Price Amount

Shrink Flag

Substitute Identified Indicator

Swell Flag

Item Usage Code

Vendor Item Number

Max Shipping Capability

Min Order Quantity

Sale Unit per Packet Unit Count

Shipping Capability Units

Store Order Allowed Flag

Store Receipt Allowed Flag

Style Description

Terms Code

Vendor Number

Vendor Class Code

Buy Status Indicator

Credit Limit Offered

Inform Government Indicator

Vendor Number of Years in Business

Pay Status Indicator

Competitor Retail Item Name

Competitor Name

Competitor Item Local Advertising Flag

Competitor Item On Promotion Flag

Competitor Item Promotion Store Coupon Indicator

Competitor Sale Unit Price Amount

Allow Coupon Multiply Indicator

Allow Food Stamp Indicator

Coupon Restricted Indicator

Electronic Coupon Flag

Employee Discount Allowed Flag

Frequent Shopper Points

Frequent Shopper Points Eligibility Indicator

Give Away Flag

Item Tender Restriction Group Code

Manufacturer

Manufacturer Family Code

Maximum Sale Unit Count

Price Entry Required Flag

Prohibit Repeat Key Flag

Prohibit Return Flag

Selling Status Code

Visual Verify Price Flag

Weight Entry Required Flag

Columns Included in the Target Views for the Item Basket Analysis Model Report

The mined patterns and rules are visible through in a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Store ID (PK)

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

Item POS Loss Analysis Model

This model addresses the business problem of building a profile of item (product) characteristics regarding POS losses.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

What the Discovered Rules for the Item POS Loss Analysis Model Explain

What the Item POS Loss Analysis Model Mines

This model mines the POS transactions along with the item attributes to identify their impact on Total Shrink Count, Total Shrink Amount, Shrink as a percentage of Sales, Total Theft Count, Total Theft Amount, and Theft as a percentage of Sales.

Target Variables for the Item POS Loss Analysis Model

The rules are designed to be generated monthly for each individual store. Therefore, eighteen ABN and eighteen DT models are created every month for each of the stores using the following variables as targets in this order:

Total Shrink Count Quartile (TSCQR)

Total Shrink Count Quintile (TSCQN)

Total Shrink Count Decile (TSCDE)

Total Shrink Amount Quartile (TSAQR)

Total Shrink Amount Quintile (TSAQN)

Total Shrink Amount Decile (TSADE)

Shrink as a percentage of Sales Quartile (STSQR)

Shrink as a percentage of Sales Quintile (STSQN)

Shrink as a percentage of Sales Decile (STSDE)

Total Theft Count Quartile (TTCQR)

Total Theft Count Quintile (TTCQN)

Total Theft Count Decile (TTCDE)

Total Theft Amount Quartile (TTAQR)

Total Theft Amount Quintile (TTAQN)

Total Theft Amount Decile (TTADE)

Theft as a percentage of Sales Quartile (TTSQR)

Theft as a percentage of Sales Quintile (TTSQN)

Theft as a percentage of Sales Decile (TTSDE)

Source Variables for the Item POS Loss Analysis Model

The following attributes of POS and items are identified from the data warehouse tables as source variables for the models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Case Id Alt (PK)

Store

Year

Month

Item ID

Brand Name

Category Name

Department Name

Customer Pickup Type Code

Discount Indicator

Hazardous Material Type Code

Perishable Indicator

Kit Set Code

Order Collection Code

Price Audit Flag

Sale Weight or Unit Count Code

Security Required Type Code

Sell Unit Landed Cost Amount

Sell Unit Last Received Base Cost Amount

Sell Unit Last Received Net Cost Amount

Item Sale Unit Price Amount

Shrink Flag

Substitute Identified Indicator

Swell Flag

Item Usage Code

Vendor Item Number

Max Shipping Capability

Min Order Quantity

Sale Unit per Packet Unit Count

Shipping Capability Units

Store Order Allowed Flag

Store Receipt Allowed Flag

Style Description

Terms Code

Vendor Number

Vendor Class Code

Buy Status Indicator

Credit Limit Offered

Inform Government Indicator

Vendor Number of Years in Business

Pay Status Indicator

Competitor Retail Item Name

Competitor Name

Competitor Item Local Advertising Flag

Competitor Item On Promotion Flag

Competitor Item Promotion Store Coupon Indicator

Competitor Sale Unit Price Amount

Allow Coupon Multiply Indicator

Allow Food Stamp Indicator

Coupon Restricted Indicator

Electronic Coupon Flag

Employee Discount Allowed Flag

Frequent Shopper Points

Frequent Shopper Points Eligibility Indicator

Give Away Flag

Item Tender Restriction Group Code

Manufacturer

Manufacturer Family Code

Maximum Sale Unit Count

Price Entry Required Flag

Prohibit Repeat Key Flag

Prohibit Return Flag

Selling Status Code

Visual Verify Price Flag

Weight Entry Required Flag

Retail Transaction Attributes

Total Number of Retail Transactions For Item

Total Amount of Retail Transactions For Item

Average Amount Per Retail Transaction For Item

Number of Distinct Currency Used For Item

Total Units Sold For Item

Average Units Sold Per Retail Transaction For Item

Total Idle Interval For Item (This attribute is the sum of idle intervals of all transactions that contain this Item)

Average Idle Interval Per Retail Transaction For Item

Total Ring Interval For Item (This attribute is the sum of ring intervals of all transactions that contain this item)

Average Ring Interval Per Retail Transaction For Item

Total Tender Interval For Item (This attribute is the sum of tender intervals of all transactions that contain this item)

Average Tender Interval Per Retail Transaction For Item

Total Lock Interval For Item (This attribute is the sum of lock intervals before or after all transactions that contain this item)

Average Lock Interval Per Retail Transaction For Item

Total Line Items Scanned For Item (This attribute is the total number of times this item is scanned)

Average Line Items Scanned Per Units Sold For Item

Total Line Items Keyed For Item (This attribute is the total number of times this item is keyed)

Average Line Items Keyed Per Units Sold

Total Key Department Count For Item (This attribute is the total number of times this item is keyed by the department)

Average Key Department Count Per Units Sold

Total Service Charge For Item

Average Service Charge Per Retail Transaction For Item

Total Tax Amount For Item

Average Tax Amount Per Retail Transaction For Item

Total Number of Voided Transactions For Item

Average Number of Voided Transactions Per Retail Transaction For Item

Total Amount of Voided Transactions For Item

Average Amount of Voided Transactions Per Retail Transaction For Item

Average Amount of Voided Transaction as Percentage of Total Retail Transaction Amount For Item

Total Number of Discount Line Items For Item

Average Number of Discount Line Items Per Retail Transaction For Item

Total Amount of Discount Line Items For Item

Average Amount of Discount Line Items Per Retail Transaction For Item

Average Amount of Discount Line Items as Percentage of Total Retail Transaction Amount For Item

Total Number of Return Line Items For Item

Average Number of Return Line Items Per Retail Transaction For Item

Total Amount of Return Line Items For Item

Average Amount of Return Line Items Per Retail Transaction For Item

Average Amount of Return Line Items as Percentage of Total Retail Transaction Amount For Item

Total Number of Miscellaneous Fee Line Items For Item

Average Number of Miscellaneous Fee Line Items Per Retail Transaction For Item

Total Amount of Miscellaneous Fee Line Items For Item

Average Amount of Miscellaneous Fee Line Items Per Retail Transaction For Item

Average Amount of Miscellaneous Fee Line Items as Percentage of Total Retail Transaction Amount For Item

Total Number of Promotional Line Items For Item

Average Number of Promotional Line Items Per Retail Transaction For Item

Total Amount of Promotional Line Items For Item

Average Amount of Promotional Line Items Per Retail Transaction For Item

Average Amount of Promotional Line Items as Percentage of Total Retail Transaction Amount For Item

Total Number of Deposit Redemption Line Items For Item

Average Number of Deposit Redemption Line Items Per Retail Transaction For Item

Total Amount of Deposit Redemption Line Items For Item

Average Amount of Deposit Redemption Line Items Per Retail Transaction For Item

Average Amount of Deposit Redemption Line Items as Percentage of Total Retail Transaction Amount For Item

Control Transaction Attributes

Total Tax Exempt Transaction Count For Item

Average Tax Exempt Transaction Count Per Retail Transaction For Item

Tax Exempt Total Amount For Item

Average Tax Exempt Amount Per Retail Transaction For Item

Tax Exempt Total Amount as a Percentage of Total Retail Transaction Amount For

Item

Total Number of Store Coupons For Item

Average Number of Store Coupons Per Retail Transaction For Item

Average Number of Store Coupons Per Retail Transaction For Item

Total Amount of Store Coupons For Item

Average Amount of Store Coupons Per Retail Transaction For Item

Average Amount of Store Coupons as Percentage of Total Retail Transaction

Amount For Item

Total Markdown Count For Item

Average Markdown Count per Retail Transaction For Item

Markdown Total Amount For Item

Average Markdown Amount Per Retail Transaction For Item

Average Markdown Amount as a Percentage of Total Retail Transaction Amount

For Item

Total Employee Discount Count For Item

Average Employee Discount Per Retail Transaction For Item

Total Employee Discount Amount For Item

Average Employee Discount Amount Per Retail Transaction For Item

Average Employee Discount Amount as a Percentage of Retail Transaction

Amount For Item

Total Weighed Line Item Count For Item

Average Weighed Line Item Count Per Retail Transaction For Item

Total Weighed Line Item Amount For Item

Average Weighed Line Item Amount Per Retail Transaction For Item

Average Weighed Line Item Amount as a Percentage of Total Retail Transaction

Amount For Item

Total Layaway Payments Collected Count For Item

Average Layaway Payments Collected Count Per Retail Transaction For Item

Total Layaway Payments Collected Amount For Item

Average Layaway Payments Collected Amount Per Retail Transaction For Item

Average Layaway Payments Collected Amount as a Percentage of Total Retail

Transaction Amount For Item

Total Container Deposit Count For Item

Average Container Deposit Count Per Retail Transaction For Item

Total Container Deposit Amount For Item

Average Container Deposit Amount Per Retail Transaction For Item

Average Container Deposit Amount as a Percentage of Total Retail Transaction

Amount For Item

Total Redeemed Container Deposit Count For Item

Average Redeemed Container Deposit Count Per Retail Transaction For Item

Total Redeemed Container Deposit Amount For Item

Average Redeemed Container Deposit Amount Per Retail Transaction For Item

Average Redeemed Container Deposit Amount as a Percentage of Total Retail

Transaction Amount For Item

Total Cash Tender Count For Item

Average Cash Tender Count Per Retail Transaction For Item

Total Cash Tender Amount For Item

Average Cash Tender Amount Per Retail Transaction For Item

Average Cash Tender Amount as a Percentage of Total Retail Transaction Amount

For Item

Total Check Tender Count For Item

Average Check Tender Count Per Retail Transaction For Item

Total Check Tender Amount For Item

Average Check Tender Amount Per Retail Transaction For Item

Average Check Tender Amount as a Percentage of Total Retail Transaction

Amount For Item

Total Credit Card Tender Count For Item

Average Credit Card Tender Count Per Retail Transaction For Item

Total Credit Card Tender Amount For Item

Average Credit Card Tender Amount Per Retail Transaction For Item

Average Credit Card Tender Amount as a Percentage of Total Retail Transaction

Amount For Item

Total Debit Card Tender Count For Item

Average Debit Card Tender Count Per Retail Transaction For Item

Total Debit Card Tender Amount For Item

Average Debit Card Tender Amount Per Retail Transaction For Item

Average Debit Card Tender Amount as a Percentage of Total Retail Transaction

Amount For Item

Total Customer Account Tender Count For Item

Average Customer Account Tender Count Per Retail Transaction For Item

Total Customer Account Tender Amount For Item

Average Customer Account Tender Amount Per Retail Transaction

Average Customer Account Tender Amount as a Percentage of Total Retail

Transaction Amount

Total Gift Certificate Tender Count For Item

Average Gift Certificate Tender Count Per Retail Transaction For Item

Total Gift Certificate Tender Amount For Item

Average Gift Certificate Tender Amount Per Retail Transaction For Item

Amount For Item

Total Coupon Tender Count For Item

Average Coupon Tender Count Per Retail Transaction For Item

Total Coupon Tender Amount For Item

Average Coupon Tender Amount Per Retail Transaction For Item

Average Coupon Tender Amount as a Percentage of Total Retail Transaction

Amount For Item

Columns Included in the Target Views of the Item POS Loss Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Store ID (PK)

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

POS Flow Analysis Model

This model addresses the business problem of detecting patterns in the flow of items, transactions, and amount across individual points of sale during different time periods.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Examples of Desired Rules for the POS Flow Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
NUMBER_OF_HOUSEHOLDS IS<3000
AND
STORE LOCATION TYPE IS 'Free Standing'
AND
PER CAPITA INCOME IS<3000
THEN
Total Return Items Count Quartile IS THE LOWEST

Desired Rules Example 2

IF
NUMBER_OF_HOUSEHOLDS IS>5000
AND
AVERAGE_DRIVE_TIME_MIN IS<15
AND
(STORE_USAGE IS 'Store Within a Store' OR STORE_USAGE IS 'Department')
AND
STORE_OPEN_HOURS IS>=16
THEN
Total Sale Transactions Count Quartile IS THE HIGHEST

What the Discovered Rules for the POS Flow Analysis Model Explain

The discovered rules describe the influence of trade area demographic characteristics and store characteristics on the amount of POS traffic at individual workstations over different periods throughout the day.

What the POS Flow Analysis Model Mines

This model mines the various attributes of Store and Location. It takes the binned variables one at a time for Transaction Type, Transaction Count, and Transaction Amount (Sales) as the target variable of an ABN model and DT model with a single feature and discovers rules described in terms of Store, Location, and State Demographic attributes.

Target Variables for the POS Flow Analysis Model

The rules are generated from the historical data across all time periods and store workstations. A total of twenty-seven ABN and twenty-seven DT models with a single feature are created using the following variables as targets in this order:

Total Sale Transactions Count Quartile (TSTCQR)

Total Sale Transactions Count Quintile (TSTCQN)

Total Sale Transactions Count Decile (TSTCDE)

Total Return Transactions Count Quartile (TRTCQR)

Total Return Transactions Count Quintile (TRTCQN)

Total Return Transactions Count Decile (TRTCDE)

Total Void Transactions Count Quartile (TVTCQR)

Total Void Transactions Count Quintile (TVTCQN)

Total Void Transactions Count Decile (TVTCDE)

Total Sale Items Count Quartile (TSICQR)

Total Sale Items Count Quintile (TSICQN)

Total Sale Items Count Decile (TSICDE)

Total Return Items Count Quartile (TRICQR)

Total Return Items Count Quintile (TRICQN)

Total Return Items Count Decile (TRICDE)

Total Void Items Count Quartile (TVICQR)

Total Void Items Count Quintile (TVICQN)

Total Void Items Count Decile (TVICDE)

Total Sale Amount Quartile (TSAQR)

Total Sale Amount Quintile (TSAQN)

Total Sale Amount Decile (TSADE)

Total Return Amount Quartile (TRAQR)

Total Return Amount Quintile (TRAQN)

Total Return Amount Decile (TRADE)

Total Void Amount Quartile (TVAQR)

Total Void Amount Quintile (TVAQN)

Total Void Amount Decile (TVADE)

Source Variables for the POS Flow Analysis Model

The following attributes of store workstation and time periods are identified from the data warehouse tables as source variables for the ABN and DT models (note that a few of these variables are unique identifiers and are treated as supplementary variables):

Columns Included in the Target Views of the POS Flow Analysis Model Report

The mined patterns and rules are visible through a target view with the following columns and can be displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Model Type

Model Type Desc

Model Name (PK)

Rule ID (PK)

Performance Measure

Measure Value

Associate Profile

Prediction Count

Record Count

Support

Confidence

Rule Display Order

A new target view representing the Model Signature outlining the attribute structure of the model (built using an ABN or DT algorithm) is also available.

The Model Signature Target View contains the following columns and is also displayed in an OBIEE report:

Analysis Name (PK)

Analysis Desc

Year (PK)

Month (PK)

Model Type

Model Type Desc

Model Name (PK)

Performance Measure

Attribute Name (PK)

Attribute Type

Store Loss Analysis Model

This model addresses the business problem of building a profile of organization (store) characteristics regarding shrinkage.

The KPIs are converted into categorical variables using standard database binning operations. The categorical variables are modeled as a classification model to identify or predict the impact of various independent variables (attributes) on the dependent target variable (KPI - categorical).

Using Oracle Data Mining, the KPIs are modeled using two popular Classification Algorithms - Adaptive Bayes Network (ABN) and Decision Tree (DT).

Adaptive Bayes Network (ABN) algorithm is used to build a fast scalable model with scalable rules whereas the Decision Tree (DT) algorithm is used when explicit rules explaining predictions are needed.

Examples of Desired Rules for the Store Loss Analysis Model

This section provides examples of the desired rules.

Note:

Discovery rules are parsed to make them easier to read (replacing the Column or Attribute names with descriptions), removing keywords or phrases like "isIn" with =, "AND" with "and", and so on.

Desired Rules Example 1

IF
STORE IS NEW
and
NUMBER OF WINDOW DISPLAYS IS (4 - 5)
and
STORE DEPARTMENT IS (RETURN)
THEN
STORE THEFT AMOUNT IS THE HIGHEST

Desired Rules Example 2

IF
STORE IS NEW
and
NUMBER OF WINDOW DISPLAYS IS (4 - 5)
and
STORE DEPARTMENT IS (GIFT)
THEN
STORE THEFT AMOUNT IS THE LOWEST

What the Discovered Rules for the Store Loss Analysis ModelExplain

The discovered rules describe correlations between shrinkage and store characteristics.

What the Store Loss Analysis Model Mines

This analysis identifies the extent to which key store characteristics influence shrinkage and theft. This model mines the various attributes of stores. It takes the binned variables one at a time for Shrink and Thefts as the target variable of an ABN model and DT model with a single feature and discovers rules described in terms of store attributes.

Target Variables for the Store Loss Analysis Model

The rules are designed to be generated monthly. Therefore, eighteen ABN and eighteen DT models are created every month across all stores using the following variables as targets in this order:

Total Shrink Count Quartile (TSCQR)

Total Shrink Count Quintile (TSCQN)

Total Shrink Count Decile (TSCDE)

Total Shrink Amount Quartile (TSAQR)

Total Shrink Amount Quintile (TSAQN)

Total Shrink Amount Decile (TSADE)

Shrink as a percentage of Sales Quartile (STSQR)

Shrink as a percentage of Sales Quintile (STSQN)

Shrink as a percentage of Sales Decile (STSDE)

Total Theft Count Quartile (TTCQR)

Total Theft Count Quintile (TTCQN)

Total Theft Count Decile (TTCDE)

Total Theft Amount Quartile (TTAQR)

Total Theft Amount Quintile (TTAQN)

Total Theft Amount Decile (TTADE)

Theft as a percentage of Sales Quartile (TTSQR)

Theft as a percentage of Sales Quintile (TTSQN)

Theft as a percentage of Sales Decile (TTSDE)

Source Variables for the Store Loss Analysis Model

The following attributes of associates are identified from the data warehouse tables as source variables for the ABN and DT models (note that a few of these variables are unique identifiers and are treated as supplementary variables):