Math533 Part a

PART A- Exploratory Data Analysis
Introduction & Overview
AJ Davis is a department store chain that has many credit customers and wants to learn more about them. A sample of 50 credit customers was selected, and the following data were collected for the analysis:

1. Location:
a. Urban
b. Suburban
c. Rural
2. Income
3. Household Size (number of people living in the household)
4. Years (the number of years that the customer has lived in the current location)
5. Credit Balance (the customer’s current credit card balance on the store's credit card)

Individual Variables
Five individual variables were provided for review: Location, Income, Household Size, Years, and Credit Balance. Below is a statistical analysis summarizing the key points for Location, Income and Credit Balance.

Variable: Location
The location of AJ Davis’ customers is divided among three classes: urban, suburban and rural. Of the 50 customers in the sample, 13 are located in rural, 15 in suburban and 22 in urban areas. The pie chart shows that just under half of AJ Davis’ customers live in urban areas (44%), while the remaining customers are split fairly evenly between rural (26%) and suburban (30%) areas. Together, the rural and suburban areas comprise 56% of AJ Davis’ credit customer base. This should be useful information to AJ Davis, as close to half of their credit customers are from urban locations.
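The percentages quoted for the pie chart can be checked directly from the three counts reported above. The following is a minimal sketch; the function name `location_shares` is only an illustrative choice and is not part of the original analysis.

```python
# Location counts as reported in the text (13 rural, 15 suburban, 22 urban).
counts = {"Urban": 22, "Suburban": 15, "Rural": 13}


def location_shares(counts):
    """Return each category's share of the total as a percentage."""
    total = sum(counts.values())
    return {name: 100 * n / total for name, n in counts.items()}


shares = location_shares(counts)
for name, pct in shares.items():
    print(f"{name}: {pct:.0f}%")

# Combined share of the two non-urban categories.
non_urban = shares["Suburban"] + shares["Rural"]
print(f"Suburban + Rural: {non_urban:.0f}%")
```

The counts sum to 50, and the shares come out to 44%, 30% and 26%, with suburban and rural together making up 56%.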

Variable: Income
The range of customer incomes is $49,000, with the highest income at $74,000 and the lowest at $25,000. The mean (average) income of a customer is $46,020. The median income is $44,500 (the halfway point of the 50 incomes, with 25 incomes below and 25 above). The most frequent values (modes) are $54,000 and $57,000. This information is beneficial to AJ Davis because it gives them a snapshot of customer income, so that they are better able to assess their customers’ ability to repay their debt and to gauge whether credit limits could be increased.

This boxplot displays this information in a graphical format. The box spans from the lower quartile of $33,000 (25% of incomes are less than this value) to the upper quartile of $57,250 (25% of incomes are greater than this value), with the median of $44,500 marked inside the box.

Variable: Credit Balance

The average (mean) credit balance of AJ Davis’ customers is $4,153; the largest credit balance is $5,861 and the smallest is $2,047, resulting in a range of $3,814. The standard deviation is $932. This indicates that the credit balances in the sample vary relatively little around the mean of $4,153. The median credit balance is $4,273; the median often provides a clearer picture of a data set because it is less affected by extreme values. In this case, however, the median and mean are relatively close (a difference of only $120), so both are representative of an “average” credit balance.
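One way to put a number on how much the balances vary is the coefficient of variation, which expresses the standard deviation as a fraction of the mean. The sketch below uses only the summary figures quoted above. The coefficient of variation is not computed in the original analysis and is added here purely as an illustration.

```python
# Summary statistics for Credit Balance as quoted in the text (in dollars).
mean_balance = 4153
median_balance = 4273
std_dev = 932
largest, smallest = 5861, 2047

# Range: largest minus smallest observation.
balance_range = largest - smallest

# Coefficient of variation: standard deviation relative to the mean.
cv = std_dev / mean_balance

# Gap between median and mean; a small gap suggests little skew.
gap = median_balance - mean_balance

print(f"Range: ${balance_range:,}")
print(f"Coefficient of variation: {cv:.1%}")
print(f"Median - mean: ${gap}")
```

With these figures the standard deviation is a little over one fifth of the mean, and the median exceeds the mean by $120.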

This boxplot displays the collected information in a graphical format. The box spans from the lower quartile of $3,292 (25% of balances are less than this value) to the upper quartile of $4,931 (25% of balances are greater than this value), with the median of $4,273 marked inside. As shown, a boxplot is a valuable tool for analyzing this data because the box gives a good graphical image of where the data are concentrated. It also shows how far the extreme values lie from most of the data, as indicated by the top and the bottom of the vertical line.
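The quartiles above can also be used to check whether the extreme balances would be flagged as outliers. The sketch below assumes the common 1.5 × IQR convention for boxplot fences; the original text does not state which rule its boxplot used, and `tukey_fences` is a hypothetical helper name.

```python
def tukey_fences(q1, q3, k=1.5):
    """Return (lower, upper) outlier fences using the k * IQR rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Credit Balance quartiles and extremes as quoted in the text (in dollars).
q1, q3 = 3292, 4931
smallest, largest = 2047, 5861

lower, upper = tukey_fences(q1, q3)

# An observation is flagged only if it falls outside the fences.
has_outliers = smallest < lower or largest > upper

print(f"IQR: {q3 - q1}")
print(f"Fences: ({lower}, {upper})")
print(f"Any outliers: {has_outliers}")
```

Under that assumption the fences are $833.50 and $7,389.50, so both the smallest ($2,047) and the largest ($5,861) balance fall inside them and no point would be marked as an outlier.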

Relationships
Below are three sets of paired variables from the given data.
* Location and Credit Balance
* Credit Balance and Size
* Income and Credit Balance

Location and Credit Balance

A look at the mean credit balances by location shows that the highest...


...
Course Project: AJ DAVIS DEPARTMENT STORES
Part A
The following report presents the details of the statistical analysis of the data collected from a sample of fifty credit customers of the AJ Davis department store chain. The data collected was based on five variables: location, income, size, years at current location, and credit balance. The first variable analyzed was Location. Location is a categorical variable with three subcategories: Urban, Suburban and Rural. A frequency distribution and pie chart for the subcategories are shown below:
Location Frequency Distribution:
Location | Frequency
Urban | 21
Suburban | 15
Rural | 14
The above pie chart and frequency distribution illustrate that the largest number of customers are in the urban category (42%), followed by those in the suburban category (30%). Only 28% of the customers fall into the rural category.
Size was the next variable to be analyzed; this is a quantitative variable. Measures of central tendency and variation were calculated for the Size variable, and a bar graph was produced. After analyzing the data for Size, it was determined that the mean household size of the customers is 3.42. The median of the data collected is 3, and the mode is 2. The standard deviation, rounded to the nearest hundredth, is 1.74. Upon reviewing the...

...Chanelle Henderson
Math533 Applied Managerial Statistics
Prof. Lisabeth Goble
Week 6: Project B: Hypothesis Testing and Confidence Intervals
A. The average annual income was less than $50,000
Null Hypothesis: The average annual income was greater than or equal to $50,000.
H₀: µ ≥ 50,000
Alternate Hypothesis: The average annual income was less than $50,000. (Claim)
Ha: µ < 50,000
Analysis Plan: Significance level, α = 0.05. Since the sample size n > 30, I will use a z-test for the mean to test the given hypothesis. As the alternative hypothesis is Ha: µ < 50,000, the test is a one-sided (left-tailed) z-test.
Critical Value and Decision Rule: The critical value at significance level α = 0.05 for a left-sided z-test is -1.645.
Decision Rule: Reject H₀ if z < -1.645.
Test Statistic - Minitab
One-Sample Z: Income ($1000)
Test of mu = 50 vs < 50
The assumed standard deviation = 14.64
                      95% Upper
 N   Mean  SE Mean        Bound      Z      P
50  43.74     2.07        47.15  -3.02  0.001
Interpretation of Results and Conclusion: Since the P-value (0.001) is smaller than the significance level (0.05), we reject the null hypothesis. The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed; here it is the area to the left of z = -3.02. At the 0.05 significance level, there is enough evidence to support the claim that the average annual income was less than $50,000.
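The Minitab figures above can be reproduced by hand from the sample mean, the assumed standard deviation and the sample size. The following is a minimal sketch using Python's standard library; `one_sample_z_left` is a hypothetical helper name, and the inputs are the values shown in the Minitab output (income in $1000s).

```python
import math
from statistics import NormalDist


def one_sample_z_left(sample_mean, mu0, sigma, n):
    """Left-tailed one-sample z-test; returns (standard error, z, p-value)."""
    se = sigma / math.sqrt(n)          # standard error of the mean
    z = (sample_mean - mu0) / se       # test statistic
    p_value = NormalDist().cdf(z)      # area to the left of z
    return se, z, p_value


# Values taken from the Minitab output above.
se, z, p_value = one_sample_z_left(sample_mean=43.74, mu0=50, sigma=14.64, n=50)

alpha = 0.05
critical = NormalDist().inv_cdf(alpha)  # left-tail critical value, about -1.645
reject_h0 = z < critical

print(f"SE = {se:.2f}, z = {z:.2f}, p = {p_value:.3f}")
print(f"Critical value = {critical:.3f}, reject H0: {reject_h0}")
```

The standard error, z statistic and p-value round to 2.07, -3.02 and 0.001, matching the output, and the z statistic falls below the critical value, so the null hypothesis is rejected.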
Confidence Interval - Minitab
One-Sample Z
The assumed standard deviation = 14.64
N Mean SE Mean 95% CI
50...

... The average (mean) annual income was less than $50,000,
One-Sample Z: Income ($1000)
Test of mu = 50 vs < 50
The assumed standard deviation = 14.64
95% Upper
Variable N Mean StDev SE Mean Bound Z P
Income ($1000) 50 43.74 14.64 2.07 47.15 -3.02 0.001
α = 0.05, critical value z = -1.645
H0: μ = 50,000
Ha: μ < 50,000
The hypothesis test examines the claim that the average annual income was less than $50,000. H0 claims the mean is equal to $50,000 and the alternative hypothesis claims it is less than $50,000. The significance level is α = 0.05, which gives a critical value of -1.645. Using the Course Project income data in Minitab, I found the standard deviation, which is 14.64. The next step is calculating the z-value in Minitab, which is -3.02.
Consequently;
z-value < α 0.05 -3.020.4
The hypothesis test examines the claim that the true population proportion of customers who live in an urban area is greater than 40%. H0 claims the proportion is equal to 40% and the alternative hypothesis claims it is greater than 40%. The significance level is α = 0.05, which gives a critical value of 1.645. According to Course Project data, when I generate the data of income in Minitab, I calculated the number of trials and events respectively. The number of trials is 66 and events is 171. The next step is calculating the proportion in Minitab, which is -0.37.
Consequently;
-0.37 significance level.
The average (mean) number of years lived in the current home is less than 13 years,...

...
MATH533: Applied Managerial Statistics
PROJECT PART C: Regression and Correlation Analysis
Using MINITAB, perform the regression and correlation analysis for the data on SALES (Y) and CALLS (X) by answering the following questions:
1. Generate a scatterplot for SALES vs. CALLS, including the graph of the "best fit" line. Interpret.
The scatterplot shows that the slope of the ‘best fit’ line is positive, which indicates that the sales amount varies directly with the number of calls. As the number of calls increases, the sales amount increases as well.
2. Determine the equation of the "best fit" line, which describes the relationship between SALES and CALLS.
The equation of the ‘best fit’ line or the regression equation is SALES(Y) = 9.638 + 0.2018 CALLS(X1)
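To illustrate how the regression equation is used, the sketch below plugs a number of calls into the fitted line. The intercept and slope are the ones reported above; the function name `predict_sales` and the input of 100 calls are hypothetical choices for illustration, and the units of SALES are whatever the data set uses.

```python
def predict_sales(calls, intercept=9.638, slope=0.2018):
    """Predicted SALES from the fitted line SALES = 9.638 + 0.2018 * CALLS."""
    return intercept + slope * calls


# Hypothetical input: 100 calls (not a value taken from the data set).
example_calls = 100
predicted = predict_sales(example_calls)

# The slope is the change in predicted SALES for one additional call.
per_call_change = predict_sales(example_calls + 1) - predicted

print(f"Predicted SALES at {example_calls} calls: {predicted:.3f}")
print(f"Change per additional call: {per_call_change:.4f}")
```

Predictions of this kind are only reasonable for call counts within the range observed in the data; the intercept of 9.638 is the fitted value at zero calls and may lie outside that range.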
3. Determine the coefficient of correlation. Interpret:
MINITAB Results:
Correlations: SALES(Y), CALLS(X1)
Pearson correlation of SALES(Y) and CALLS(X1) = 0.871
P-Value = 0.000
The coefficient of correlation is 0.871. The correlation coefficient is positive, which indicates a positive or direct relationship between the variables, and its closeness to 1 indicates that the linear relationship is strong. The P-Value of 0.000 (that is, below 0.0005 when rounded to three decimals) means that a correlation this strong would be very unlikely if SALES and CALLS were uncorrelated, so the correlation is statistically significant and we can be confident in the interpretation.
4. Determine the coefficient of determination. Interpret.
MINITAB Results:
S = 2.05708 R-Sq = 75.9% R-Sq(adj) = 75.7%
The...
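In simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson correlation coefficient. The short check below uses the two values reported in the MINITAB output; the variable names are illustrative only.

```python
# Values reported in the MINITAB output above.
r = 0.871            # Pearson correlation of SALES and CALLS
r_sq_reported = 75.9  # R-Sq in percent

# With a single predictor, R-squared is the square of r.
r_sq_from_r = 100 * r ** 2

print(f"R-Sq from r: {r_sq_from_r:.1f}%")
print(f"R-Sq reported: {r_sq_reported}%")
```

Squaring 0.871 gives about 75.9%, which agrees with the reported R-Sq: roughly three quarters of the variation in SALES is accounted for by CALLS in this model.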

...
Ottmas Richards
Project B
Professor Heard
Statistics 533
A manager at AJ Davis department store has speculated on certain customer statistics in hopes of gaining a better understanding of his target market, thus improving revenue through marketing and sales initiatives. His first assumption is that the average annual income of 50 randomly selected customers is less than $50,000. After analyzing the data, this assumption proved to be supported, as we can say with 95% confidence that the income is less than $47,510. The next assumption is that more than 40% of his customers live in urban areas. After computing the data, we cannot be certain this is accurate. The data falls too close for me to be confident in this assumption; the 95% confidence interval is too wide. The third assumption is that the average number of years the customers have lived at their current home is less than 13 years. Again, this was an accurate assumption, as we can say with a high degree of confidence that most customers have lived at their current location for less than 13 years, as the mean was found to be 12.38 years. Finally, the manager assumed the mean credit balance for suburban customers is greater than $4300. This was found to be an accurate assumption. The 95% lower confidence limit was found to be greater than $4300.
a) The average (mean) annual income was less than $50,000
Null Hypothesis: The average annual income was greater than or equal to $50,000
H₀: µ ≥ 50,000...

...Project Part B: Hypothesis Testing and Confidence Intervals
a. The average (mean) annual income was less than $50,000
Null Hypothesis: The average annual income was greater than or equal to $50,000
H₀: µ ≥ 50000
Alternate Hypothesis: The average annual income was less than $50,000.
Ha: µ < 50000
Analysis Plan: Significance Level, α = 0.05.
Since the sample size n > 30, I will use a z-test for the mean to test the given hypothesis.
As the alternative hypothesis is Ha: µ < 50000, the given test is a one-tailed z-test.
Critical Value and Decision Rule:
The critical value at significance level α = 0.05 for a lower-tailed z-test is -1.645.
Decision Rule: Reject H₀ if the z-statistic < -1.645.
Test Statistic - minitab
One-Sample Z: Income ($1000)
Test of mu = 50 vs < 50
The assumed standard deviation = 14.55
95% Upper
Variable N Mean StDev SE Mean Bound Z P
Income ($1000) 50 43.48 14.55 2.06 46.86 -3.17 0.001
Interpretation of Results and Conclusion:
Since the P-value (0.001) is smaller than the significance level (0.05), we reject the null hypothesis. The p-value is the probability, assuming the null hypothesis is true, of obtaining a test statistic at least as extreme as the one observed.
At the significance level of 0.05, there is enough evidence to support the claim that the average annual income was less than $50,000.
Confidence Interval - minitab
One-Sample Z
The assumed standard deviation = 14.55
N Mean SE Mean 95% CI
50 43.48 2.06 (39.45, 47.51)
The 95% upper confidence limit is 47.51. Since, 50.00...
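The interval in the Minitab output can be reproduced from the sample mean, the assumed standard deviation and the sample size. The sketch below computes both the two-sided 95% interval and the one-sided 95% upper bound that appears in the hypothesis-test output, since the two use different critical values. The helper names are hypothetical; the inputs are the values shown above (income in $1000s).

```python
import math
from statistics import NormalDist


def z_interval(sample_mean, sigma, n, confidence=0.95):
    """Two-sided z confidence interval for a mean with known sigma."""
    se = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
    return sample_mean - z_crit * se, sample_mean + z_crit * se


def z_upper_bound(sample_mean, sigma, n, confidence=0.95):
    """One-sided upper confidence bound for a mean with known sigma."""
    se = sigma / math.sqrt(n)
    z_crit = NormalDist().inv_cdf(confidence)  # about 1.645
    return sample_mean + z_crit * se


# Values taken from the Minitab output above.
low, high = z_interval(43.48, 14.55, 50)
upper_bound = z_upper_bound(43.48, 14.55, 50)

print(f"95% CI: ({low:.2f}, {high:.2f})")
print(f"95% upper bound: {upper_bound:.2f}")
```

The two-sided interval rounds to (39.45, 47.51) and the one-sided bound to 46.86, matching the two Minitab outputs. Both upper limits lie below 50, which is consistent with rejecting the null hypothesis.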

...authors clearly stated their purpose and research questions in the abstract. These were also reiterated in a concise and clear article summary on the first page. “The objectives of the study are to observe the overall work environment including infection prevention and control (IP&C) practices on the target surgical unit” (Backman et al., 2012, p. 1). The authors broke the study down further in recognition that several aspects influence the overall work environment. These additional questions included analyzing the hospital’s policies and procedures related to IP&C. The authors also examined the potential barriers, as well as the bridges, that practitioners experienced in the hospital units. The last part of the study focused on the collection of anonymous data related to IP&C. These questions were clearly stated in several areas of the article in a concise format that the reader could appreciate (Backman et al., 2012, p. 1).
Literature Review
This was a significant weakness of the article. The authors did not offer a separate literature review section. The authors did discuss the background of the issue in the “Introduction” of the article. While this is often done in articles, it actually creates a weakness in the article. It is preferred that a separate literature review section exists within the article. For the purpose of this review, the information discussed in the introductory section will...