
3
Where we are Variables that generate raw data occur at random. As shown in the previous lecture, raw data can be organized into frequency distributions so that useful information about these variables can be obtained. Organized data can then be presented in various graphs, which were also covered in the last lecture

5
Averages Statistical methods are used to organize and present data, and to summarize data. The most familiar of these methods is finding averages. For example, one may read that the average salary of top-managers is $5 mln and the salary of a Russian professor is less than $1,000

8
Average or Mean? Analyzing various texts we can summarize that the word average is used in general cases, in the discussion of medium size, and the word mean is used in special cases dealing with specific formulas and applications to find the average

10
Arithmetic Mean

11
Some quotes from Batueva In the book American Averages by Mike Feinsilber and William B. Mead, the authors state: "Average" when you stop to think of it is a funny concept. Although it describes all of us it describes none of us... While none of us wants to be the average American, we all want to know about him or her

12
Some quotes from Batueva The authors go on to give examples of averages: The average American man is five feet, nine inches tall; the average woman is five feet, 3.6 inches. The average American is sick in bed seven days a year missing five days of work. On the average day, 24 million people receive animal bites

13
Some quotes from Batueva By his or her 70th birthday, the average American will have eaten 14 steers, 1050 chickens, 3.5 lambs, and 25.2 hogs. In the above examples, the word average is ambiguous, since there are several different methods used to obtain an average. Loosely stated, the average means the center of the distribution or the most typical case. Measures of central tendency are also called measures of average and include the mean, mode and median

14
Some quotes from Batueva The mean is also known as the arithmetic mean (or arithmetic average) and is the sum of the values divided by the total number of values in the sample or by the size of the population. For discrete (ungrouped) data the simple mean should be used

15
Logical formula The calculation of any average should start with determining a logical formula. Before multiplying, dividing, or adding anything, it is necessary to make the initial ratio of the average, otherwise known as a logical formula

17
The initial ratio of the average (IRA) is \( \text{IRA} = A / B \), where A is the amount of the studied events in the population or sample (the summary absolute value) and B is the size of the population or sample (the number of units in it). The IRA gives the level of the studied events per unit of population or sample

18
Examples of IRA The average salary shows how much one employee earns on average. What do we take in the numerator and denominator of the IRA? A is the amount of funds paid to all employees (the wages & salaries fund); B is the number of employees

19
Average salary The salary of the individual employee is the individual value. Wages & salaries fund is the summary value, and the average salary is an average

20
Examples of IRA The average price shows how much a good costs on average. What do we take in the numerator and denominator of the IRA? A is the turnover gained from the sale of goods; B is the total quantity of goods sold

21
Examples of IRA The average cost shows how much money was spent per unit of production. What do we take in the numerator and denominator of the IRA? A is the cost of production; B is the total quantity of goods produced (the quantity of output)

22
Examples of IRA The average age shows the average number of years lived by the population under investigation. This indicator does not necessarily concern animate objects: it may be the average age of cars, students, buildings, chickens, or equipment. What do we take in the numerator and denominator of the IRA? A is the total number of years; B is the number of surveyed units

23
Examples of IRA Average lifespan: life expectancy for people, service life, or average age of used equipment. It shows the average number of years lived by the investigated units, whether they are living or inanimate objects. What do we take in the numerator and denominator of the IRA? A is the total number of life (service) years; B is the number of surveyed units

24
Logical formula For a specific economic indicator we may form only one true logical formula

25
Types of averages Mathematicians have proved that most of the averages we use can be expressed in general terms by the formula of the power mean (in Russian, средняя степенная)

26
The averages used in statistics belong to the class of power means. The general formula of the power mean is \( \bar{x}_k = \sqrt[k]{\frac{\sum x^k}{n}} \), where \( \bar{x}_k \) is the power mean of order k; k is the exponent, which determines the type (form) of the average; x are the values of the variants; n is the number of variants

31
Inequality concerning AM, GM, and HM (the majorant rule) A well-known inequality concerning the arithmetic, geometric, and harmonic means of any set of positive numbers is \( AM \ge GM \ge HM \)

32
It is easy to remember by noting that the alphabetical order of the letters A, G, and H is preserved in the inequality. The higher the exponent in the power mean formula, the greater the value of the average
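
The ordering of the means can be checked numerically. The following is a minimal sketch, not part of the original slides: the function name `power_mean` is an assumption, the case k = 0 is treated as the geometric mean (the limit of the power mean as k approaches 0), and the data are the productivity values from Example 1.

```python
import math

def power_mean(values, k):
    """Power mean of order k for positive values (k = 0 is taken as the geometric mean)."""
    n = len(values)
    if k == 0:
        # Limit of the power mean as k approaches 0
        return math.exp(sum(math.log(x) for x in values) / n)
    return (sum(x ** k for x in values) / n) ** (1.0 / k)

data = [58, 50, 46, 44, 42]   # productivity values from Example 1
hm = power_mean(data, -1)     # harmonic mean
gm = power_mean(data, 0)      # geometric mean
am = power_mean(data, 1)      # arithmetic mean
qm = power_mean(data, 2)      # quadratic mean
print(hm, gm, am, qm)         # the values do not decrease as k grows
```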

33
Simple arithmetic mean SAM The simple arithmetic mean SAM is used when we have individual variants without grouping. The numerator is the sum of the variants; the denominator is the number of variants

34
Example 1. The productivity of 5 workers was 58, 50, 46, 44, 42 products per shift. Determine the average productivity of the five workers. In this case, the solution is as follows: \( \bar{x} = \frac{58 + 50 + 46 + 44 + 42}{5} = \frac{240}{5} = 48 \) products per shift

35
Simple Arithmetic Mean SAM In this case, the simple arithmetic mean SAM was used; it can be calculated by the following equation: \( \bar{x} = \frac{\sum x_i}{n} \), where n is the number of values in the sample or the size of the population

42
Example 3 For data in an ungrouped frequency distribution, the mean can be found as shown in the next example. Example 3. The scores for 25 students on a 5-point quiz are shown below. Find the mean (the average score)

44
Example 3 To find the average score it is necessary to calculate the total score for all students. For this, each score is multiplied by the frequency of its class. Then the sum of these products must be found (see the last line in the table)

45
Example 3 Further, the total sum has to be divided by the number of students: \( \bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{67}{25} = 2.68 \). The average score for 25 students on a 5-point quiz is 2.68 points

47
Weighted mean WAM In the last two examples the weighted arithmetic mean was used; it can be calculated by the following formula: \( \bar{x} = \frac{\sum x_i f_i}{\sum f_i} \), where \( x_i \) are the variants and \( f_i \) are their frequencies. It is also used for data in a grouped frequency distribution after transforming it into an ungrouped frequency distribution by means of the class midpoints.
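
As an illustration of the weighted arithmetic mean, here is a small sketch. The function name `weighted_mean` and the data are hypothetical and are not taken from the examples in the slides.

```python
def weighted_mean(values, freqs):
    """Weighted arithmetic mean: sum of value * frequency divided by the sum of frequencies."""
    total = sum(x * f for x, f in zip(values, freqs))
    return total / sum(freqs)

values = [1, 2, 3]   # hypothetical variants
freqs = [2, 5, 3]    # hypothetical frequencies (10 units in total)
wam = weighted_mean(values, freqs)
print(wam)           # (1*2 + 2*5 + 3*3) / 10
```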

48
Example 5 The following frequency distribution shows the amount of money (USD) 100 families spend for gasoline per month. Find the mean (the average amount of money)

49
Example 5 (table columns: classes of families by the amount of money; number of families; midpoint; total)

50
The midpoint The procedure of finding the mean for grouped frequency distribution assumes that all of the raw data values in each class are equal to the midpoint of the class. In reality, this is not true, since the average of the raw data values in each class will not be exactly equal to the midpoint

51
Example 5 However, using this procedure will give an acceptable approximation of the mean, since some values fall above the midpoint and some values fall below the midpoint for each class. Average amount of money spent for gasoline by family is $42 per month
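
The midpoint procedure can be sketched as follows. The class boundaries and frequencies are hypothetical (the original table of Example 5 is not reproduced here), and `grouped_mean` is an assumed helper name.

```python
def grouped_mean(classes, freqs):
    """Approximate mean of grouped data: every value in a class is replaced by the class midpoint."""
    midpoints = [(low + high) / 2 for low, high in classes]
    return sum(m * f for m, f in zip(midpoints, freqs)) / sum(freqs)

classes = [(0, 20), (20, 40), (40, 60)]   # hypothetical class boundaries
freqs = [3, 5, 2]                         # hypothetical class frequencies
approx_mean = grouped_mean(classes, freqs)
print(approx_mean)                        # midpoints 10, 30, 50 weighted by 3, 5, 2
```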

54
Modification of the WAM formula If f is a relative frequency (SR, that is, a share in the population is given), the classic formula of the weighted arithmetic mean WAM is not applied directly; instead we can use its modification:
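
One common way to work with relative frequencies is sketched below. This is an assumption about the intended modification, not the slide's own formula: each variant is multiplied by its share, and the result is divided by the sum of the shares (1 for fractions, 100 for percentages). The name `mean_from_shares` and the data are hypothetical.

```python
def mean_from_shares(values, shares):
    """Weighted mean where the weights are shares (fractions or percentages) of the population."""
    total_share = sum(shares)   # 1 for fractions, 100 for percentages
    return sum(x * d for x, d in zip(values, shares)) / total_share

values = [1, 2, 3]                                        # hypothetical variants
mean_pct = mean_from_shares(values, [20, 50, 30])         # shares in percent
mean_frac = mean_from_shares(values, [0.2, 0.5, 0.3])     # the same shares as fractions
print(mean_pct, mean_frac)
```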

66
Properties of the arithmetic mean If the properties of the WAM have been used during its calculation, the result is not the actual mean but a transformed value. To get the actual WAM, the inverse operations must be applied in reverse order

67
Simplified calculation of the arithmetic mean for a frequency distribution

68
is based on the properties of the AM: \( \bar{x} = \frac{\sum \frac{x_i - c}{h} \cdot \frac{f_i}{A}}{\sum \frac{f_i}{A}} \cdot h + c \), where h is the interval width; c is one of the variants near the middle of the distribution; A is a number, the greatest common divisor of all frequencies

71
Example 7 The following data represent the grouping of workers by size of payment (table columns: classes; number of workers; total: 100). Find the average size of payment, using the mathematical properties of the mean

72
Example 7 Transform the grouped frequency distribution into an ungrouped frequency distribution by calculating the midpoints (column 3). After that, using property #4, each item can be reduced by the same constant A. The constant A can be any number; however, it is recommended to take it equal to the variant with the maximal frequency (column 4)

73
Example 7 On the basis of property #3, the reduced variants should be divided by the constant B, which can also be any number. It is recommended, however, to take B equal to the width of the classes (column 5). The next table shows the procedure of solving the problem

75
Example 7 We now have new variants. Since changing the variants changes the mean, it is necessary to get back the actual magnitude of the mean afterwards. Using property #5, all frequencies were divided by k=5, because 5 is their greatest common divisor (column 6)

76
Example 7 We now also have new frequencies. According to property #5, this change of frequencies does not change the mean. Thus, using the mathematical properties, the formula of the mean will look as follows: The average wage is $820 per worker.
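
The simplified calculation can be sketched in code. The midpoints and frequencies below are hypothetical (the original table of Example 7 is not available here), and `simplified_mean` is an assumed name. The sketch subtracts a constant c, divides by the class width h, reduces the frequencies by their greatest common divisor, and then applies the inverse operations in reverse order.

```python
from functools import reduce
from math import gcd

def simplified_mean(midpoints, freqs, c, h):
    """Mean via transformed variants (x - c) / h and frequencies reduced by their GCD."""
    divisor = reduce(gcd, freqs)                    # greatest common divisor of the frequencies
    new_x = [(x - c) / h for x in midpoints]        # shift by c, then scale by h
    new_f = [f // divisor for f in freqs]           # reduced frequencies; the mean is unchanged
    transformed = sum(x * f for x, f in zip(new_x, new_f)) / sum(new_f)
    return transformed * h + c                      # inverse operations in reverse order

midpoints = [650, 750, 850, 950]   # hypothetical class midpoints
freqs = [10, 25, 45, 20]           # hypothetical frequencies
simplified = simplified_mean(midpoints, freqs, c=850, h=100)
direct = sum(x * f for x, f in zip(midpoints, freqs)) / sum(freqs)
print(simplified, direct)          # both approaches give the same mean
```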

78
Arithmetic Mean The arithmetic mean, as a single number representing a whole data set, has important advantages. First, its concept is familiar to most people and intuitively clear. Second, every data set has a mean. It is a measure that can be calculated, and it is unique because every data set has one and only one mean. Finally, the mean is useful for performing statistical procedure such as comparing the means from several data sets

79
Harmonic Mean HM The harmonic mean HM is defined as the number of values divided by the sum of the reciprocals of the values. The equations of the HM are shown below. Simple harmonic mean: \( \bar{x}_{HM} = \frac{n}{\sum \frac{1}{x_i}} \)

81
Harmonic Mean HM The HM is the reciprocal of the arithmetic mean of the reciprocals of the values: take the reciprocal of each value, find their arithmetic mean, and divide 1 by the result. There are simple and weighted HMs. The weighted formula of the HM is used more often
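
A short sketch of the simple harmonic mean follows. The function name and the data are hypothetical; the second computation shows the relation between the HM and the arithmetic mean of the reciprocals.

```python
def harmonic_mean(values):
    """Simple harmonic mean: number of values divided by the sum of their reciprocals."""
    return len(values) / sum(1 / x for x in values)

values = [2, 4, 4]                                           # hypothetical positive values
hm_value = harmonic_mean(values)
mean_of_reciprocals = sum(1 / x for x in values) / len(values)
print(hm_value, 1 / mean_of_reciprocals)                     # the two results coincide
```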

82
Harmonic Mean HM In the case of the HM, the frequencies are not known, but the total sum of the values is known. In fact, the arithmetic mean and the harmonic mean are applied in the same situations, but to differently structured data sets. Therefore, before choosing the mean equation it is necessary to construct the logical (economic) formula

83
The HM is applied when the volumes of the investigated variants are used as weights. Sometimes the question arises: which formula should be used, the harmonic mean HM or the arithmetic mean AM? The answer is as follows: the suitable formula is the one in which both the numerator and the denominator have an economic meaning

84
AM or HM ? The tip: If the initial information gives an averaged value (variant) and the denominator of the logical formula, the AM is used. If variant and the numerator of the logical formula are given, the HM is implemented

85
AM or HM? In other words: if the numerator of the IRA is unknown, the AM should be used. If the denominator of the IRA is unknown, the HM should be used

90
Example 10 A carpenter buys $500 worth of nails at $50 per pound and $500 worth of nails at $10 per pound. Find the mean (the average price of a pound of nails). To solve the task we have to define how the average price can be found. Construct the logical formula expressing the relation between price and worth: average price = total worth / total number of pounds

91
Example 10 In this case, the total worth is known, but the number of pounds is not. Therefore, the number of pounds can be found as the ratio of worth to price: \( \bar{x} = \frac{500 + 500}{\frac{500}{50} + \frac{500}{10}} = \frac{1000}{60} \approx 16.67 \). The average price per pound of nails is $16.67. The logical equation allows us to choose the mean equation correctly without breaking the relation between the economic processes
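
The calculation in Example 10 can be reproduced with a weighted harmonic mean, where the worth of each purchase serves as the weight. This is a sketch; `weighted_harmonic_mean` is an assumed helper name.

```python
def weighted_harmonic_mean(values, weights):
    """Weighted harmonic mean: total weight divided by the sum of weight / value."""
    return sum(weights) / sum(w / x for x, w in zip(values, weights))

prices = [50, 10]     # dollars per pound, from Example 10
worth = [500, 500]    # dollars spent at each price, from Example 10
avg_price = weighted_harmonic_mean(prices, worth)
naive_avg = sum(prices) / len(prices)   # simple AM of the prices, which ignores the quantities bought
print(round(avg_price, 2), naive_avg)
```

The simple arithmetic mean of the two prices overstates the result because far more pounds were bought at the lower price.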

93
Example 11 First, construct the logical formula describing the relation between salary, wages fund, and the number of personnel: average salary = wages fund / number of personnel

94
Example 11 For the base month the wages fund is not known, but the number of personnel is known. The wages fund is the total sum of the values Wi and can be found by multiplying the salary per person (variants xi) by the number of personnel (frequencies fi). Thus, the logical formula will be as follows: \( \bar{x} = \frac{\sum x_i f_i}{\sum f_i} \)

95
Example 11 Using this logical formula, the average salary for the base month could be calculated: To calculate the average salary the weighted arithmetic mean was used:

96
Example 11 For the reporting month, the wages fund is known, but the number of personnel is not. The number of personnel can be expressed by dividing the wages fund by the salary per person. Thus, the logical formula is transformed in the following way: \( \bar{x} = \frac{\sum W_i}{\sum \frac{W_i}{x_i}} \)

97
Example 11 Using this transformation of the initial logical formula, the average salary could be calculated on the basis of the weighted harmonic mean:

98
Once more AM or HM ? The accuracy of the choice of the mean equations is rule-based: If the numerator of the logical formula is unknown, the arithmetic mean should be used. If the denominator of the logical formula is unknown the harmonic mean should be used
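
The rule can be illustrated with a sketch in which the same hypothetical firm is described in two ways: once by salaries and headcounts (the numerator is unknown, so the weighted AM applies) and once by salaries and wages funds (the denominator is unknown, so the weighted HM applies). All numbers are invented for illustration.

```python
salaries = [1000, 2000]   # hypothetical salary per person in two departments

# Case 1: headcounts are known, the wages fund (numerator) is not -> weighted AM
headcount = [3, 1]
avg_am = sum(x * f for x, f in zip(salaries, headcount)) / sum(headcount)

# Case 2: wages funds are known, the headcount (denominator) is not -> weighted HM
funds = [3000, 2000]      # the same departments expressed as wages funds
avg_hm = sum(funds) / sum(w / x for x, w in zip(salaries, funds))

print(avg_am, avg_hm)     # both describe the same average salary
```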

99
The geometric mean GM Sometimes when we are dealing with quantities that change over a period of time, we need to know an average rate of change, such as an average growth rate over a period of several years. In such cases, the arithmetic mean AM and the harmonic mean HM are inappropriate, because they give wrong answers. What we need to find is the geometric mean GM. The geometric mean GM is defined as the n th root of the product of n values

100
The geometric mean GM The formula of the simple geometric mean GM is: \( \bar{x}_{GM} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n} \). The geometric mean is useful for finding the average of percentages, ratios, indexes or growth rates

101
Example 12 The growth rate of the Living Life Insurance Corporation for the past three years was 35%, 24%, and 18%. Find the average growth rate. First, it is necessary to transform each growth rate into a growth factor, using the following equation: growth factor = 1 + growth rate / 100%, which gives 1.35, 1.24, and 1.18

102
Example 12 Further, the geometric mean can be applied: \( \sqrt[3]{1.35 \cdot 1.24 \cdot 1.18} = \sqrt[3]{1.97532} \approx 1.2547 \). The average growth rate is 25.47% per year
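
Example 12 can be checked with a few lines of code. This is a sketch; the variable names are assumptions, and `math.prod` requires Python 3.8 or later.

```python
import math

rates = [35, 24, 18]                                    # growth rates in percent, from Example 12
factors = [1 + r / 100 for r in rates]                  # growth factors 1.35, 1.24, 1.18
gm_factor = math.prod(factors) ** (1 / len(factors))    # geometric mean of the factors
avg_rate = (gm_factor - 1) * 100                        # back to a percentage rate
print(round(avg_rate, 2))
```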

103
The geometric mean GM The geometric mean can also be applied when the data have a large spread. Suppose you want to calculate the average winning amount between the maximal and minimal winning amounts. In this case the application of the geometric mean is more justified: \( \bar{x}_{GM} = \sqrt{x_{\min} \cdot x_{\max}} \)

104
Example 13 The minimal winning amount is $1,000 and the maximal winning amount is $100,000. Find the average winning amount

105
Example 13 The raw data differ greatly, so the application of the arithmetic mean would be incorrect. In this case, the application of the geometric mean is more justified: \( \sqrt{1000 \cdot 100000} = 10000 \). The average winning amount is $10,000

106
Weighted geometric mean WGM For the analysis of time series the weighted geometric mean can be used. The formula is shown below: \( \bar{x}_{WGM} = \sqrt[\sum f_i]{\prod x_i^{f_i}} \). The weighted geometric mean will be needed when solving problems on time series

107
The quadratic (square) mean QM A useful mean for the physical sciences is the quadratic mean, which is found by taking the square root of the average of the squares of the values. There are two kinds of quadratic mean, the simple and the weighted. The simple quadratic mean: \( \bar{x}_{QM} = \sqrt{\frac{\sum x_i^2}{n}} \)

108
The quadratic (square) mean QM The weighted quadratic mean: \( \bar{x}_{QM} = \sqrt{\frac{\sum x_i^2 f_i}{\sum f_i}} \). The quadratic mean is applied when the measures of dispersion are calculated
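
Both forms of the quadratic mean can be sketched as follows. The function names and the data are hypothetical.

```python
import math

def quadratic_mean(values):
    """Simple quadratic mean: square root of the average of the squared values."""
    return math.sqrt(sum(x ** 2 for x in values) / len(values))

def weighted_quadratic_mean(values, freqs):
    """Weighted quadratic mean: squared values are weighted by their frequencies."""
    return math.sqrt(sum(x ** 2 * f for x, f in zip(values, freqs)) / sum(freqs))

qm_simple = quadratic_mean([1, 7])                      # sqrt((1 + 49) / 2)
qm_weighted = weighted_quadratic_mean([1, 7], [2, 2])   # equal weights give the same result
print(qm_simple, qm_weighted)
```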

109
Chronological mean TM This mean formula is applied to a series of instant (moment) indicators, especially in time series: \( \bar{x}_{TM} = \frac{\frac{x_1}{2} + x_2 + \dots + x_{n-1} + \frac{x_n}{2}}{n - 1} \)

110
Chronological mean TM Take half of the first value and half of the last value, add all the values in the middle of the series, and divide the resulting sum by the number of moment indicators minus 1

111
Chronological mean TM TM is widely used in time series analysis, in socio-economic statistics to determine the average population and average size of the fund, as well as other indicators, calculated at certain points in time

112
Chronological mean TM If the average is calculated for two moment indicators, the formula of the chronological mean TM turns into the formula of the simple arithmetic mean AM: \( \bar{x} = \frac{x_1 + x_2}{2} \)
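
The chronological mean for equally spaced moment indicators can be sketched as below. The function name and the levels are hypothetical; the second call shows the two-value case reducing to the simple arithmetic mean.

```python
def chronological_mean(levels):
    """Simple chronological mean for moment indicators at equal time intervals."""
    if len(levels) < 2:
        raise ValueError("at least two moment indicators are required")
    inner = sum(levels[1:-1])   # all values between the first and the last
    return (levels[0] / 2 + inner + levels[-1] / 2) / (len(levels) - 1)

tm_four = chronological_mean([100, 110, 120, 130])   # hypothetical levels at four dates
tm_two = chronological_mean([100, 120])              # two levels: same as (100 + 120) / 2
print(tm_four, tm_two)
```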

113
The universal set of rules for calculating the average I offer a universal set of rules for calculating the average, which disciplines students and allows them to choose the necessary mean correctly. I. Write down the logical formula (IRA) of the average to be calculated. Remember that the logical formula does not depend on the initial data, so it is unique and is appropriate only for the calculation of the required average

114
The universal set of rules for calculating the average II. Compare the logical formula with the initial data. There may be eight cases. 1. The numerator A is unknown, the data are not grouped – the formula of the simple arithmetic mean AM is used. 2. The numerator A is unknown, the data are grouped – use the weighted arithmetic mean WAM formula. If you know the relative frequency in the form of shares, the modified formula MAM of the weighted arithmetic mean is the best way for calculations

115
The universal set of rules for calculating the average 3. The denominator B is unknown, the data are not grouped – the simple harmonic mean HM is most suitable formula to calculate the average. 4. The denominator B is unknown, the data are grouped and weights W i are different – the weighted harmonic mean WHM should be chosen; if the weights W i coincide we recommend to apply Case #3

116
The universal set of rules for calculating the average 5. The data are incomplete, insufficient and not grouped – the simple geometric mean GM formula is necessary and sufficient. 6. The data are incomplete, insufficient, and grouped – use the formula of the weighted geometric mean WGM. 7. Time series with instant levels and equal time intervals – the simple chronological mean TM is recommended

117
The universal set of rules for calculating the average 8. Time series with instant levels and unequal time intervals - the weighted chronological mean WTM is suitable. III. Write the appropriate formula. IV. Plug the initial data into the formula and perform the necessary calculations. V. Specify the unit of measure in the result received and formulate the economic meaning of this numerical answer

118
Structural averages Using the average power for the analysis of the distribution is not enough. Structural averages are used for initial analysis of the distribution of units in the population

120
Mode Mo The mode is the value of the variant occurring in the population the largest number of times. In everyday life the word "mode" is used in the sense of fashion

121
Mode Mo The mode is the most common variant of a frequency distribution. For a discrete series it is the value that corresponds to the highest frequency

122
The mode is the value that is repeated most often in the data set. A data set can have more than one mode or no mode at all. If we analyze a discrete series and there are several variants with the highest frequency (which is quite rare), then the mode is defined as the arithmetic average of all the modal variants

123
Mode Mo The mode is the value that is repeated most often in the data set. A data set can have more than one mode or no mode at all. The mode can be defined both for an ungrouped frequency distribution and for a grouped frequency distribution. For an ungrouped frequency distribution the mode can be found by inspection, by definition, using the highest frequency

125
Example 14 The number of books read by each of the 28 students in a literature class is given below:

126
Example 14 The number of books read by each of 28 students in a literature class is given below (table columns: number of books; number of students, frequency; total: 28). The highest frequency is 12, which means the mode is 2 books. Thus, most students have read only two books
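
Finding the mode of a discrete frequency distribution can be sketched as follows. The frequency table is illustrative only: the counts were chosen to match the stated total of 28 students and the highest frequency of 12, and they are not the original table from the slide.

```python
# Hypothetical frequency table: number of books -> number of students
book_freqs = {0: 2, 1: 6, 2: 12, 3: 5, 4: 3}

total_students = sum(book_freqs.values())
mode_books = max(book_freqs, key=book_freqs.get)   # variant with the highest frequency
print(total_students, mode_books)
```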

127
The mode for a grouped frequency distribution can be found using the following equation, which is used for an interval frequency distribution with equal interval widths: \( Mo = x_{Mo} + h_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{(f_{Mo} - f_{Mo-1}) + (f_{Mo} - f_{Mo+1})} \), where \( x_{Mo} \) is the lower boundary of the modal class; \( h_{Mo} \) is the width of the modal class; \( f_{Mo} \) is the frequency of the modal class; \( f_{Mo-1} \) is the frequency of the pre-modal class; \( f_{Mo+1} \) is the frequency of the after-modal class

130
Finding the modal interval First, it is necessary to find the modal class by definition, using the highest frequency. Here the highest frequency is 25, and the class it corresponds to is the modal interval

131
Example 7 The following data represent the grouping of workers by size of payment (table columns: the size of payment, USD; number of workers, %; total: 100). Find the mode

132
Thus, the most common size of payment among the workers is about $833.3

133
Mode Mo If the modal interval is the first or the last one, the missing frequency (pre-modal or after-modal, respectively) is taken to be zero
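
The interval formula for the mode, including the rule for a modal class at either end, can be sketched as follows. The function name and the data are hypothetical, and the sketch assumes equal class widths and a single modal class.

```python
def grouped_mode(lower_bounds, width, freqs):
    """Mode of an interval distribution with equal class widths."""
    i = freqs.index(max(freqs))                           # index of the modal class
    f_mo = freqs[i]
    f_prev = freqs[i - 1] if i > 0 else 0                 # missing pre-modal frequency is zero
    f_next = freqs[i + 1] if i < len(freqs) - 1 else 0    # missing after-modal frequency is zero
    return lower_bounds[i] + width * (f_mo - f_prev) / ((f_mo - f_prev) + (f_mo - f_next))

mode_inner = grouped_mode([0, 10, 20, 30], 10, [4, 10, 6, 2])   # modal class is 10-20
mode_first = grouped_mode([0, 10], 10, [10, 6])                 # modal class is the first one
print(mode_inner, mode_first)
```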

134
The mode on a graph The mode can be found using the histogram. For this, select the highest bar, then connect its upper right corner with the upper right corner of the previous bar. Next, connect the upper left corner of the highest bar with the upper left corner of the next bar. From the point of intersection of the two segments, drop a perpendicular onto the X-axis. The point where the perpendicular meets the abscissa axis is the mode

138
Median The median is a measure of central tendency different from any of the means. The median is a single value from the data set that measures the central item in the data. This single item is the middlemost or most central item in the set of numbers. Half of the items lie above this point, and the other half lie below it. The median is the midpoint of the data array. To calculate the median, the data must be arranged in ascending or descending order

139
Median Me Me is the central, middle value of a population. Me – the value of a variant located in the middle of the ordered list. Me is the variant, which lies in the middle of the frequency distribution and divides it into two equal parts. In the discrete list Me is determined by definition, in the interval frequency distribution – by the formula

141
Finding the Median If a discrete list contains an odd number of values, Me is the only value that has the same number of values to its right and to its left, that is, the value in position (n+1)/2 of the ordered list: \( Me = x_{(n+1)/2} \)

142
Example 15 Find the median for the ages of seven preschool children. The ages are 2, 3, 4, 2, 3, 5 and 5. First it is necessary to arrange the data in ascending order: 2, 2, 3, 3, 4, 5, 5. According to the equation, the median is the (7+1)/2 = 4th item in the array and equals 3 years. Thus, you can say that half of the children are under 3 and the other half are over 3

143
Median For ungrouped data the median can be defined in the following way. If the data set contains an odd number of items, the middle item of the array is the median. If there is an even number of items, the median is the average of the two middle items and can be calculated using the equation from the next slide

144
Finding the Median If a discrete list contains an even number of values, there are two middle values, each with the same number of values on its outer side. Me is the arithmetic mean of these two values: \( Me = \frac{x_{n/2} + x_{n/2+1}}{2} \)

145
Example 16 The ages of ten college students are given below. Find the median: 18, 24, 20, 35, 19, 23, 26, 23, 19, 20. The data set contains an even number of items. Arrange the data in ascending order: 18, 19, 19, 20, 20, 23, 23, 24, 26, 35. Using the equation, the median is the (10+1)/2 = 5.5th item in the data set. In other words, the median lies between the 5th and the 6th items. Thus, the median is: Me = (20 + 23)/2 = 21.5 years. Therefore, half of the students are under 21.5 and the other half of the students are over 21.5 years old
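
Both cases (odd and even number of items) can be covered by one small function, sketched here with the data from Examples 15 and 16. The function name `median` is an assumption.

```python
def median(values):
    """Median of ungrouped data: middle item, or the mean of the two middle items."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                        # odd count: single middle item
    return (ordered[mid - 1] + ordered[mid]) / 2   # even count: average of the two middle items

me_children = median([2, 3, 4, 2, 3, 5, 5])                      # Example 15
me_students = median([18, 24, 20, 35, 19, 23, 26, 23, 19, 20])   # Example 16
print(me_children, me_students)
```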

146
Finding the Median For ungrouped frequency distribution the median Me can be defined using the cumulative frequencies

147
The golden rule For a discrete frequency distribution the median is the variant for which the cumulative frequency for the first time exceeds half of the sum of the frequencies
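
The golden rule translates directly into a loop over cumulative frequencies. This is a sketch with hypothetical data and an assumed function name.

```python
def median_from_frequencies(values, freqs):
    """Median of a discrete frequency distribution via cumulative frequencies."""
    half = sum(freqs) / 2
    cumulative = 0
    for x, f in sorted(zip(values, freqs)):   # go through the variants in ascending order
        cumulative += f
        if cumulative > half:                 # first cumulative frequency above half the total
            return x

values = [0, 1, 2, 3, 4]    # hypothetical variants
freqs = [2, 6, 12, 5, 3]    # hypothetical frequencies, total 28
me_discrete = median_from_frequencies(values, freqs)
print(me_discrete)          # cumulative frequencies 2, 8, 20: the third exceeds 14
```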

151
For a grouped frequency distribution the median can be found using the following equation: \( Me = x_{Me} + h_{Me} \cdot \frac{\frac{\sum f}{2} - S_{Me-1}}{f_{Me}} \), where \( x_{Me} \) is the lower boundary of the median class; \( h_{Me} \) is the width of the median class; \( f_{Me} \) is the frequency of the median class; \( S_{Me-1} \) is the cumulative frequency of the class immediately preceding the median class; \( \sum f \) is the sum of all frequencies (the size of the population or sample)

152
Finding the Median The median class can be defined following the definition and the Golden Rule: we use the cumulative frequencies, in the median interval they are for the first time bigger than the ½ of the sum of all frequencies
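
The interval formula for the median can be sketched in the same way: the median class is located with the golden rule, and the median is then interpolated inside it. The function name and the data are hypothetical, and equal class widths are assumed.

```python
def grouped_median(lower_bounds, width, freqs):
    """Median of an interval distribution with equal class widths."""
    half = sum(freqs) / 2
    cumulative = 0                              # cumulative frequency before the current class
    for lower, f in zip(lower_bounds, freqs):
        if cumulative + f > half:               # median class: cumulative frequency first exceeds half
            return lower + width * (half - cumulative) / f
        cumulative += f

me_grouped = grouped_median([0, 10, 20, 30], 10, [4, 10, 6, 2])
print(me_grouped)   # half = 11, median class is 10-20, preceding cumulative frequency is 4
```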

154
This means that half of workers have labor productivity which is less than m, while the other half has productivity more than m

155
Example 7 The following data represent the grouping of workers by size of payment (table columns: the size of payment, USD; number of workers, %; cumulative frequencies, Si; total: 100). Find the median

156
Example 7 First, it is necessary to divide the sum of all frequencies by 2 to find the halfway point: 100/2=50. Further, let us find the class that contains the 50 th value. This class is called the median class and it contains the median. The median class is $

157
Thus, half of workers have the size of payment less than $820 and the other half of workers have the size of payment more than $820

158
The median can be defined using the ogive. For this, it is recommended to select the point on the Y-line conforming to ½ of all frequencies. From this point the parallel to X-line should be drawn. From the point of intersection of parallel and ogive it is necessary to drop a perpendicular on the abscissa axis. The point of intersection of perpendicular and X-line is named the median. The next slide shows this procedure with the help of cumulative frequency graph